anvaya prep

Explaining statistics

A complete LSAT guide to Explaining statistics — covering key concepts, exam-focused explanations, and high-yield FAQs.

Overview

Explaining statistics is a critical reasoning pattern that appears frequently on the LSAT Logical Reasoning section. This question type presents statistical data—such as survey results, demographic trends, or comparative percentages—and asks test-takers to identify which answer choice best accounts for an apparent discrepancy, surprising result, or counterintuitive finding within that data. Unlike pure causation and explanation questions that deal with singular events or phenomena, explaining statistics questions require understanding how aggregate data can mask underlying patterns, how sampling methods affect conclusions, and how multiple factors can produce seemingly paradoxical numerical outcomes.

The LSAT tests this skill because legal reasoning constantly involves interpreting statistical evidence, understanding survey methodology, and recognizing when numerical data supports or undermines an argument. Attorneys must evaluate statistical claims in cases involving discrimination, product liability, environmental harm, and countless other areas where quantitative evidence plays a decisive role. The ability to explain why statistics appear the way they do—rather than simply accepting them at face value—represents sophisticated logical reasoning that distinguishes strong analytical thinkers from those who merely process surface-level information.

Within the broader framework of Logical Reasoning, explaining statistics questions bridge multiple conceptual areas. They connect to causation by requiring analysis of what factors produce observed outcomes, to assumption identification by revealing unstated premises about data collection and interpretation, and to strengthen/weaken questions by testing understanding of what additional information would make statistical patterns more or less surprising. Mastering this topic enhances performance across numerous question types while building the analytical foundation necessary for success on the LSAT and in legal practice.

Learning Objectives

  • Identify how explaining statistics appears in LSAT questions
  • Explain the reasoning pattern behind explaining statistics
  • Apply explaining statistics to solve LSAT-style problems accurately
  • Distinguish between explanations that resolve statistical discrepancies and those that merely restate the data
  • Recognize common statistical patterns including base rate effects, sampling bias, and compositional changes
  • Evaluate multiple potential explanations and select the one that most directly addresses the statistical phenomenon
  • Identify when additional unstated factors account for counterintuitive statistical results

Prerequisites

  • Basic statistical literacy: Understanding percentages, rates, proportions, and comparative data is essential for interpreting the numerical information presented in stimulus passages.
  • Causal reasoning fundamentals: Recognizing the difference between correlation and causation helps distinguish between mere associations in data and genuine explanatory relationships.
  • Argument structure analysis: The ability to identify conclusions, premises, and gaps in reasoning enables recognition of what aspect of the statistics requires explanation.
  • Comparative reasoning: Understanding how to compare groups, time periods, or categories is necessary for identifying what makes statistical findings surprising or counterintuitive.

Why This Topic Matters

Explaining statistics questions appear with remarkable consistency on the LSAT, typically comprising 2-4 questions per exam across the Logical Reasoning sections. This represents approximately 8-15% of all Logical Reasoning questions, making it one of the highest-yield question types for focused preparation. These questions most commonly appear as "resolve the paradox" or "explain the discrepancy" question stems, though they can also surface in strengthen/weaken formats where statistical explanations serve as the mechanism for affecting argument strength.

In legal practice, attorneys regularly encounter statistical evidence in depositions, expert testimony, and documentary evidence. Understanding how to explain apparent anomalies in data prevents misinterpretation of evidence and enables effective cross-examination of opposing experts. For instance, employment discrimination cases often hinge on whether statistical disparities in hiring or promotion reflect discriminatory practices or can be explained by other factors such as applicant pool composition, qualification differences, or departmental variations.

On the LSAT specifically, explaining statistics questions appear in several characteristic formats: comparative statistics showing unexpected differences between groups, temporal trends that seem counterintuitive, survey results that contradict expectations, and aggregate data that masks important subgroup patterns. The test-makers deliberately construct these questions to reward careful analytical thinking while punishing hasty assumptions about what statistical patterns mean.

Core Concepts

The Nature of Statistical Explanations

Explaining statistics involves identifying factors that account for why numerical data appears the way it does, particularly when that appearance seems surprising, contradictory, or counterintuitive. Unlike simple description, which merely restates what the numbers show, genuine explanation reveals underlying mechanisms, hidden variables, or methodological factors that make the statistical pattern comprehensible and expected rather than puzzling.

The key to understanding this concept lies in recognizing that statistics represent aggregated information, and aggregation can obscure important details. When raw numbers are combined into percentages, averages, or totals, information about subgroups, distributions, and compositional changes often disappears. A successful explanation typically reveals one of these hidden factors, showing that what appears paradoxical at the aggregate level makes perfect sense when the underlying structure is understood.

Base Rate and Denominator Effects

One of the most powerful explanatory patterns involves base rate effects—situations where the size of the underlying population dramatically affects the interpretation of percentages or rates. A small percentage of a large group can exceed a large percentage of a small group in absolute terms. Similarly, the same absolute change can represent vastly different percentage changes depending on the starting denominator.

Consider this pattern: City A shows a 50% increase in a particular crime while City B shows only a 10% increase, yet City B's crime problem worsened more severely. The explanation might be that City A started with only 10 incidents (increasing to 15), while City B started with 1,000 incidents (increasing to 1,100). City B therefore added 100 incidents to City A's 5, and the percentage change misleads because the starting counts differ dramatically.

LSAT questions frequently exploit this pattern by presenting percentage changes without absolute numbers, or by comparing rates across populations of different sizes. The correct explanation often reveals that the denominator—the base population from which percentages are calculated—differs in a way that makes the statistical pattern unsurprising.
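The City A and City B comparison above can be reproduced in a few lines. This is a minimal Python sketch rather than anything the LSAT requires; the function name `change_summary` is an illustrative assumption, and the incident counts are the ones used in the example.

```python
def change_summary(before: int, after: int) -> tuple[float, int]:
    """Return (percent change, absolute change) between two counts."""
    absolute_change = after - before
    percent_change = absolute_change * 100 / before
    return percent_change, absolute_change


# Incident counts taken from the City A / City B example.
cities = {"City A": (10, 15), "City B": (1_000, 1_100)}

for name, (before, after) in cities.items():
    percent, absolute = change_summary(before, after)
    print(f"{name}: {percent:.0f}% increase, {absolute} more incidents")
# City A: 50% increase, 5 more incidents
# City B: 10% increase, 100 more incidents
```

Reporting both numbers side by side is the habit to build: the percentage and the absolute change answer different questions, and a stimulus that gives only one of them leaves room for the other to explain the surprise.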

Sampling Bias and Selection Effects

Sampling bias occurs when the group from which data is collected differs systematically from the population about which conclusions are drawn. This creates statistical patterns that reflect the peculiarities of the sample rather than genuine features of the broader population. Selection effects represent a related phenomenon where the process of choosing who or what to measure introduces systematic distortions.

For example, a survey might show that restaurant customers report high satisfaction levels, yet the restaurant struggles financially. The explanation could be that dissatisfied customers simply stop coming rather than complaining, so the survey only captures the self-selected group of people satisfied enough to return. The statistical pattern (high satisfaction) is genuine but misleading because the sample (current customers) excludes the relevant comparison group (former customers who left).

LSAT questions test this concept by presenting data from sources that might not represent the relevant population, such as voluntary surveys, convenience samples, or groups defined by characteristics that correlate with the measured outcome. The correct explanation often identifies how the sampling method produces the observed statistical pattern.
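The restaurant survey example can be made concrete with a small sketch. All ratings below are hypothetical, as is the assumption that diners who rate the restaurant below 3 never return; the point is only to show how surveying current customers filters the sample.

```python
from statistics import mean

# Hypothetical 1-5 satisfaction ratings for everyone who has ever eaten here.
all_diners = [5, 5, 4, 4, 4, 2, 2, 1, 1, 1]

# Assumption: diners who rate the restaurant below 3 stop coming,
# so an in-restaurant survey can only reach the remaining group.
surveyed = [rating for rating in all_diners if rating >= 3]

surveyed_mean = mean(surveyed)
overall_mean = mean(all_diners)

print(f"Surveyed (current customers): {surveyed_mean:.1f}")  # 4.4
print(f"Everyone who ever visited:    {overall_mean:.1f}")   # 2.9
```

The survey result is accurate for the people it reached, which is why the correct answer in such questions usually points at who was left out rather than claiming the numbers are false.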

Compositional Changes and Simpson's Paradox

Compositional changes occur when the makeup of a group shifts over time or differs between comparison groups in ways that affect aggregate statistics. This can produce Simpson's Paradox, where a trend appears in aggregate data but reverses within every subgroup, or where subgroup patterns disappear when data is combined.

A classic example: A company's average salary increases, yet the salary of every employee still on the payroll decreases. The explanation might be that the company laid off low-wage workers, so the remaining workforce has a higher average salary despite individual pay cuts. The composition of "company employees" changed in a way that affected the aggregate statistic.

On the LSAT, compositional explanations often resolve apparent paradoxes by revealing that:

  • The mix of subgroups changed between time periods
  • Comparison groups differ in their internal composition
  • New members entering a population differ from existing members
  • Departures from a population are non-random
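The salary example above can be checked with made-up numbers. In this sketch the headcounts, salaries, and the 5% pay cut are all hypothetical assumptions chosen for illustration; only the mechanism (non-random departures plus across-the-board cuts) comes from the text.

```python
from statistics import mean

# Hypothetical workforce: three high-wage and seven low-wage employees.
before = [100_000] * 3 + [40_000] * 7

# Assumption: five low-wage employees are laid off...
remaining = [100_000] * 3 + [40_000] * 2
# ...and everyone who stays takes a 5% pay cut.
after = [salary * 95 // 100 for salary in remaining]

every_salary_fell = all(new < old for new, old in zip(after, remaining))

print(f"Average before: {mean(before):,}")   # 58,000
print(f"Average after:  {mean(after):,}")    # 72,200
print(f"Every remaining salary fell: {every_salary_fell}")  # True
```

The average rises only because the low-wage salaries that used to pull it down are no longer in the list, which is the "departures are non-random" pattern from the bullets above.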

Multiple Factor Interactions

Many statistical patterns result from the interaction of multiple factors, where the combined effect differs from what either factor would produce alone. Understanding these interactions requires recognizing that statistical outcomes often have multiple contributing causes, and the presence of one factor can amplify, diminish, or reverse the effect of another.

For instance, a medication might show better outcomes in clinical trials than in real-world use. The explanation could involve multiple factors: trial participants were healthier than typical patients, received more careful monitoring, and had better medication adherence due to the study protocol. No single factor fully explains the discrepancy; rather, their combination accounts for the statistical difference.

LSAT questions testing this concept present situations where a single-factor explanation seems insufficient, and the correct answer identifies how multiple elements work together to produce the observed pattern. These questions reward test-takers who avoid oversimplified explanations and recognize when complex phenomena require multifaceted accounts.

Temporal Lag and Delayed Effects

Statistical patterns often reflect temporal lag—the delay between when a cause operates and when its effects appear in measured data. This can make current statistics seem disconnected from current conditions because the data actually reflects past circumstances.

For example, a city implements new educational policies, yet test scores decline the following year. The explanation might be that test scores reflect the education students received in previous years before the new policies took effect. The statistical pattern (declining scores) doesn't contradict the policy's effectiveness because the temporal relationship is misunderstood.

On the LSAT, temporal lag explanations resolve apparent contradictions by showing that:

  • Current statistics reflect past conditions
  • Recent changes haven't yet affected measured outcomes
  • Long-term trends differ from short-term fluctuations
  • The timing of measurement affects what patterns appear

Concept Relationships

The core concepts within explaining statistics form an interconnected web of analytical tools. Base rate effects and sampling bias both involve understanding how the denominator—the population from which statistics are drawn—affects interpretation, but base rate effects focus on population size while sampling bias focuses on population characteristics. Compositional changes can be understood as a temporal form of sampling bias, where the "sample" (the group at one time point) differs from the "sample" (the group at another time point) in systematic ways.

Multiple factor interactions serve as an umbrella concept that can incorporate any of the other patterns. A statistical discrepancy might result from base rate effects combined with compositional changes, or from sampling bias interacting with temporal lag. The most complex LSAT questions require recognizing that several explanatory mechanisms operate simultaneously.

These concepts connect to prerequisite knowledge of causation and explanation by applying causal reasoning to aggregate data rather than individual events. They extend basic causal analysis by adding the complexity of statistical aggregation, where the relationship between cause and effect becomes obscured by how data is collected, combined, and presented. The relationship map flows as follows:

Basic causal reasoning → Applied to aggregate data → Requires understanding of statistical aggregation → Reveals hidden factors through base rates, sampling, composition, interactions, and temporal patterns → Produces complete explanation of statistical phenomena

High-Yield Facts

  • Most explaining statistics questions involve identifying a hidden variable or factor not mentioned in the original stimulus that accounts for the surprising data.
  • Base rate effects are among the most common explanatory patterns; always consider whether different-sized populations could account for percentage differences.
  • The correct answer typically makes the statistical pattern seem expected or unsurprising rather than paradoxical.
  • Sampling bias explanations often identify how the measured group differs systematically from the population of interest.
  • Compositional changes explain how aggregate statistics can move in one direction while all subgroups move in the opposite direction.
  • Temporal lag explanations work by showing that current statistics reflect past conditions rather than present circumstances.
  • Multiple factor explanations are more likely correct when single-factor answers seem incomplete or only partially address the discrepancy.
  • The wrong answers in explaining statistics questions often restate the paradox, introduce irrelevant information, or explain the wrong aspect of the data.
  • Percentage changes can be misleading when absolute numbers are unknown—a large percentage of a small number may be less significant than a small percentage of a large number.
  • Self-selection bias occurs when people's decision to take part in a survey or measurement correlates with the characteristic being measured.
  • Survivorship bias is a specific form of sampling bias where only successful or persistent cases remain in the measured population.
  • The correct explanation must address the specific discrepancy or surprising element identified in the question stem, not just provide general information about the topic.

Common Misconceptions

Misconception: Any answer that provides additional information about the situation explains the statistics. → Correction: A genuine explanation must specifically account for what makes the statistical pattern surprising or counterintuitive. Merely adding related facts doesn't constitute an explanation unless those facts resolve the apparent discrepancy.

Misconception: The correct explanation will always involve complex statistical concepts or mathematical relationships. → Correction: Many correct explanations involve straightforward logical reasoning about what factors were overlooked in the initial presentation. The LSAT tests reasoning ability, not advanced statistical knowledge.

Misconception: If statistics show correlation between two factors, the explanation must involve a causal relationship between them. → Correction: Correlation can result from a third factor causing both measured variables, from reverse causation, from coincidence, or from sampling artifacts. Explanations often reveal why correlation exists without direct causation.

Misconception: Explaining statistics questions always present paradoxes or contradictions that need resolving. → Correction: While many questions present apparent paradoxes, some simply ask for the best explanation of a statistical pattern without implying that the pattern is surprising. The question stem determines what type of explanation is required.

Misconception: The explanation that accounts for the largest portion of the statistical difference is always correct. → Correction: The LSAT asks for the explanation that best accounts for the pattern, which may be the one that makes the pattern most comprehensible rather than the one with the largest quantitative effect. Qualitative fit matters more than magnitude.

Misconception: Demographic or background differences between groups are always relevant to explaining statistical differences. → Correction: Demographic factors only explain statistical patterns when those factors plausibly connect to the measured outcome. Irrelevant demographic differences don't constitute explanations even if they distinguish the groups.

Worked Examples

Example 1: Restaurant Revenue Paradox

Stimulus: A restaurant chain implemented a new menu with higher-priced items. Six months later, average revenue per customer visit increased by 15%, yet total restaurant revenue decreased by 8%. Which of the following, if true, best explains the apparent discrepancy?

Analysis Process:

First, identify what makes this statistical pattern surprising: revenue per visit increased, yet total revenue decreased. These seem contradictory because higher revenue per visit should produce higher total revenue.

Second, recognize that total revenue equals (revenue per visit) × (number of visits). If revenue per visit increased but total revenue decreased, the number of visits must have decreased substantially—by more than enough to offset the per-visit increase.

Third, evaluate what could cause such a decrease in visits. The question asks us to explain the discrepancy, so we need a factor that connects the menu change to reduced customer traffic.
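The identity in the second step, total revenue = (revenue per visit) × (number of visits), pins down how far visits must have fallen. The sketch below solves it for the stimulus figures (per-visit revenue up 15%, total revenue down 8%); the function name `implied_visit_change` is an illustrative assumption.

```python
def implied_visit_change(per_visit_change: float, total_change: float) -> float:
    """Solve (1 + total) = (1 + per_visit) * (1 + visits) for the visit change.

    All changes are fractions, e.g. 0.15 for +15% and -0.08 for -8%.
    """
    return (1 + total_change) / (1 + per_visit_change) - 1


# Figures from the stimulus: +15% revenue per visit, -8% total revenue.
visit_change = implied_visit_change(per_visit_change=0.15, total_change=-0.08)

print(f"Implied change in visits: {visit_change:.1%}")  # -20.0%
```

On test day this calculation is not needed; it is enough to see that visits must have dropped by more than per-visit revenue rose. The exact figure is useful mainly for checking that a proposed explanation is the right size.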

Correct Answer Pattern: "The higher prices caused a 20% reduction in customer visits, as price-sensitive customers chose competitors with lower-priced options."

Why This Works: This explanation reveals the hidden variable (number of visits) and shows how it changed in a way that produces the observed pattern. A 20% decrease in visits combined with a 15% increase in per-visit revenue yields an 8% decrease in total revenue (0.80 × 1.15 = 0.92). The explanation makes the apparently contradictory statistics comprehensible by revealing the trade-off between price and volume.

Wrong Answer Patterns:

  • "The new menu items cost more to prepare, reducing profit margins." (This explains profitability, not the revenue discrepancy)
  • "Customer satisfaction with food quality increased after the menu change." (This doesn't explain why total revenue decreased)
  • "The restaurant chain opened two new locations during this period." (This would increase total revenue, not decrease it)

Example 2: Educational Achievement Gap

Stimulus: In Midville, the achievement gap between high-income and low-income students narrowed significantly over the past decade, with low-income students' test scores rising faster than high-income students' scores. However, the district's overall average test score remained virtually unchanged during this period. Which of the following best explains this apparent discrepancy?

Analysis Process:

First, identify the puzzle: if one group (low-income students) improved significantly and another group (high-income students) improved at least somewhat, the overall average should increase. Why didn't it?

Second, recognize this as a potential compositional change problem. The overall average depends not just on each group's scores but also on the relative size of each group.

Third, consider what compositional change would produce this pattern. If the proportion of low-income students (who score lower on average despite improvement) increased substantially, this could offset the score improvements within each group.

Correct Answer Pattern: "The proportion of students from low-income families in Midville schools increased from 30% to 55% during this decade, as demographic patterns in the district shifted."

Why This Works: This is a compositional-change explanation, a close relative of Simpson's Paradox. Both groups improved, but the composition of the overall population shifted toward the lower-scoring group. Even though low-income students' scores rose, they still scored below the district average, so having more of them in the mix kept the overall average stable despite within-group improvements. The compositional change explains why aggregate statistics can remain flat while all subgroups improve.
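A quick weighted-average check shows how this can happen. In the sketch below, only the shift from 30% to 55% low-income enrollment comes from the example; the district size of 1,000 students and all four mean scores are hypothetical values chosen so that both groups improve and the gap narrows.

```python
def district_average(groups: dict[str, tuple[int, float]]) -> float:
    """Weighted mean score; each value is (number of students, mean score)."""
    total_students = sum(count for count, _ in groups.values())
    total_points = sum(count * score for count, score in groups.values())
    return total_points / total_students


# Hypothetical district of 1,000 students with made-up mean scores.
decade_start = {"low_income": (300, 60), "high_income": (700, 85)}
decade_end = {"low_income": (550, 70), "high_income": (450, 87)}

start_average = district_average(decade_start)
end_average = district_average(decade_end)

print(f"Start of decade: {start_average:.2f}")  # 77.50
print(f"End of decade:   {end_average:.2f}")    # 77.65
```

Under these assumptions low-income scores rise 10 points, high-income scores rise 2 points, and the gap narrows from 25 to 17, yet the district average moves by only 0.15 points because the lower-scoring group now carries more weight.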

Connection to Learning Objectives: This example demonstrates how explaining statistics requires recognizing that aggregate data can mask subgroup patterns, and how compositional changes serve as powerful explanatory mechanisms for counterintuitive statistical results.

Exam Strategy

When approaching LSAT explaining statistics questions, begin by carefully identifying what makes the statistical pattern surprising, counterintuitive, or apparently contradictory. The question stem and stimulus will signal this, often using phrases like "apparent discrepancy," "surprising result," or "although... nevertheless." Your task is to find the answer that makes this surprising pattern expected and comprehensible.

Trigger words and phrases that signal explaining statistics questions include:

  • "Which of the following, if true, best explains..."
  • "Which of the following best accounts for..."
  • "Which of the following, if true, most helps to resolve the apparent discrepancy described above..."
  • "Which of the following, if true, most helps to resolve the paradox..."
  • "The situation described above is best explained by which of the following..."

Develop a systematic approach: (1) Identify the specific statistical pattern that needs explanation, (2) Determine what makes it surprising by considering what you would normally expect, (3) Consider what hidden factors could account for the difference between expectation and reality, (4) Evaluate each answer choice by asking whether it makes the pattern comprehensible.

Process-of-elimination strategies specific to this question type:

  • Eliminate answers that merely restate the statistics without explaining them
  • Eliminate answers that explain the wrong aspect of the data (e.g., explaining why one number is high when the question asks about a comparison between two numbers)
  • Eliminate answers that introduce irrelevant information, even if that information is related to the general topic
  • Eliminate answers that would make the discrepancy worse rather than resolving it
  • Be suspicious of answers that seem to explain too much or that would completely eliminate the measured effect rather than explaining its pattern

Time allocation: These questions typically require 60-90 seconds. Spend 20-30 seconds fully understanding the statistical pattern and what makes it surprising, then 40-60 seconds evaluating answer choices. Don't rush the initial analysis; misunderstanding what needs explanation leads to selecting answers that address the wrong issue.

Exam Tip: The correct answer will almost always introduce new information not mentioned in the stimulus. If you find yourself choosing an answer that only rearranges or restates information already provided, reconsider whether it truly explains the pattern.

Memory Techniques

BASE - Remember the most common explanatory patterns:

  • Base rate effects (different population sizes)
  • Aggregation hiding subgroup patterns
  • Sampling bias (measured group ≠ target population)
  • External factors (hidden variables)

The Denominator Detective: When you see percentages or rates, always ask "percentage of what?" or "rate per what?" The denominator often holds the key to explanation. Visualize yourself as a detective investigating what population the statistics actually represent.

The Composition Question: For any aggregate statistic, mentally ask "What is this made of?" and "Has the mixture changed?" Visualize a pie chart where the slices represent subgroups—if the slice sizes change, the overall average changes even if each slice's internal value stays constant.

Temporal Telescope: For questions involving change over time, visualize looking through a telescope at the past. Current statistics often reflect past conditions, not present ones. Ask "When did the cause operate?" and "When was the effect measured?"

The Hidden Variable Hunt: Train yourself to automatically think "What factor isn't mentioned here?" when you see surprising statistics. The explanation usually involves something the stimulus didn't tell you about.

Summary

Explaining statistics represents a critical LSAT Logical Reasoning skill that requires understanding how aggregate data can produce counterintuitive patterns through base rate effects, sampling bias, compositional changes, multiple factor interactions, and temporal lag. Success on these questions depends on recognizing that statistics represent aggregated information where important details about subgroups, populations, and underlying mechanisms can be obscured. The correct explanation typically reveals a hidden factor or overlooked aspect of how the data was collected, calculated, or interpreted that makes an apparently surprising pattern comprehensible and expected. Rather than requiring advanced statistical knowledge, these questions test logical reasoning about what factors could account for numerical patterns, rewarding careful analysis of what makes data surprising and what additional information would resolve apparent discrepancies. Mastering this topic enhances performance across multiple question types while building analytical skills essential for legal reasoning.

Key Takeaways

  • Explaining statistics questions ask you to identify factors that make counterintuitive numerical patterns comprehensible and expected
  • Base rate effects (different population sizes) and compositional changes (shifts in group makeup) are among the most common explanatory mechanisms tested
  • The correct explanation introduces new information that specifically addresses what makes the statistical pattern surprising
  • Sampling bias explanations reveal how the measured group differs systematically from the population of interest
  • Aggregate statistics can mask subgroup patterns, producing apparent paradoxes that disappear when underlying structure is revealed
  • Wrong answers often restate the data, explain irrelevant aspects, or introduce information that doesn't resolve the specific discrepancy
  • Success requires identifying exactly what needs explanation before evaluating answer choices

Related Topics

Causation and Correlation: Understanding the distinction between causal relationships and mere statistical associations deepens your ability to evaluate whether explanations genuinely account for observed patterns or simply describe coincidental relationships. This topic extends explaining statistics by focusing on the mechanisms that produce statistical patterns.

Strengthen and Weaken Questions: Many strengthen/weaken questions involve statistical evidence, and the skills developed in explaining statistics directly transfer to evaluating what information would make statistical arguments more or less convincing. Mastering explaining statistics provides the foundation for these related question types.

Necessary and Sufficient Assumptions: Explaining statistics questions often implicitly test understanding of what assumptions underlie statistical interpretations. Recognizing these assumptions connects to the broader skill of identifying what must be true for an argument to work.

Formal Logic and Conditional Reasoning: While explaining statistics focuses on numerical patterns, the logical structure of explanations often involves conditional relationships (if X, then Y). Understanding formal logic enhances your ability to evaluate whether proposed explanations genuinely account for observed patterns.
