Welcome to the First Class of Econometrics!

Natasha Kang

Xiamen University, Chow Institute

March, 2026

What is Econometrics?

  • Coined by the influential Norwegian economist Ragnar Frisch in 1926.
  • “Econo-” from economics
  • “-metrics” from the Greek word metron, meaning measurement
  • It integrates economic theory, data, and statistical methods to understand real-world behavior.
  • Transforms economics from a qualitative discipline into a scientific, evidence-based field.

From Witchcraft to Science

  • Econometrics helps bring rigor and testability to economic thinking.
  • It gives economists the tools to analyze data and evaluate theories with evidence.
  • Still, economics faces unique challenges absent in the “hard sciences” (e.g., no labs, behavioral complexity).

What Makes Econometrics Challenging?

  • No labs: Economists can’t run controlled experiments the way chemists can — we observe the world as it is, not as we’d design it.
  • Everything affects everything: Education correlates with family background, income correlates with location. Isolating the effect of one factor is like untangling headphones.
  • People aren’t passive: A government taxes luxury goods to raise revenue — but consumers switch to substitutes, and revenue falls short. The behavior changed because of the policy.
  • All of this makes it difficult to establish causal relationships in economics.

Appreciate both the power and the limits of econometrics.

Why Study Econometrics?

  • Research: Test economic theories with data — e.g., does education cause higher income?
  • Policy: Evaluate interventions — e.g., did a tax cut stimulate growth, or would growth have happened anyway?
  • Industry: High demand for data analysis skills in finance, tech, consulting.
  • But AI can already do all of this — run regressions, interpret coefficients, even discuss identification.
So why bother learning it yourself?

How Fast Things Change

  • 2022: ChatGPT launches. Impressive, but the critique is clear: “It’s just pattern recognition; it doesn’t understand, it doesn’t ask why.”
  • 2023: GPT-4 and Claude arrive — they pass bar exams, write code, explain proofs. But they still hallucinate confidently.
  • 2024: Reasoning models emerge (OpenAI o1, Claude 3.5). AI begins to assess identification strategies and critique research designs.
  • 2025: Agentic AI (Claude Opus 4, o3) — autonomous research, extended reasoning, tool use. AI scores 87% on PhD-level science benchmarks.
  • 2026: …?

“Learn econometrics because AI can’t do it”

expires.

“Learn to think clearly”

doesn’t.

Econometrics as Cognitive Training

  • Knowing econometrics isn’t about being faster than a computer. It’s about being able to evaluate what the computer tells you.
  • An AI can produce a convincing analysis with a fatal flaw — wrong identification, violated assumptions, spurious correlation — and present it with complete confidence.
  • If you can’t assess the reasoning yourself, you have to take it on faith. That’s not analysis — that’s religion.
  • Intellectual autonomy: understanding what you know is what lets you recognize what you don’t — and that’s where real inquiry begins.

Two Common Pitfalls

1. Correlation is not causation.

  • Countries with more Nobel laureates consume more chocolate. Should we eat more chocolate?
  • The data is real. The pattern is real. The conclusion is nonsense.

2. The past does not necessarily predict the future.

  • A model that perfectly explains last year’s stock returns may tell you nothing about next year’s.
  • A good fit is not a good forecast.

This course trains you to catch both.
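
A minimal simulation sketch of the first pitfall. Everything here is hypothetical: the variable names, the common cause ("wealth"), and the data-generating process are assumptions chosen for illustration, not real country data. Chocolate consumption and laureate counts both depend on wealth and have no effect on each other, yet they come out correlated.

```python
import random

def corr(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(y, z):
    """Residuals from a least-squares regression of y on z (with intercept)."""
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    slope = (sum((a - mz) * (b - my) for a, b in zip(z, y))
             / sum((a - mz) ** 2 for a in z))
    return [b - my - slope * (a - mz) for a, b in zip(z, y)]

rng = random.Random(0)
n = 5000

# Assumed data-generating process: wealth drives both variables;
# chocolate has no effect on laureates, and vice versa.
wealth = [rng.gauss(0, 1) for _ in range(n)]
chocolate = [w + rng.gauss(0, 1) for w in wealth]
laureates = [w + rng.gauss(0, 1) for w in wealth]

raw = corr(chocolate, laureates)
# Correlation of what is left after removing the part explained by wealth.
partial = corr(residuals(chocolate, wealth), residuals(laureates, wealth))

print(f"raw correlation:                {raw:.2f}")
print(f"correlation net of wealth:      {partial:.2f}")
```

Under these assumptions the raw correlation should be near 0.5, while the correlation net of wealth should be near zero: the pattern is real, but it comes from the common cause.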

Learning Outcomes

  • Recognize when a pattern in data reflects a causal relationship — and when it doesn’t.
  • Understand why a model that fits the past may fail in the future.
  • Develop the judgment to evaluate empirical evidence — your own, others’, and AI’s.
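
A rough sketch of why a good fit on past data need not carry over to the future. The setup is an assumption made for illustration: returns are pure noise, and we search 1,000 equally meaningless candidate signals for the one that fits the past best. Real return data would behave differently in detail.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def r_squared(x, y, a, b):
    """1 - SSE/SST for the predictions a + b*x, evaluated on (x, y)."""
    my = sum(y) / len(y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - my) ** 2 for yi in y)
    return 1 - sse / sst

rng = random.Random(1)
T = 24  # periods in the "past" window; the "future" window has the same length

# Assumption: returns are unpredictable noise in both windows.
returns = [rng.gauss(0, 1) for _ in range(2 * T)]

best = None
for _ in range(1000):
    signal = [rng.gauss(0, 1) for _ in range(2 * T)]  # a meaningless predictor
    a, b = fit_line(signal[:T], returns[:T])          # fit on the past only
    fit = r_squared(signal[:T], returns[:T], a, b)
    if best is None or fit > best[0]:
        best = (fit, signal, a, b)

r2_in, signal, a, b = best
# Evaluate the winning model on data it has never seen.
r2_out = r_squared(signal[T:], returns[T:], a, b)

print(f"in-sample R^2 of best signal:  {r2_in:.2f}")
print(f"out-of-sample R^2:             {r2_out:.2f}")
```

The winning signal looks informative in the window used to select it, only because it was selected there. Out of sample, its R² can even be negative, meaning it predicts worse than the simple average.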

Course Logistics

  • My office hours:
    • Tuesdays: 2:00 PM – 4:00 PM (Room B211-A)
  • Course Materials:
    • No required textbook
    • Ask AI, learn from AI, but own your reasoning
  • Grading Policy:
    • In-Class Quizzes: 25%
    • Midterm Exam: 30%
    • Final Exam (comprehensive): 45%

Course Roadmap: Steps in Empirical Analysis

  1. Question: Define the research question or economic problem.
  2. Econometric Model: Specify a model to relate variables of interest.
  3. Hypothesis: Formulate a testable hypothesis about economic relationships.
  4. Estimation: Use data to estimate model parameters.
  5. Inference: Test hypotheses and draw conclusions about the population.
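
The five steps can be sketched end to end on simulated data. This is an illustration under stated assumptions, not a template for real research: the variables (`educ`, `log_wage`), the true coefficients, and the noise level are all made up, and the p-value uses a normal approximation, which is reasonable only because the sample is large.

```python
import random
from statistics import NormalDist

# 1. Question (hypothetical): is schooling associated with higher log wages?
# 2. Model: log_wage = b0 + b1 * educ + u
# 3. Hypothesis: H0: b1 = 0 against H1: b1 != 0

# Simulated data; the "true" values 1.0 and 0.08 are assumptions.
rng = random.Random(42)
n = 500
educ = [rng.uniform(8, 18) for _ in range(n)]
log_wage = [1.0 + 0.08 * e + rng.gauss(0, 0.5) for e in educ]

# 4. Estimation: ordinary least squares for a single regressor.
mx = sum(educ) / n
my = sum(log_wage) / n
sxx = sum((x - mx) ** 2 for x in educ)
b1 = sum((x - mx) * (y - my) for x, y in zip(educ, log_wage)) / sxx
b0 = my - b1 * mx

# 5. Inference: standard error, t-statistic, two-sided p-value.
resid = [y - b0 - b1 * x for x, y in zip(educ, log_wage)]
sigma2 = sum(r * r for r in resid) / (n - 2)   # residual variance estimate
se_b1 = (sigma2 / sxx) ** 0.5
t_stat = b1 / se_b1
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))  # normal approximation

print(f"b1 = {b1:.3f}, se = {se_b1:.4f}, t = {t_stat:.1f}, p = {p_value:.4f}")
```

Here the slope is causal only because the simulation was built that way. With observational data, the same arithmetic runs just as smoothly whether or not the estimate means what we hope it means.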

Where Does Data Come From?

Data can come from two primary sources:

  • Experimental Data
    • Collected in controlled environments (e.g., labs)
    • Common in natural sciences
    • Rare in economics due to cost or ethical concerns
  • Observational Data
    • Collected via surveys or field data without control over assignment
    • Common in economics and social sciences
    • Poses challenges for causal inference (e.g., endogeneity, omitted variables)

Cross-sectional Data

  • Snapshot at a single point in time
  • Observations are typically assumed to be independent across units (e.g., individuals, firms)

Pooled Cross Sections

  • Multiple cross-sectional samples from different time periods
  • Different individuals in each round

Time Series Data

  • Observations on the same variable over time
  • One unit (e.g., country, stock, region)

Panel (Longitudinal) Data

  • Track the same units over multiple time periods
  • Richer structure allows control for unobserved heterogeneity
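
One way to see what "control for unobserved heterogeneity" can mean is a two-period simulation. The model below is an assumption for illustration: each unit has a fixed, unobserved effect `a` that raises both `x` and `y`. Pooling all observations mixes that effect into the slope; differencing each unit's two periods removes it.

```python
import random

def ols_slope(x, y):
    """Slope from a least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(3)
beta = 2.0        # assumed true effect of x on y
n_units = 2000

x1, x2, y1, y2 = [], [], [], []
for _ in range(n_units):
    a = rng.gauss(0, 1)            # unobserved, time-invariant unit effect
    xa = a + rng.gauss(0, 1)       # period-1 x, correlated with a
    xb = a + rng.gauss(0, 1)       # period-2 x, correlated with a
    x1.append(xa)
    x2.append(xb)
    y1.append(beta * xa + a + rng.gauss(0, 1))
    y2.append(beta * xb + a + rng.gauss(0, 1))

# Pooled regression ignores that the same units appear twice.
pooled = ols_slope(x1 + x2, y1 + y2)

# First differences: a is the same in both periods, so it cancels.
dx = [b - a for a, b in zip(x1, x2)]
dy = [b - a for a, b in zip(y1, y2)]
first_diff = ols_slope(dx, dy)

print(f"true beta:              {beta:.2f}")
print(f"pooled slope:           {pooled:.2f}")
print(f"first-difference slope: {first_diff:.2f}")
```

Under these assumptions the pooled slope should land near 2.5 and the first-difference slope near 2.0. This only works because the unobserved effect does not change over time; a confounder that varies over time would not cancel.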

Summary: Types of Data

Data can be classified into four main types:

  • Cross-sectional: one-time snapshot across many units
  • Pooled Cross Sections: multiple snapshots across time, different units
  • Time Series: one unit over time
  • Panel: same units tracked over time

From Population to Sample

  • We rarely observe the entire population — we work with samples.
  • A random sample: every unit has an equal chance of being selected and draws are independent, so the observations \(Y_i\) are i.i.d.
  • The sample must be representative and large enough for reliable inference.

Practical Considerations

  • Random sampling requires a complete list of the population (a sampling frame).
  • Even with a good frame, nonresponse and other practical problems can introduce bias.

Sampling Variability

  • Even well-drawn samples give different estimates each time.

Example: Estimating average household income

  • Sample 1: ¥10,800
  • Sample 2: ¥11,400
  • Sample 3: ¥11,000
  • This is why we don’t just report a number — we need standard errors and confidence intervals.
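
A small simulation sketch of sampling variability. The population below is invented (a lognormal income distribution with a mean around 11,000), so the numbers will not match the three samples above. It draws repeated random samples, reports each mean with its standard error and an approximate 95% confidence interval, and checks how often the interval contains the population mean.

```python
import math
import random
import statistics

rng = random.Random(7)

# Hypothetical population of 20,000 household incomes (assumed lognormal).
population = [rng.lognormvariate(math.log(11000) - 0.125, 0.5)
              for _ in range(20000)]
pop_mean = statistics.mean(population)

def mean_with_ci(sample):
    """Sample mean, its standard error, and an approximate 95% interval."""
    n = len(sample)
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return m, se, (m - 1.96 * se, m + 1.96 * se)

# Draw 500 independent random samples of 400 households each.
estimates = [mean_with_ci(rng.sample(population, 400)) for _ in range(500)]

# The first three samples give three different answers.
for m, se, (lo, hi) in estimates[:3]:
    print(f"mean = {m:,.0f}  se = {se:,.0f}  95% CI = [{lo:,.0f}, {hi:,.0f}]")

# Share of intervals that contain the true population mean.
coverage = sum(lo <= pop_mean <= hi for _, _, (lo, hi) in estimates) / len(estimates)
print(f"population mean = {pop_mean:,.0f}, coverage = {coverage:.2f}")
```

No single sample mean equals the population mean, but roughly 95% of the intervals should contain it. That is the sense in which a standard error tells you how much to trust one number.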

What’s Next?

Course Roadmap: Steps in Empirical Analysis

  1. Question: Define the research question or problem.
  2. Econometric Model: Specify the model for data analysis.
  3. Formulate a Hypothesis: Develop a testable hypothesis of interest.
  4. Estimate the Model with Data: Use econometric methods to estimate model parameters.
  5. Inference/Hypothesis Testing: Perform statistical tests for making inferences about the population.

Research Question: Prediction vs. Causality

  • Prediction:
    • Focuses on forecasting an outcome using observed data.
    • Pattern recognition and trend analysis.
    • Example: Predicting next quarter’s sales from past trends.
  • Causality:
    • Aims to establish cause-and-effect relationships.
    • Intervention analysis and counterfactual reasoning.
    • Example: The effect of a policy change on unemployment.
  • Both types of questions are important, but they require different methods and assumptions in empirical analysis.

Causal Inference in Economics

  • Essential for policy design and assessing economic impacts.

Challenges in Causal Inference

  • Internal Validity: Are we estimating the true causal effect within our sample?
    • Requires ruling out alternative explanations (confounders).
    • Example: Does education really cause higher income, or is it just correlated with ability?
  • External Validity: Will our findings generalize to other settings?
    • Results may differ due to changes in population characteristics, local institutions, or implementation quality.
    • Example: A policy that worked in Beijing may not work in rural Yunnan.

Any Questions?