Welcome to the First Class of Econometrics!
Natasha Kang
Xiamen University, Chow Institute
March 2026
What is Econometrics?
- The term was coined by the influential Norwegian economist Ragnar Frisch in the 1930s.
- “Econo-” from economics
- “-metrics” from the Greek word metron, meaning measurement
- It integrates economic theory, data, and statistical methods to understand real-world behavior.
- Transforms economics from a qualitative discipline into a scientific, evidence-based field.
From Witchcraft to Science
- Econometrics helps bring rigor and testability to economic thinking.
- It gives economists the tools to analyze data and evaluate theories with evidence.
- Still, economics faces unique challenges absent in the “hard sciences” (e.g., no labs, behavioral complexity).
What Makes Econometrics Challenging?
- No labs: Economists can’t run controlled experiments the way chemists can — we observe the world as it is, not as we’d design it.
- Everything affects everything: Education correlates with family background, income correlates with location. Isolating the effect of one factor is like untangling headphones.
- People aren’t passive: A government taxes luxury goods to raise revenue — but consumers switch to substitutes, and revenue falls short. The behavior changed because of the policy.
- All of this makes it difficult to establish causal relationships in economics.
Appreciate both the power and the limits of econometrics.
Why Study Econometrics?
- Research: Test economic theories with data — e.g., does education cause higher income?
- Policy: Evaluate interventions — e.g., did a tax cut stimulate growth, or would growth have happened anyway?
- Industry: High demand for data analysis skills in finance, tech, consulting.
- But AI can already do all of this — run regressions, interpret coefficients, even discuss identification.
So why bother learning it yourself?
How Fast Things Change
- 2022: ChatGPT launches. Impressive, but the critique is clear: “It’s just pattern recognition. It doesn’t understand, it doesn’t ask why.”
- 2023: GPT-4 and Claude arrive — they pass bar exams, write code, explain proofs. But they still hallucinate confidently.
- 2024: Reasoning models emerge (OpenAI o1, Claude 3.5). AI begins to assess identification strategies and critique research designs.
- 2025: Agentic AI (Claude Opus 4, o3) — autonomous research, extended reasoning, tool use. AI scores 87% on PhD-level science benchmarks.
“Learn econometrics because AI can’t do it” expires.
“Learn to think clearly” doesn’t.
Econometrics as Cognitive Training
- Knowing econometrics isn’t about being faster than a computer. It’s about being able to evaluate what the computer tells you.
- An AI can produce a convincing analysis with a fatal flaw — wrong identification, violated assumptions, spurious correlation — and present it with complete confidence.
- If you can’t assess the reasoning yourself, you have to take it on faith. That’s not analysis — that’s religion.
- Intellectual autonomy: understanding what you know is what lets you recognize what you don’t — and that’s where real inquiry begins.
Two Common Pitfalls
1. Correlation is not causation.
   - Countries with more Nobel laureates consume more chocolate. Should we eat more chocolate?
   - The data is real. The pattern is real. The conclusion is nonsense.
2. The past does not necessarily predict the future.
   - A model that perfectly explains last year’s stock returns tells you nothing about next year’s.
   - A good fit is not a good forecast.
This course trains you to catch both.
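Both pitfalls can be reproduced with simulated data. The sketch below is illustrative only: the variable `wealth` is a made-up common cause, the 25 predictors are pure noise, and all sample sizes and seeds are arbitrary choices. In the first part, chocolate has no effect on laureates by construction, yet the two are correlated (the population correlation is 0.5) until the common cause is accounted for. In the second part, a model fitted to noise looks good in-sample and does poorly on fresh data.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Pitfall 1: correlation without causation ---
n = 5_000
wealth = rng.normal(size=n)              # hypothetical common cause
chocolate = wealth + rng.normal(size=n)  # depends on wealth only
laureates = wealth + rng.normal(size=n)  # depends on wealth only, not on chocolate

raw_corr = np.corrcoef(chocolate, laureates)[0, 1]

def residualize(y, x):
    """Return the part of y left over after a linear fit on x."""
    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return (y - y.mean()) - slope * (x - x.mean())

# Correlation between what remains once wealth is removed from both series.
partial_corr = np.corrcoef(
    residualize(chocolate, wealth), residualize(laureates, wealth)
)[0, 1]

# --- Pitfall 2: a good fit is not a good forecast ---
n_train, n_test, k = 30, 1_000, 25
X_train = rng.normal(size=(n_train, k))  # 25 irrelevant predictors
y_train = rng.normal(size=n_train)       # "returns" that are pure noise
X_test = rng.normal(size=(n_test, k))    # fresh data from the same process
y_test = rng.normal(size=n_test)

coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
mse_train = np.mean((y_train - X_train @ coef) ** 2)
mse_test = np.mean((y_test - X_test @ coef) ** 2)

print(f"raw correlation:            {raw_corr:.2f}")
print(f"correlation given wealth:   {partial_corr:.2f}")
print(f"in-sample mean sq. error:   {mse_train:.2f}")
print(f"out-of-sample mean sq. err: {mse_test:.2f}")
```

With 25 free coefficients and only 30 observations, the in-sample fit is flattering by design; the out-of-sample error is the honest measure.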
Learning Outcomes
- Recognize when a pattern in data reflects a causal relationship — and when it doesn’t.
- Understand why a model that fits the past may fail in the future.
- Develop the judgment to evaluate empirical evidence — your own, others’, and AI’s.
Course Logistics
- My office hours:
  - Tuesdays: 2:00 PM – 4:00 PM (Room B211-A)
- Course Materials:
  - No required textbook
  - Ask AI, learn from AI, but own your reasoning
- Grading Policy:
  - In-Class Quizzes: 25%
  - Midterm Exam: 30%
  - Final Exam (comprehensive): 45%
Course Roadmap: Steps in Empirical Analysis
- Question — Define the research question or economic problem.
- Econometric Model — Specify a model to relate variables of interest.
- Hypothesis — Formulate a testable hypothesis about economic relationships.
- Estimation — Use data to estimate model parameters.
- Inference — Test hypotheses and draw conclusions about the population.
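The five steps can be walked through on a toy example. The sketch below uses simulated data, so the "true" slope of 0.08 is an arbitrary value chosen for illustration, and the variable names `educ` and `log_wage` are hypothetical. It estimates a simple linear model by ordinary least squares and tests whether the slope is zero, assuming homoskedastic errors.

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1, Question: do more years of schooling go with higher log wages?
# Simulated data; the slope 0.08 is an assumption made for this example.
n = 2_000
educ = rng.integers(8, 21, size=n).astype(float)  # 8 to 20 years of schooling
log_wage = 1.0 + 0.08 * educ + rng.normal(scale=0.5, size=n)

# Step 2, Econometric model: log_wage_i = beta0 + beta1 * educ_i + u_i
X = np.column_stack([np.ones(n), educ])

# Step 3, Hypothesis: H0: beta1 = 0 against H1: beta1 != 0

# Step 4, Estimation: ordinary least squares
beta_hat, *_ = np.linalg.lstsq(X, log_wage, rcond=None)
resid = log_wage - X @ beta_hat

# Step 5, Inference: homoskedastic standard error and t-statistic for beta1
sigma2_hat = resid @ resid / (n - X.shape[1])
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)
se_beta1 = np.sqrt(cov_beta[1, 1])
t_stat = beta_hat[1] / se_beta1
reject_h0 = abs(t_stat) > 1.96  # two-sided test at roughly the 5% level

print(f"beta1_hat = {beta_hat[1]:.3f}, se = {se_beta1:.3f}, t = {t_stat:.1f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Because the data are simulated, the estimate recovers the slope that was built in. With real observational data, the same code would run just as smoothly, and the hard part would be arguing that the estimate means what we want it to mean.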
Where Does Data Come From?
Data can come from two primary sources:
- Experimental Data
  - Collected in controlled environments (e.g., labs)
  - Common in natural sciences
  - Rare in economics due to cost or ethical concerns
- Observational Data
  - Collected via surveys or field data without control over assignment
  - Common in economics and social sciences
  - Poses challenges for causal inference (e.g., endogeneity, omitted variables)
Cross-sectional Data
- Snapshot at a single point in time
- Observations are typically assumed to be independent across units (e.g., individuals, firms)
Pooled Cross Sections
- Multiple cross-sectional samples from different time periods
- Different individuals in each round
Time Series Data
- Observations on the same variable over time
- One unit (e.g., country, stock, region)
Panel (Longitudinal) Data
- Track the same units over multiple time periods
- Richer structure allows control for unobserved heterogeneity
Summary: Types of Data
Data can be classified into four main types:
- Cross-sectional: 1-time snapshot across many units
- Pooled Cross Sections: multiple snapshots across time, different units
- Time Series: 1 unit over time
- Panel: same units tracked over time
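One way to see the difference between the four types is to look only at which units are observed in which periods. The helper below is a hypothetical illustration, not a standard function: it takes one `(unit_id, period)` pair per observation and applies a simplified rule (for instance, a dataset where only some units reappear is labeled as pooled cross sections).

```python
def classify(records):
    """Classify a dataset from its (unit_id, period) pairs, one per observation."""
    records = list(records)
    units = {unit for unit, _ in records}
    periods = {period for _, period in records}

    if len(periods) == 1:
        return "cross-sectional"   # many units, one snapshot
    if len(units) == 1:
        return "time series"       # one unit followed over time

    # Several units and several periods: check whether units are re-observed.
    periods_by_unit = {}
    for unit, period in records:
        periods_by_unit.setdefault(unit, set()).add(period)
    every_unit_repeats = all(len(p) > 1 for p in periods_by_unit.values())
    return "panel" if every_unit_repeats else "pooled cross sections"


cross_section = [("A", 2024), ("B", 2024), ("C", 2024)]
pooled = [("A", 2023), ("B", 2023), ("C", 2024), ("D", 2024)]
time_series = [("China", 2021), ("China", 2022), ("China", 2023)]
panel = [("A", 2023), ("B", 2023), ("A", 2024), ("B", 2024)]

for name, data in [("cross_section", cross_section), ("pooled", pooled),
                   ("time_series", time_series), ("panel", panel)]:
    print(f"{name}: {classify(data)}")
```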
From Population to Sample
- We rarely observe the entire population — we work with samples.
- A random sample: every unit in the population has an equal chance of being selected, so the observations \(Y_1, \dots, Y_n\) are i.i.d.
- The sample must be representative and large enough for reliable inference.
Practical Considerations
- Random sampling requires a complete list of the population (a sampling frame).
- Even with a good frame, nonresponse and other practical obstacles can introduce bias.
Sampling Variability
- Even well-drawn samples give different estimates each time.
Example: Estimating average household income
- Sample 1: ¥10,800
- Sample 2: ¥11,400
- Sample 3: ¥11,000
- This is why we don’t just report a number — we need standard errors and confidence intervals.
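Sampling variability is easy to simulate. The sketch below assumes, purely for illustration, that household income follows a lognormal distribution with parameters chosen so the population mean is roughly ¥11,200; these numbers are not from any real survey. It draws repeated i.i.d. samples, reports each sample mean with its standard error \(s/\sqrt{n}\) and an approximate 95% confidence interval, then checks how often such intervals contain the true mean.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed income distribution (lognormal); parameters are illustrative only.
MU, SIGMA = 9.2, 0.5
true_mean = np.exp(MU + SIGMA**2 / 2)  # mean of a lognormal distribution

def sample_summary(n):
    """Draw an i.i.d. sample of size n; return mean, standard error, 95% CI."""
    sample = rng.lognormal(mean=MU, sigma=SIGMA, size=n)
    mean = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(n)
    return mean, se, (mean - 1.96 * se, mean + 1.96 * se)

# Three samples of 400 households give three different estimates.
estimates = []
for i in range(3):
    mean, se, (low, high) = sample_summary(400)
    estimates.append(mean)
    print(f"Sample {i + 1}: mean = {mean:,.0f}, se = {se:,.0f}, "
          f"95% CI = [{low:,.0f}, {high:,.0f}]")

# Repeat many times: what share of the intervals contains the true mean?
n_reps = 1_000
hits = 0
for _ in range(n_reps):
    _, _, (low, high) = sample_summary(400)
    hits += low <= true_mean <= high
coverage = hits / n_reps
print(f"true mean = {true_mean:,.0f}, CI coverage = {coverage:.1%}")
```

The coverage should come out close to the nominal 95%. It is only approximate because the interval relies on a normal approximation and incomes are skewed.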
What’s Next?
Course Roadmap: Steps in Empirical Analysis
- Question: Define the research question or problem.
- Econometric Model: Specify the model for data analysis.
- Hypothesis: Develop a testable hypothesis of interest.
- Estimation: Use econometric methods to estimate model parameters.
- Inference: Perform statistical tests to draw conclusions about the population.
Research Question: Prediction vs. Causality
- Prediction:
  - Focuses on forecasting an outcome using observed data.
  - Pattern recognition and trend analysis.
  - Example: Predicting next quarter’s sales from past trends.
- Causality:
  - Aims to establish cause-and-effect relationships.
  - Intervention analysis and counterfactual reasoning.
  - Example: The effect of a policy change on unemployment.
- Both types of questions are important, but they require different methods and assumptions in empirical analysis.
Causal Inference in Economics
- Causal inference is essential for designing policy and assessing its economic impact.
Challenges in Causal Inference
- Internal Validity: Are we estimating the true causal effect within our sample?
  - Requires ruling out alternative explanations (confounders).
  - Example: Does education really cause higher income, or do more able people simply get more education and also earn more?
- External Validity: Will our findings generalize to other settings?
  - Results may differ due to changes in population characteristics, local institutions, or implementation quality.
  - Example: A policy that worked in Beijing may not work in rural Yunnan.
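The education and ability example can be made concrete with a simulation of omitted variable bias. Everything in the sketch below is an assumption made for illustration: the true effect of education on log income is set to 0.05, ability raises both education and income, and ability is observable only because the data are simulated. Regressing income on education alone then overstates the effect; with these numbers the short regression slope converges to 0.05 + 0.30 × 2/5 = 0.17.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000

# Simulated world: ability affects both schooling and income.
ability = rng.normal(size=n)
educ = 12 + 2 * ability + rng.normal(size=n)
log_income = 1.0 + 0.05 * educ + 0.30 * ability + rng.normal(scale=0.5, size=n)

def ols(y, *regressors):
    """OLS coefficients of y on a constant plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Short regression omits ability; long regression controls for it.
slope_short = ols(log_income, educ)[1]
slope_long = ols(log_income, educ, ability)[1]

print(f"education slope, ability omitted:  {slope_short:.3f}")
print(f"education slope, ability included: {slope_long:.3f}")
```

In practice ability is not observed, which is exactly why internal validity is hard: the long regression is not available, and other strategies are needed to rule out the confounder.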
Any Questions?