Applied Statistics: From Data to Decisions
Monday, March 30, 2026
2,217 hospitals penalized for excess readmissions. which ones actually have a quality problem — and which just serve sicker patients?
source: CMS Hospital Readmissions Reduction Program
NBA shot locations, 2003 vs 2024. why did players eliminate the mid-range jumper?
source: Kirk Goldsberry / NBA shot chart data
Wealthfront manages $50B. which clients should harvest losses today to save on taxes?
source: Wealthfront blog, “10 Years of Tax-Loss Harvesting”
World Food Programme, 2018. how to reallocate food baskets in Yemen to feed 2M more people at the same cost?
source: WFP HungerMap LIVE / Zero Hunger Lab
NextEra is siting 4.5 GW of new solar across four states. which parcels maximize energy per dollar?
source: NREL Solar Resource Data (public domain)
Pfizer, November 2020. do 8 vs 162 cases among 43,000 trial participants prove 95% efficacy — enough for emergency authorization?
source: Polack et al., NEJM 2020
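where the 95% comes from, as a rough sketch: if we assume the vaccine and placebo arms were about the same size with similar follow-up (an assumption made here for illustration; the published estimate uses person-time), efficacy is roughly one minus the ratio of case counts.

```python
# Rough point estimate of vaccine efficacy from case counts alone.
# Assumption (ours): both arms are about equal in size and follow-up,
# so the ratio of case counts approximates the ratio of attack rates.
cases_vaccine = 8
cases_placebo = 162

efficacy = 1 - cases_vaccine / cases_placebo
print(f"estimated efficacy: {efficacy:.1%}")  # estimated efficacy: 95.1%
```

this gives a point estimate only. whether it is "enough" depends on how uncertain that estimate is, which is a question about inference.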
North Carolina, 2016. was a 10–3 Republican sweep from 53% of votes gerrymandering — or geographic luck?
source: Mattingly et al., Duke University
Zillow’s algorithm bought 9,790 homes in Q3 2021. why did it overpay on nearly all of them?
source: Yahoo Finance
Broward County uses COMPAS scores to set bail. why is the false positive rate 45% for Black defendants vs 23% for white?
source: ProPublica Machine Bias analysis
80% of Netflix viewing comes from recommendations. can a $1M algorithm improve predictions enough to save $1B/yr in churn?
source: Netflix Prize / matrix factorization
data, model, decision
academic
applied
every one required: data, model, decision
poll: PollEv.com/madeleineudell824

what kinds of consequential decisions do you expect to make in your career? what data will you have? what uncertainties will you face?
2 min think. 3 min share with a neighbor. 2 min class discussion.
can you trust it?
an AI hands you an analysis that says Hospital X should be fined. what questions do you ask before signing off?
2 min think. 3 min share with a neighbor. 2 min class discussion.
let’s dig into this one
ERR = predicted readmissions / expected readmissions
above 1.0 → more readmissions than expected → penalty
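a minimal sketch of the ratio and the penalty rule as stated on this slide. the function names and the counts are hypothetical, chosen only so the result matches the 1.05 used in the discussion.

```python
def excess_readmission_ratio(predicted, expected):
    """ERR = predicted readmissions / expected readmissions."""
    return predicted / expected


def is_penalized(err, threshold=1.0):
    """Slide rule: an ERR above 1.0 means more readmissions than expected."""
    return err > threshold


# Hypothetical counts for one hospital (not real CMS data).
err = excess_readmission_ratio(predicted=105, expected=100)
print(err, is_penalized(err))
```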
your hospital’s ERR is 1.05. why might your readmissions be so high? what questions would you ask, or what data would you gather, to understand why — and to figure out what you might do to lower them?
2 min think. 3 min share. 2 min class.
back to the notebook — .isna().sum()
which patients are most likely to be in the “too few” category? what does that mean for fairness?
2 min think. 3 min share. 2 min class.
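a small sketch of what `.isna().sum()` reports, using a made-up table. the column names, the values, and the "Too Few to Report" placeholder string are assumptions for illustration, not the real notebook's data.

```python
import pandas as pd

# Hypothetical rows; small hospitals have their counts suppressed.
df = pd.DataFrame({
    "hospital": ["A", "B", "C", "D"],
    "discharges": [412, None, 95, 1380],
    "readmissions": ["61", "Too Few to Report", "Too Few to Report", "198"],
})

# Turn the placeholder text into a true missing value (NaN).
df["readmissions"] = pd.to_numeric(df["readmissions"], errors="coerce")

# Count missing values in each column.
missing = df.isna().sum()
print(missing)
```

if the rows with missing counts are mostly small hospitals, dropping them silently changes which hospitals, and which patients, the analysis describes.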
same dataset, four different questions:
| | question | decision |
|---|---|---|
| summary | what does the ERR distribution look like? | which hospitals are outliers? |
| prediction | given a hospital’s traits, what ERR to expect? | should CMS flag this hospital? |
| inference | is an ERR of 1.05 real or noise? | should the hospital be fined? |
| causation | do fines actually reduce readmissions? | should CMS continue the program? |
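one way to make the inference row concrete, as a simplified sketch. it assumes a hospital whose true readmission rate equals its expected rate, treats each discharge as an independent coin flip, and asks how often the raw observed/expected ratio still reaches 1.05 by chance. the 15% base rate and the hospital sizes are hypothetical, and the raw ratio is a stand-in for the risk-adjusted ERR that CMS actually computes.

```python
from math import ceil, exp, lgamma, log


def binom_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p), summed in log space to avoid underflow."""
    log_p, log_q = log(p), log(1 - p)
    total = 0.0
    for k in range(k_min, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log_p + (n - k) * log_q)
        total += exp(log_pmf)
    return total


def chance_of_ratio_at_least(n_discharges, base_rate, ratio=1.05):
    """Chance that observed/expected reaches `ratio` when nothing is actually wrong."""
    expected = n_discharges * base_rate
    k_min = ceil(round(ratio * expected, 9))  # smallest count that reaches the ratio
    return binom_tail(n_discharges, base_rate, k_min)


# Hypothetical hospitals with the same true rate but different sizes.
small = chance_of_ratio_at_least(200, 0.15)
large = chance_of_ratio_at_least(20_000, 0.15)
print(f"small hospital: {small:.3f}")
print(f"large hospital: {large:.4f}")
```

under these assumptions the same ratio is much easier to reach by chance at the small hospital than at the large one, so the same ERR can call for different decisions.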
| act | theme | verbs | methods | lectures |
|---|---|---|---|---|
| I | build models | explore, clean, predict | regression, trees, features | Lec 1–7 |
| II | trust models | sample, test, infer | bootstrap, hypothesis tests | Lec 8–12 |
| III | see further | classify, cluster, cause | PCA, causal inference | Lec 13–19 |
| question | topic | act |
|---|---|---|
| hospital readmission penalties | EDA, hypothesis testing | I → II |
| NBA shot selection | EDA, conditional expected value | I |
| Wealthfront tax-loss harvesting | optimization, regression | I |
| WFP food allocation | linear algebra, optimization | I |
| NextEra solar farm siting | feature engineering, regression | I |
| Pfizer vaccine efficacy | hypothesis testing, multiple testing | II |
| NC gerrymandering | permutation tests | II |
| Zillow’s iBuying algorithm | regression, backtesting | II |
| COMPAS bail scores | classification, fairness | III |
| Netflix recommendations | PCA, SVD | III |
many fields study how to make consequential decisions with data.
each has its own methods, tools, and culture. they overlap significantly!
every example in the montage is a consequential decision


you’ll have the tools to answer all of these
