Start with cases, variables, units, tables, and the difference between data, information, and claims. You will practice turning everyday questions into data that can actually be analyzed.
Work with counts, categories, measurements, dates, and ordered ratings. This chapter shows how variable type controls the summaries, graphs, and models that make sense.
Use spreadsheets, data frames, code notebooks, and tidy data rules to keep datasets readable and safe to change. You will clean names, spot entry errors, and document what each column means.
Use dotplots, histograms, bar charts, boxplots, and scatterplots to see shape, spread, outliers, and relationships. The goal is to notice patterns before calculating too much.
Calculate means, medians, proportions, percentiles, variance, standard deviation, and z-scores. You will connect each number to the real-world question it helps answer.
Separate natural variation, measurement error, bias, and random noise. This chapter builds the habit of asking what could have produced the data besides the first explanation that comes to mind.
Follow how statistics grew from censuses, gambling problems, astronomy, agriculture, public health, industry, computing, and data science. This history explains why today’s methods care so much about uncertainty, design, evidence, and reproducibility.
Use events, sample spaces, probability rules, independence, and conditional probability. You will build probability statements from plain-language situations instead of memorizing formulas blindly.
Work with random variables, expected value, variance, and common probability models. Binomial, geometric, Poisson, uniform, normal, exponential, and related distributions become tools for matching data to real processes.
Use joint, marginal, and conditional distributions to describe several variables at once. This chapter covers covariance, correlation, dependence, and why association alone does not prove cause.
See why random samples behave predictably even when individual observations do not. You will use sampling distributions, standard error, and the central limit theorem as the bridge from data to inference.
Estimate unknown quantities with point estimates, margins of error, and confidence intervals. You will practice saying what an interval does and does not guarantee.
Use null hypotheses, test statistics, p-values, significance levels, and Type I and Type II errors to judge whether data are surprising under a claim. This chapter also covers practical significance so results do not become empty rituals.
Compare means, proportions, rates, and paired measurements. You will choose between common one-sample, two-sample, paired, and proportion tests based on the structure of the data.
Plan sample sizes before collecting data and interpret power after a study is done. This chapter shows how effect size, noise, cost, and ethical limits shape a realistic study.
Use random assignment, control groups, blocking, blinding, factorial designs, and replication. You will see how good experiments create stronger evidence than analysis alone can rescue.
Build questionnaires, sampling frames, stratified samples, cluster samples, weights, and nonresponse checks. This chapter covers the practical choices behind polls, official statistics, and field surveys.
Use bootstrap intervals, permutation tests, and simulation when formulas are hard or assumptions are shaky. You will resample real datasets to measure uncertainty directly.
Fit and interpret simple linear regression with slopes, intercepts, residuals, fitted values, and uncertainty. You will connect a line on a graph to a claim about prediction or association.
Add several predictors, categorical variables, interactions, and transformations to regression models. This chapter teaches adjustment, confounding control, and the danger of over-reading coefficients.
Check residual plots, leverage, outliers, multicollinearity, nonlinearity, unequal variance, and influential cases. You will revise models based on evidence instead of treating software output as final truth.
Model yes-or-no outcomes, counts, rates, and other non-normal data with logistic, Poisson, and related models. This chapter turns odds ratios, rate ratios, and predicted probabilities into plain language.
Compare several groups with ANOVA, planned contrasts, and multiple-comparison controls. You will connect these tools to experiments, product tests, classrooms, farms, and clinical studies.
Analyze contingency tables, chi-square tests, Fisher’s exact test, risk differences, risk ratios, and odds ratios. This chapter supports work with surveys, medical studies, quality checks, and social data.
Use rank-based tests and robust summaries when outliers, skew, or small samples make standard methods fragile. You will know when a simpler, assumption-light method is the better choice.
Work with clustering, principal components, factor analysis, and multidimensional scaling. This chapter shows how to reduce many variables into patterns people can inspect and discuss.
Analyze repeated observations over time with trend, seasonality, autocorrelation, smoothing, ARIMA ideas, and forecast checks. You will learn why time order changes the rules of ordinary regression.
Analyze time-to-event outcomes with censoring, Kaplan-Meier curves, hazard rates, and Cox regression. This chapter applies to medicine, reliability, churn, recidivism, and any setting where timing matters.
Handle nested and repeated data with random effects, partial pooling, and hierarchical models. You will model classrooms, clinics, stores, regions, users, and experiments where observations come in groups.
Use prior information, likelihoods, posterior distributions, credible intervals, and Bayes factors. This chapter builds Bayesian reasoning from probability rules and shows how it differs from frequentist inference.
Fit Bayesian models with MCMC, diagnostics, posterior predictive checks, and sensitivity analysis. You will see how tools like Stan, PyMC, and brms made complex Bayesian modeling practical.
Use DAGs, confounders, colliders, mediators, randomized trials, natural experiments, matching, weighting, difference-in-differences, instrumental variables, and regression discontinuity. This chapter gives a practical language for cause-and-effect claims.
Frame observational studies as if they were trials by defining eligibility, treatment, timing, outcomes, and follow-up. This modern workflow helps avoid hidden biases in health, policy, business, and platform data.
Detect missingness patterns and use complete-case analysis, weighting, single imputation, and multiple imputation. You will judge when missing data are a nuisance, a threat, or the main story.
Fit models with many predictors using cross-validation, ridge, lasso, elastic net, and feature selection. This chapter covers the high-dimensional problems common in genomics, text, sensors, marketing, and finance.
Use train-test splits, loss functions, calibration, ROC curves, precision-recall curves, and model comparison. You will connect statistical judgment to predictive modeling without treating prediction as magic.
Compare decision trees, random forests, gradient boosting, nearest neighbors, and support vector machines with statistical modeling habits. This chapter shows where machine learning extends statistics and where it creates new risks.
Analyze randomized online experiments with metrics, assignment units, guardrails, peeking risks, sequential tests, and heterogeneous effects. You will see how A/B testing works in products, marketing, policy pilots, and service delivery.
Use reproducible code, version control, notebooks, scripts, environments, data validation, and literate reports. This chapter turns one-off analysis into work another person can rerun and audit.
Protect people and organizations with de-identification limits, consent, secure data handling, fairness checks, and differential privacy. You will handle sensitive data with methods that match the risk.
Present uncertainty with interval plots, prediction displays, clear tables, plain-language caveats, and decision-focused summaries. This chapter helps you communicate results to people who must act on them.
Move from a real question to study design, data collection, cleaning, analysis, validation, reporting, and follow-up decisions. This end-to-end chapter ties together the habits of a working statistician.
Find mistakes before they spread by checking data lineage, assumptions, code, peer review, preregistration, replication, and sensitivity analyses. You will practice professional skepticism without becoming paralyzed.
Map the field’s paths in biostatistics, official statistics, social science, sports, finance, industry, data science, and research. This chapter covers portfolio projects, graduate routes, common tools, professional groups, certifications, and habits for staying current.