Descriptive statistics summarize and organize characteristics of a dataset.
- Mean: Average of the data
- Median: Middle value when sorted
- Mode: Most frequent value
- Standard Deviation (SD): Spread of data around the mean
- Variance: Average squared deviation from the mean (the square of the standard deviation)
- Range: Difference between max and min
- Interquartile Range (IQR): Range of the middle 50% of the data (Q3 minus Q1)
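The measures above can be sketched with Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration. Note that `variance` and `stdev` compute the sample versions (dividing by n - 1), and that the quartiles depend on the interpolation method, so other tools may report a slightly different IQR.

```python
import statistics

# Hypothetical sample, chosen only for illustration
data = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(data)          # arithmetic average
median = statistics.median(data)      # middle value of the sorted data
mode = statistics.mode(data)          # most frequent value
variance = statistics.variance(data)  # sample variance (divides by n - 1)
sd = statistics.stdev(data)           # square root of the sample variance
data_range = max(data) - min(data)    # max minus min

# quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(mean, median, mode, variance, round(sd, 3), data_range, iqr)
```

For population (rather than sample) figures, `statistics.pvariance` and `statistics.pstdev` divide by n instead.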
Inferential statistics make predictions or inferences about a population based on a sample.
- Hypothesis Testing: Assesses whether sample evidence is strong enough to reject a null hypothesis (e.g., t-tests, ANOVA)
- Confidence Intervals: Range of plausible values for a population parameter at a stated confidence level (e.g., 95%)
- P-value: Probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true
- Effect Size: Quantifies the magnitude of a difference (e.g., Cohen's d)
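To make the p-value and effect-size definitions concrete, here is a minimal sketch using only the standard library. It assumes two small, hypothetical independent groups and swaps in an exact permutation test for the t-test, because the permutation p-value follows directly from the definition: the share of group relabelings whose mean difference is at least as extreme as the observed one. Cohen's d is computed with a pooled standard deviation.

```python
import itertools
import math
import statistics

# Hypothetical measurements for two independent groups
group_a = [5.1, 4.9, 6.2, 5.8, 6.0]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1]

observed_diff = statistics.mean(group_a) - statistics.mean(group_b)

# Effect size: Cohen's d with a pooled standard deviation
n_a, n_b = len(group_a), len(group_b)
pooled_var = (
    (n_a - 1) * statistics.variance(group_a)
    + (n_b - 1) * statistics.variance(group_b)
) / (n_a + n_b - 2)
cohens_d = observed_diff / math.sqrt(pooled_var)

# Exact two-sided permutation p-value: enumerate every way of assigning
# n_a of the pooled values to group A and count the extreme relabelings
pooled = group_a + group_b
total = sum(pooled)
extreme = 0
count = 0
for idx in itertools.combinations(range(len(pooled)), n_a):
    sum_a = sum(pooled[i] for i in idx)
    diff = sum_a / n_a - (total - sum_a) / n_b
    count += 1
    # small tolerance guards against floating-point rounding
    if abs(diff) >= abs(observed_diff) - 1e-12:
        extreme += 1
p_value = extreme / count

print(f"difference={observed_diff:.2f}, d={cohens_d:.2f}, p={p_value:.4f}")
```

Full enumeration is only practical for small samples; for larger ones a random subset of relabelings, or a t-test from a statistics library, is the usual choice.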
- Normal Distribution: Continuous, symmetric, bell-shaped; defined by its mean and standard deviation
- Binomial Distribution: Number of successes in a fixed number of independent binary trials
- Poisson Distribution: Counts of events in a fixed interval
- Exponential Distribution: Time between events in a Poisson process
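The four distributions above can be illustrated with a few standard-library calls. This is a sketch: the helper names `binomial_pmf`, `poisson_pmf`, and `exponential_cdf` are hypothetical and written out from the textbook formulas, and the parameter values are arbitrary.

```python
import math
from statistics import NormalDist

# Normal: probability of falling within one SD of the mean (about 0.683)
standard_normal = NormalDist(mu=0, sigma=1)
p_within_1sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, rate):
    """P(exactly k events in an interval whose expected count is `rate`)."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def exponential_cdf(t, rate):
    """P(waiting time until the next event <= t) in a Poisson process."""
    return 1 - math.exp(-rate * t)

p_3_heads = binomial_pmf(3, 10, 0.5)     # 120 / 1024
p_no_events = poisson_pmf(0, 2.0)        # exp(-2)
p_wait_le_1 = exponential_cdf(1.0, 2.0)  # 1 - exp(-2)

print(p_within_1sd, p_3_heads, p_no_events, p_wait_le_1)
```

The last two values sum to 1, which reflects the link between the two distributions: seeing no events in one unit of time is the same as waiting longer than one unit for the first event.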
- Correlation: Measures the strength and direction of the linear relationship between two variables (e.g., Pearson's r)
- Simple Linear Regression: One independent variable predicting one dependent variable
- Multiple Linear Regression: Multiple predictors
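As a worked illustration of Pearson's r and simple linear regression, the sketch below computes both from the same sums of squares. The paired observations are hypothetical. Multiple regression is not covered here, since it is normally fitted with a linear-algebra or statistics library.

```python
import math

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products around the means
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)       # Pearson's r
slope = sxy / sxx                    # least-squares slope
intercept = mean_y - slope * mean_x  # least-squares intercept
r_squared = r ** 2                   # share of the variance in y explained by x

print(f"r={r:.4f}, y = {intercept:.2f} + {slope:.2f} * x, R^2={r_squared:.4f}")
```

On Python 3.10 or later, `statistics.correlation` and `statistics.linear_regression` provide the same quantities directly.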
- Chi-square Test: Tests association between categorical variables
- Contingency Table: Frequency distribution for categorical variables
- Odds Ratio / Relative Risk: Measures of association
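The chi-square test, odds ratio, and relative risk can all be read off a single 2x2 contingency table. The sketch below uses hypothetical counts and the Pearson statistic without a continuity correction. The p-value shortcut via `math.erfc` holds only for 1 degree of freedom (a 2x2 table); larger tables need a chi-square distribution from a statistics library.

```python
import math

# Hypothetical 2x2 contingency table of counts
# columns: outcome, no outcome
table = [
    [30, 70],  # exposed
    [15, 85],  # unexposed
]

(a, b), (c, d) = table
n = a + b + c + d
row_totals = [a + b, c + d]
col_totals = [a + c, b + d]

# Pearson chi-square statistic (no continuity correction)
chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed - expected) ** 2 / expected

# With 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

odds_ratio = (a * d) / (b * c)                 # odds of outcome, exposed vs unexposed
relative_risk = (a / (a + b)) / (c / (c + d))  # risk of outcome, exposed vs unexposed

print(f"chi2={chi2:.3f}, p={p_value:.4f}, OR={odds_ratio:.2f}, RR={relative_risk:.2f}")
```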
- Normality: Many parametric tests assume the data (or the model residuals) are approximately normally distributed
- Homoscedasticity: Equal variances across groups (or constant residual variance in regression)
- Independence: Observations should be independent
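- Random Sampling: Every member of the population has an equal chance of being selected
- Stratified Sampling: Dividing the population into subgroups (strata) and sampling randomly from each
- Cluster Sampling: Randomly selecting entire groups (clusters) and including their members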
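The three sampling schemes can be contrasted on one toy population. This is a sketch under stated assumptions: the population, its `region` field, and the sample sizes are all hypothetical, and regions serve as both the strata and the clusters purely to keep the example short.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100 members, each belonging to one of 4 regions
population = [{"id": i, "region": f"R{i % 4}"} for i in range(100)]

# Simple random sampling: every member is equally likely to be chosen
simple_sample = rng.sample(population, k=20)

# Group members by region once; reused by the two schemes below
by_region = {}
for member in population:
    by_region.setdefault(member["region"], []).append(member)

# Stratified sampling: draw a random sample from every stratum
stratified_sample = []
for members in by_region.values():
    stratified_sample.extend(rng.sample(members, k=5))

# Cluster sampling: pick whole clusters at random and keep all their members
chosen_regions = rng.sample(sorted(by_region), k=2)
cluster_sample = [m for region in chosen_regions for m in by_region[region]]

print(len(simple_sample), len(stratified_sample), len(cluster_sample))
```

Stratified sampling guarantees that every region is represented, while cluster sampling covers only the chosen regions but reaches every member within them.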
| Test | Use Case |
|---|---|
| T-test | Comparing the means of two groups |
| ANOVA | Comparing the means of three or more groups |
| Chi-square test | Association between categorical variables |
| Mann-Whitney U | Non-parametric comparison of two independent groups |
| Kruskal-Wallis | Non-parametric alternative to one-way ANOVA |
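To show what a rank-based test actually measures, here is a minimal sketch of the Mann-Whitney U statistic, computed by direct pairwise comparison. The function name `mann_whitney_u` and the data are hypothetical. The sketch returns only the statistic; turning it into a p-value requires the U distribution or a normal approximation, which a library routine such as `scipy.stats.mannwhitneyu` handles.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return the U statistic for sample_a, counting each tie as one half."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical scores for two independent groups
group_a = [12, 15, 11, 18, 14]
group_b = [22, 19, 24, 17, 21]

u_a = mann_whitney_u(group_a, group_b)  # pairs in which an A value beats a B value
u_b = mann_whitney_u(group_b, group_a)  # pairs in which a B value beats an A value

print(u_a, u_b)
```

The two statistics always sum to the number of pairs (here 5 x 5 = 25), and a U close to 0 or to that maximum indicates that one group tends to rank above the other.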