Analysis of Variance (ANOVA)

ANOVA tests whether the means of two or more groups within a population differ. The dependent variable is continuous and the independent variable is categorical.

Assumptions -

  • Observations within each group are approximately normally distributed (equivalently, the residuals are normal)
  • Groups have similar variances (homogeneity of variance)
  • Samples are random and independent
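The first two assumptions can be checked before running ANOVA. Below is a minimal sketch, assuming scipy is installed and using made-up measurements for three hypothetical groups. It applies the Shapiro-Wilk test for normality and Levene's test for equal variances; these are common choices, not the only ones.

```python
from scipy import stats

# Hypothetical measurements for three groups (made-up numbers for illustration)
group_a = [23.1, 25.4, 24.8, 26.0, 22.7, 25.1]
group_b = [27.9, 29.3, 28.4, 30.1, 27.2, 29.0]
group_c = [24.5, 23.8, 25.9, 24.1, 26.3, 25.0]
groups = [group_a, group_b, group_c]

# Shapiro-Wilk: the null hypothesis is that a sample comes from a normal distribution
shapiro_pvalues = []
for g in groups:
    _, p = stats.shapiro(g)
    shapiro_pvalues.append(p)

# Levene: the null hypothesis is that all groups have equal variance
levene_stat, levene_p = stats.levene(*groups)

print("Shapiro-Wilk p-values:", shapiro_pvalues)
print("Levene p-value:", levene_p)
```

A small p-value in either test suggests the corresponding assumption may not hold. With small samples these tests have little power, so plots of the data are a useful complement.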

ANOVA uses the F-statistic, the ratio of the variance among groups to the variance within groups (mean square between / mean square within). A large F-value, judged against the F-distribution with the appropriate degrees of freedom, leads to rejection of the null hypothesis. The null hypothesis states that all group means are equal. The alternative hypothesis states that at least one group mean differs from the others.
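To make the ratio concrete, here is a small sketch that computes the F-statistic by hand using only the Python standard library. The data and the helper name f_statistic are hypothetical, chosen so the arithmetic is easy to follow.

```python
from statistics import mean

# Hypothetical data: three groups of three observations each
samples = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [8.0, 9.0, 10.0],
]

def f_statistic(samples):
    """Return (F, df_between, df_within) for a list of groups."""
    all_values = [x for g in samples for x in g]
    grand_mean = mean(all_values)
    k = len(samples)      # number of groups
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in samples for x in g)

    df_between = k - 1
    df_within = n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

f_value, df_between, df_within = f_statistic(samples)
print(f"F = {f_value:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

For this data the sum of squares between groups is 24 and within groups is 6, so the mean squares are 12 and 1, giving F = 12 with (2, 6) degrees of freedom.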

One-way ANOVA: 1 continuous dependent variable; 1 categorical independent variable (> 2 levels)

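(* Special case - t-test: 1 continuous dependent variable; 1 categorical independent variable (2 levels). With two groups, one-way ANOVA is equivalent to the equal-variance two-sample t-test, with F = t².)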
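The link between the two-level case and the t-test can be checked numerically. The sketch below assumes scipy is installed and uses made-up data for two hypothetical groups. It compares the equal-variance two-sample t-test with a one-way ANOVA on the same data.

```python
from scipy import stats

# Hypothetical measurements for two groups (made-up numbers)
group_1 = [12.1, 13.4, 11.8, 14.0, 12.9, 13.3]
group_2 = [14.2, 15.1, 13.9, 16.0, 15.4, 14.8]

# Two-sample t-test; equal_var=True (the default) matches ANOVA's variance assumption
t_stat, t_p = stats.ttest_ind(group_1, group_2, equal_var=True)

# One-way ANOVA on the same two groups
f_stat, f_p = stats.f_oneway(group_1, group_2)

print("t squared:", t_stat ** 2)
print("F:        ", f_stat)
print("p-values: ", t_p, f_p)
```

The squared t-statistic matches the F-statistic and the two p-values agree, up to floating-point error.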

Two-way ANOVA: 1 continuous dependent variable; 2 categorical independent variables (designs with more than 2 factors are usually called N-way or factorial ANOVA)

Python Implementation: scipy.stats (f_oneway for one-way ANOVA), statsmodels (anova_lm for one-way and multi-way designs)
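As a minimal sketch of the scipy route, assuming scipy is installed, the example below runs a one-way ANOVA on three hypothetical groups. The numbers are made up and the 0.05 threshold is a conventional choice, not a rule.

```python
from scipy import stats

# Hypothetical data: three groups of three observations each
group_a = [4.0, 5.0, 6.0]
group_b = [6.0, 7.0, 8.0]
group_c = [8.0, 9.0, 10.0]

# f_oneway returns the F-statistic and its p-value
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

alpha = 0.05  # conventional significance level (an assumption, adjust as needed)
if p_value < alpha:
    print("Reject the null hypothesis: at least one group mean differs.")
else:
    print("Fail to reject the null hypothesis.")
```

A significant result only says that some group mean differs. Identifying which groups differ requires a follow-up (post hoc) comparison.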
