Three Types of Bayesian Forgetting for Online Learning
If you’re running a Bayesian model in a non-stationary environment, you need to forget old data. The obvious approach – scale the precision matrix by a constant – has a failure mode called covariance windup. This post works through three forgetting rules, ending with one borrowed from adaptive control that dominates the others.
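To make windup concrete before getting into the rules, here is a minimal sketch, assuming a 2-D Bayesian linear regression with known noise variance in which every input points along the first axis. The function name `run_constant_forgetting` and the parameters `lam` and `noise_var` are made up for illustration and are not taken from any library. The second direction never receives data, so constant scaling keeps shrinking its precision and the corresponding variance grows geometrically.

```python
# Sketch: constant (exponential) forgetting on a 2-D Gaussian posterior.
# All names here (lam, noise_var, run_constant_forgetting) are illustrative.

def run_constant_forgetting(steps, lam=0.95, noise_var=1.0):
    """Return the 2x2 precision matrix after `steps` updates.

    Every input is x = [1, 0], so the second coordinate is never excited.
    """
    # Start from an identity prior precision.
    precision = [[1.0, 0.0], [0.0, 1.0]]
    x = [1.0, 0.0]
    for _ in range(steps):
        for i in range(2):
            for j in range(2):
                # Forget: scale the old precision by lam.
                # Learn: add the rank-one term x x^T / noise_var.
                precision[i][j] = lam * precision[i][j] + x[i] * x[j] / noise_var
    return precision


precision = run_constant_forgetting(steps=200)

# The precision stays diagonal here, so each variance is just 1 / diagonal entry.
excited_var = 1.0 / precision[0][0]    # settles near (1 - lam) * noise_var
unexcited_var = 1.0 / precision[1][1]  # grows like lam ** -steps

print(f"variance along excited direction:   {excited_var:.4f}")
print(f"variance along unexcited direction: {unexcited_var:.1f}")
```

The excited direction reaches a steady state because new data replaces what forgetting removes. The unexcited direction has nothing to replace it, so its variance has no upper bound. That unbounded growth is what "windup" refers to.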