In the age of large streaming data it seems appropriate to revisit the foundations of what we think of as data modelling. In this talk I’ll argue that traditional statistical approaches based on parametric models and i.i.d. assumptions are inappropriate for the type of large-scale machine learning we now need to do on massive streaming data sets. I’ll argue instead for flexible non-parametric models. This presents a particular challenge: non-parametric models require storage of the entire data set, which is a problem for massive, streaming data. I’ll argue that recently proposed variational approximations allow us to retain the advantages of both non-parametric and parametric models within a consistent framework that performs an optimal compression of our data from an information gain perspective.