Probability: The foundation for generative modeling

Any data set we encounter was generated by some process, usually a process that involved the ingenuity, blood, sweat, and tears of an experimenter. If we want to learn something more general about nature from acquired data, we need to have a model for the data generation process. Rob Phillips said it beautifully in his book Physical Biology of the Cell: “Quantitative data demand quantitative models.” We call models that describe the process of generating data generative models.

We will see in the following lessons that building generative models requires the mathematical machinery of probability. We will proceed informally, but will nonetheless give useful working definitions of probability and related concepts, with an eye toward putting them to use for modeling and interpreting data.