23 What about machine learning and artificial intelligence?

This course is about statistical inference, and, in the service of handling data sets for inference procedures, data science. But these days machine learning and artificial intelligence are at the forefront of any discussion of working with data. So, you may ask, “How is all of this statistical inference we are doing related to ML and AI?”

It is a fair question. The answer requires us to clarify what we mean by data science, statistical inference, machine learning, and artificial intelligence. As far as I can tell, there are no clear definitions of these terms, so I will give working definitions for our purposes to clarify what we are trying to do in this workshop.

In the previous section, we described statistical inference and how it fits into the process of doing science, so we now proceed to define the other terms.

23.1 What is data science?

We will define data science as a giant catch-all for anything involving data, including but not limited to:

Data management
Data storage
Data organization
Data visualization
Statistical inference
Machine learning
Artificial intelligence

The possible exception is data acquisition, which falls under experimental science.

23.2 What is machine learning?

I like the simple definition put forward by Sabera Talukder in the first edition of Caltech’s DataSAI workshop, which I paraphrase as

Machine learning involves using data to inform machines how to perform tasks.

It is important to understand the key terms in the above: “data,” “tasks,” and “inform.” First, data can be observations (quantitative, qualitative, or categorical), either synthetic or measured. Some common tasks involve labeling data, categorizing (clustering) data, and making predictions. Informing a machine how to do tasks typically involves proposing a model for how data are generated and then finding good parameters for that model.

This definition of machine learning is a concise form of that put forth by Tom Mitchell in his classic book entitled Machine Learning:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

“Experience” here is what we are calling data, and Mitchell’s word “learn” is what we have described as informing a machine. The added notion here is the performance measure, which is a way of assessing how well a machine accomplishes a task. Mitchell says that E, T, and P all need to be clearly defined to specify a machine learning problem.
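To make Mitchell's three ingredients concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the course material: the task T is labeling a one-dimensional measurement as belonging to one of two classes, the experience E is a set of labeled training points drawn from two Normal distributions, and the performance measure P is the fraction of held-out points labeled correctly. The function names (`make_data`, `learn_centroids`, `classify`, `accuracy`, `mean_accuracy`) are hypothetical, and the nearest-centroid classifier is just one simple choice of model.

```python
import random


def make_data(n_per_class, rng):
    """Experience E: labeled 1D data. Class 0 ~ Normal(-1, 1), class 1 ~ Normal(+1, 1)."""
    data = [(rng.gauss(-1.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(1.0, 1.0), 1) for _ in range(n_per_class)]
    return data


def learn_centroids(train):
    """'Learning': estimate one parameter per class, namely the class mean."""
    centroids = {}
    for label in (0, 1):
        values = [x for x, lab in train if lab == label]
        centroids[label] = sum(values) / len(values)
    return centroids


def classify(x, centroids):
    """Task T: assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids, test):
    """Performance measure P: fraction of held-out points labeled correctly."""
    n_correct = sum(classify(x, centroids) == label for x, label in test)
    return n_correct / len(test)


def mean_accuracy(n_train_per_class, n_trials=300, n_test_per_class=100, seed=0):
    """Average P over many independent training sets of a given size."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        centroids = learn_centroids(make_data(n_train_per_class, rng))
        total += accuracy(centroids, make_data(n_test_per_class, rng))
    return total / n_trials


# More experience (larger training sets) should give better average performance.
results = {n: mean_accuracy(n) for n in (1, 10, 100)}
for n, acc in results.items():
    print(f"{n:>3} training points per class: mean accuracy {acc:.3f}")
```

Averaging over many trials matters here: any single small training set might get lucky, so the statement "performance improves with experience" is best read as a statement about typical behavior.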

As we will see, both the invention of a generative model and the learning by (a.k.a. informing of) the machine are essentially problems in statistical inference. Furthermore, performance measures are often also defined within the statistical inference procedures.

In my view, then, machine learning is really statistical inference with a specific task in mind. As we have seen in our view of the cycle of science, given a model describing the data generation process, statistical inference provides us with a plausible set of parameter values as well as a measure of the plausibility of the model itself. We cross into machine learning when we use the parameters we have acquired to perform a task on new data.
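The distinction between inference and the subsequent task can be sketched in a few lines of Python. This is only an illustration under assumptions I am making here, not a method prescribed by the course: the generative model is a straight line with Normally distributed noise, the "true" parameter values are made up, and the hypothetical functions `fit_line` and `predict` stand in for the inference step and the task, respectively. Under this model, the least-squares estimates coincide with the maximum likelihood estimates of the slope and intercept.

```python
import random


def fit_line(x, y):
    """Inference step: least-squares estimates of slope and intercept.

    Under the assumed model y = slope * x + intercept + Normal noise,
    these are also the maximum likelihood estimates.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict(x_new, slope, intercept):
    """Task step: push new inputs through the model with the inferred parameters."""
    return [slope * xi + intercept for xi in x_new]


# Synthetic data from the assumed generative model (made-up parameter values).
rng = random.Random(3252)
true_slope, true_intercept, sigma = 2.0, 1.0, 0.5
x = [i / 10 for i in range(100)]
y = [true_slope * xi + true_intercept + rng.gauss(0.0, sigma) for xi in x]

# Statistical inference: obtain parameter estimates from the data.
slope, intercept = fit_line(x, y)

# Machine learning, in the sense used here: use those parameters on new data.
y_pred = predict([12.0, 15.0], slope, intercept)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print("predictions for new x values:", [round(v, 2) for v in y_pred])
```

Note that this sketch returns only point estimates. A fuller inference would also quantify the uncertainty in the parameters and assess the plausibility of the model itself.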

23.3 What is artificial intelligence?

I view artificial intelligence as a subset of machine learning in which the tasks are those that have historically been reserved for human minds. These involve the use of language, recognition and creation of images, and even reasoning. These tasks sound very advanced, but they still ultimately involve building parameterized generative models and then pushing new data through them. Hence, AI is a subset of machine learning.

23.4 So why so much statistical inference?

The tricky parts of machine learning (and therefore also AI) are coming up with models and then finding reasonable parameters for them so that they are performant. Both are central to statistical inference, which is why we will focus so heavily on it.

One point of confusion will be terminology. Those who refer to themselves as practitioners of machine learning have one set of terminology and those who refer to themselves as statisticians have another. Beyond that, the terminology within both fields is often shorthand for an explicit model or a specific method of finding parameters. We will explicitly define models and describe techniques for working with them (not just show how to do it!).