Polars and split-apply-combine

Throughout your research career, you will undoubtedly need to handle data, possibly lots of data. Once the data is in a usable form, you can make graphics, perform statistical inference, and use it in AI applications. Tidy data is an important format, which we will define and work with in this lesson.

In an ideal world, data sets would be stored in tidy format and be ready to use. But for practical reasons (and, sadly, also due to thoughtlessness on the part of experimenters and instrument manufacturers), data comes in many formats, and you may have to spend much of your time getting it into a usable one. This process is called wrangling. We will not dive into that topic, but will assume the data is in a tidy, or at least easily usable, format.

In this lesson, we will use Polars to manipulate tidy data sets.