The Data Preparation and Workflow Management course is an open-source Master's-level course taught at Tilburg University. All of its content is freely available and consists of lectures, live streams, self-study material, tutorials, and examples.
This course teaches you how to engineer data sets for statistical analysis. Many students and researchers perceive the process of “creating” a data set for analysis as rather simple: a bit of cleaning here, a bit of merging there, and you’re done. In this course, we take data preparation to the next level by considering highly complex data preparation workflows (think multiple sources, structured and unstructured data, data from databases and data from files, multiple delivery batches, lots of missing data, different file versions, etc.).
Throughout the course, we’ll use the workflow principles of reproducible science that we advocate at Tilburg Science Hub.