Practicing Pipeline Automation using Make

Let’s modify the analysis

We’re done with the data-preparation pipeline for now, so let’s turn our attention to the analysis part of the project.

Admittedly, there’s not much here yet, which makes it a good place to experiment. Work through the practice questions below and modify the files as you go.

Practice questions and answers

1. Recall the makefile we introduced earlier? Open the makefile in src/analysis/ and work out what each step of this stage of the pipeline does. What exactly happens?
2. Open preclean.R (e.g., in RStudio) and try to understand what this script does. Then filter the data to keep only tweets with polarity > 0.
3. Finally, report summary statistics of the word count (summary(dt$nwords)) and plot a histogram of it (hist(dt$nwords)) in the RMarkdown document.
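
If you get stuck on the filtering and summary steps, the following is a minimal sketch of the idea in base R, not the official solution. It assumes dt is a data frame with numeric columns polarity and nwords. The toy data and the name dt_pos are hypothetical and only there so the snippet runs on its own; in the project, dt comes from the existing scripts.

```r
# Hypothetical stand-in for the tweet data; in the project, dt is created by the pipeline.
dt <- data.frame(
  text     = c("great day", "this is awful", "not bad at all", "meh"),
  polarity = c(0.8, -0.6, 0.35, 0),
  nwords   = c(2, 3, 4, 1)
)

# Keep only tweets with strictly positive polarity (assumes polarity has no missing values).
dt_pos <- dt[dt$polarity > 0, ]

# Summary statistics of the word count.
print(summary(dt_pos$nwords))

# Histogram of the word count; in an RMarkdown chunk the plot appears in the rendered document.
hist(dt_pos$nwords,
     main = "Word count of tweets with positive polarity",
     xlab = "Number of words")
```

Whether you overwrite dt or store the filtered rows under a new name is a design choice. A separate name keeps the unfiltered data available for comparison, while overwriting dt lets the summary(dt$nwords) and hist(dt$nwords) calls from the questions work unchanged.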

Watch the solution here
