DataCamp, dplyr, and blended learning


As I’ve written about in previous posts (here, here, and here), this semester I taught a course called Advanced Data Analysis for the Social Sciences, which is the second course in our department’s required sequence for Ph.D. students. I’ve taught this course in the past, and this time I tried to modernize it in both content and form. As part of that effort, I partnered with DataCamp to make their dplyr course, taught by Garrett Grolemund, available to my students. This combination of face-to-face teaching and online content is called blended learning, and it’s something that I’d like to explore more in future classes. For a first attempt, I think it worked pretty well, and the people at DataCamp were very helpful. Here’s more about what happened.

DataCamp is an online learning platform focused on data science. They offer courses on a variety of topics, but most are focused on R and a variety of R packages. I was happy to see that they offer a course on dplyr, a wonderful data manipulation package by Hadley Wickham and colleagues. dplyr is very well designed, but it takes some getting used to because it works differently (some might say better) than the way things work in base R. So, as in a traditional class, we had a face-to-face lab session on dplyr and a homework assignment that required students to use it (here’s our class syllabus). But I knew that the students would need more practice if they were going to become truly fluent in dplyr. So, in addition to our traditional class activities, I offered the students the chance to take Garrett Grolemund’s dplyr course. The course consists of 5 chapters, each with videos and instantly graded exercises. I had taken Garrett’s class myself, and I found the exercises to be really, really helpful for practicing dplyr’s style of thinking.
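To give a flavor of how dplyr's style differs from base R, here is a minimal sketch. It is my own illustration rather than an exercise from Garrett's course: the built-in mtcars dataset, the weight cutoff, and the variable names are all assumptions chosen for the example. Both versions compute the mean miles per gallon by number of cylinders for the heavier cars.

```r
library(dplyr)

# Base R: bracket indexing plus a formula interface, with an
# intermediate data frame to hold the filtered rows
heavy <- mtcars[mtcars$wt > 3, ]
base_result <- aggregate(mpg ~ cyl, data = heavy, FUN = mean)

# dplyr: the same steps written as a pipeline of verbs,
# read top to bottom (filter, then group, then summarise)
dplyr_result <- mtcars %>%
  filter(wt > 3) %>%
  group_by(cyl) %>%
  summarise(mpg = mean(mpg))
```

The two results contain the same numbers. The dplyr version names each step with a verb and needs no temporary variable, which is roughly the "style of thinking" the exercises drill.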

How did my students respond? About half the students started Garrett’s course, and, of those who started, a bit less than half pretty much completed the whole thing.


Is that a success or a failure? I’d say a success, because Garrett’s course was not required and it was quite long (about 4 hours). In other words, by offering this enrichment, about a quarter of the class ended up spending more time learning. I’ve redacted names from the plot, but the students who were most engaged with Garrett’s course were an interesting mix; it was not just the strongest students or the struggling students. If this kind of enrichment were offered more regularly, I wonder which students would be the most likely to take it up.

Finally, I’d like to thank Martijn Theuwissen from DataCamp for helping to make this experiment in blended learning possible.  I’d definitely like to try this again next time I teach a course on data analysis.

5 thoughts on “DataCamp, dplyr, and blended learning”

  1. Great to see dplyr used in teaching.

    I have taught R intro and advanced courses in psychology for 6 years…

    At the beginning I had to mimic SPSS style to keep R users from being regarded as pariahs.

    In the next phase I had to convince students to live with [ …, … ] constructs, to unpuzzle expressions ending with )))), and to get rid of the many temporary variables created to avoid expressions ending with )))))))) 🙂

    Now dplyr verbs and piping. A new world. Clean code that reads like a story. Few traps, and thanks to data_frame and tbl results, easy to check along the way.

    The only thing I do differently is the assignment at the end, since the result will go -> here

    Greetings from Germany,

