Mining the edX data


Faculty Supervisor: Una-May O'Reilly

We are looking for 2-3 UROPs for the fall, hopefully extending to a longer research partnership. We are part of the ALFA group in CSAIL. We do data mining and analytics on large amounts of data emanating from massive open online courses (MOOCs), primarily edX and possibly Coursera. We have spent about a year acquiring and organizing this data, and we are starting a few projects to build an analytics framework that makes it easy to query and visualize. We would like to mine this data to answer questions such as (but not limited to): What is the average amount of time spent on different modules of a course? How much time did students spend on videos vs. the book? What is the average number of attempts for a homework problem? Answering these questions across multiple offerings of a course lets us learn how many different types of learners there are and which parts of the course students find difficult. This kind of in-depth data about students' interaction with courses is available for the first time ever.

The ideal candidate will have some programming experience, be comfortable with (or willing to learn) Python, MATLAB, and MySQL, and be willing to commit ~10 hours/week. Experience working with SQL and an interest in online education are a huge plus. The UROPs will work closely with a team of M.Eng./PhD students and a postdoc in the group.

* Preferred but not required: UROP candidates should be at the sophomore level or above.


Please send a 1-page resume and a short paragraph describing your background
and interest in the project.