The Auto Grader

A common problem in introductory courses like 6.00 (Introduction to Computer Science and Programming) arises when it is time to grade assignments. To give students useful feedback on the mistakes they have made and how they can improve their work, teaching assistants (TAs) spend hours reviewing each assignment for errors. To take the pain out of this lengthy review process, Assistant Professor Armando Solar-Lezama and his student Rishabh Singh of the Computer Assisted Programming Group at CSAIL, in collaboration with Sumit Gulwani at Microsoft Research, have developed the Auto Grader, a new tool designed to review faulty code and automatically provide feedback on how to fix it.
 
Reviewing each student assignment is not only time-consuming for TAs; it is also difficult to spot small errors, because a coding problem can usually be solved in a myriad of ways. Human creativity is a stumbling block when it comes to designing a system that can automatically identify and fix problems in code.
 
“You can’t use a technique like pattern matching to identify errors because different students are going to try out different solutions to the problem,” said Solar-Lezama. “Even when you are working with a small number of student assignments, for the TAs it’s very hard to look at different pieces of code that look completely different than the solution they had in mind and figure out what the errors are and what small fix will solve it.”
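To see why, consider a minimal sketch in Python (the language used in 6.00; the problem and both solutions here are hypothetical illustrations, not actual student submissions). Both functions correctly compute a factorial, yet they share almost no surface structure for a pattern matcher to latch onto:

    def factorial_recursive(n):
        # Recursive definition: n! = n * (n-1)!
        return 1 if n <= 1 else n * factorial_recursive(n - 1)

    def factorial_loop(n):
        # Iterative definition: multiply up from 1 to n.
        result = 1
        for i in range(2, n + 1):
            result *= i
        return result

    assert factorial_recursive(5) == factorial_loop(5) == 120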
 
Solar-Lezama and his team collaborated with Microsoft Research to pilot the Auto Grader on PexForFun.com, a Microsoft website that lets individuals around the world try their hand at solving different programming problems. When the Auto Grader was applied to a variety of code submitted to the site, the system consistently identified errors and proposed solutions for repairing the code.
 
For example, one assignment asked participants to reverse an array. One submission looked completely different from the standard solution; however, the Auto Grader was able to locate two errors and pinpoint where changes could be made to fix the code.
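That submission is not reproduced here, but a hypothetical buggy reversal, sketched in Python, gives the flavor of the error-plus-fix feedback the tool produces:

    # Hypothetical buggy submission: the loop runs over the whole list,
    # so every pair of elements is swapped twice and the list comes
    # back unchanged.
    def reverse_in_place(xs):
        for i in range(len(xs)):        # bug: should stop at the midpoint
            j = len(xs) - 1 - i
            xs[i], xs[j] = xs[j], xs[i]
        return xs

    # The kind of small, local change such a tool might suggest:
    def reverse_in_place_fixed(xs):
        for i in range(len(xs) // 2):   # fixed loop bound
            j = len(xs) - 1 - i
            xs[i], xs[j] = xs[j], xs[i]
        return xs

    assert reverse_in_place([1, 2, 3, 4]) == [1, 2, 3, 4]   # bug: input unchanged
    assert reverse_in_place_fixed([1, 2, 3, 4]) == [4, 3, 2, 1]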
 
The Auto Grader team also explored the cost-effectiveness of automated grading compared with grading by TAs. They found that, to grade assignments from 100,000 students, an automated system would cost about $10 to $20 per assignment, which is much cheaper and more efficient than using TAs, according to Solar-Lezama.
 
Currently, the Auto Grader system is being tested on previous 6.00 homework submissions, and Solar-Lezama plans to introduce the program to the course within the next year. In the future, Solar-Lezama, whose research focuses on synthesizing programs, hopes to leverage the technology behind the Auto Grader and scale it into a form of personalized tutor. He hopes that the program could be used not only to identify errors in code, but also to suggest problems and concepts that teach students to find and fix coding errors on their own, just as a teacher would. Such a program could prove especially useful for online education, according to Solar-Lezama.
 
“If you think about online education and all the people from all over the planet who are going online and trying to learn new material, a lot of these people might not have a support group right there to help them through their problems. They might go and ask some question in a forum, which raises the issue of whether or not they will receive the right level of feedback,” said Solar-Lezama. “Having some automated way to walk them through that process the same way a good instructor would could be transformative.”
 
For more information on Solar-Lezama’s work, please visit: http://people.csail.mit.edu/asolar/.

Abby Abazorius, CSAIL