The more I read up on this and think about it, the more I believe this will be a huge project. There is really a lot to consider. In an ideal world, this is what automated testing would do once implemented:
- Students have a system at hand which they can use to learn to write good tests. For instance, they could check whether their tests are correct by running them against a reference implementation that an instructor puts up at some point before the assignment due date.
- Students would learn which parts of the code they are testing (coverage). This would (again, we are in an ideal world) be reported visually in a Grader View-like environment, with little boxes indicating where code is unchecked. See the mockup below.
- Students would also learn which areas of their code are particularly fragile. This could be achieved by running a static analysis tool on the code.
- Students would have the opportunity to see the results of public tests, i.e. tests the instructor has created and marked public. The instructor uploads a public test file and could additionally specify a filter file, which is run on the test output. I can imagine these filters would be shell scripts or Python scripts.
- Students would also be able to run some more tests on their code: the restricted test suite, as specified by the instructor. Say the instructor specifies that a student or group can run these tests at most 1 or 2 times a day. The idea is to encourage students to start their assignment early, so they have many opportunities to run those restricted tests. For each restricted test run, students would have to log into MarkUs and trigger its execution. We would store in the database how many restricted test runs each grouping has left. Restricted test output can be filtered similarly to public tests, using some sort of script run on the test output.
- An instructor could specify a file which has to run successfully in order for the submission to be graded. This would be useful if an instructor does not accept submissions which do not compile. Such failing “acceptance tests” would be displayed visually in the Submissions view and the Grader View.
- Graders would have a visual report ready when they start grading. They would see the percentages of passing and failing tests. See the mockups again for more information.
- Tests created by students (for instance, to run against a reference implementation) would be submitted via the web interface in a separate section, “Tests” say. The reason for this is to keep tests and code separate. I can imagine these tests going into a dedicated folder in the grouping’s repository (e.g. /tests/a1). Alternatively, students could submit these tests via the SVN command line client into the appropriate folder. Test runs would be triggered via MarkUs’ interface in any case. This makes tracking of students’ testing activity and restricted test runs easier.
- Students control their test runs via MarkUs. They trigger runs and get immediate feedback if they are not allowed to run the desired test, or feedback as to where in the testing queue their run is scheduled. Students can wait a while and check back later to see if test results are available for them. Maybe we could plan for the option of email notifications for students once the test has been run and results are available.
- Automated testing would support graders in many respects. All tests which the instructor provided for the assignment would be run. This includes public, restricted, private and basic acceptance tests. If students are required to write tests for their code too, MarkUs could be configured to also run tests in a specific folder in the grouping’s repository (e.g. /tests/a1). All these test results would be available to the grader in the Grader View. Ideally, failures would already be pre-annotated in the Grader View. If basic tests fail (e.g. the program does not compile), automatic deductions could be triggered.
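To make the filter-file idea a bit more concrete: such a filter could be a small Python script that reads raw test output on stdin and passes through only the lines students are allowed to see. This is just a sketch; the patterns and the output format they match are made-up examples an instructor would adapt to the actual test tool.

```python
#!/usr/bin/env python
# Hypothetical filter script for public/restricted test output.
# Reads raw test output on stdin, prints only whitelisted lines
# (per-test verdicts and the summary), hiding e.g. stack traces.
import re
import sys

# Assumed output format -- an instructor would adapt these patterns
# to whatever the real test tool emits.
ALLOWED = [
    re.compile(r"^(PASS|FAIL): "),              # per-test verdicts
    re.compile(r"^\d+ tests?, \d+ failures?"),  # summary line
]

def filter_output(lines):
    """Keep only lines matching one of the ALLOWED patterns."""
    return [line for line in lines if any(p.search(line) for p in ALLOWED)]

if __name__ == "__main__":
    sys.stdout.write("".join(filter_output(sys.stdin.readlines())))
```

Since the script only talks to stdin/stdout, the test framework itself never needs to know about MarkUs.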
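The per-day cap on restricted test runs could be enforced with a simple counter per grouping and day. A minimal sketch, assuming made-up names (MarkUs would of course keep this in its database rather than an in-memory dict):

```python
from datetime import date

class RestrictedRunCounter:
    """Hypothetical bookkeeping for restricted test runs.

    Tracks how many runs each grouping has used per day; in MarkUs
    this state would live in the database, keyed by grouping and
    assignment, not in memory.
    """

    def __init__(self, max_runs_per_day=2):
        self.max_runs = max_runs_per_day
        self.runs = {}  # (grouping_id, date) -> runs used

    def runs_left(self, grouping_id, today=None):
        today = today or date.today()
        used = self.runs.get((grouping_id, today), 0)
        return max(self.max_runs - used, 0)

    def request_run(self, grouping_id, today=None):
        """Count the run and return True if the grouping may still
        trigger a restricted test today, otherwise return False."""
        today = today or date.today()
        if self.runs_left(grouping_id, today) == 0:
            return False
        self.runs[(grouping_id, today)] = self.runs.get((grouping_id, today), 0) + 1
        return True
```

When `request_run` returns False, MarkUs would show the immediate "not allowed" feedback mentioned above instead of queueing the run.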
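The "acceptance test" gate could boil down to running an instructor-specified command (say, a compile step) in the submission's working copy and checking its exit status. A hedged sketch; the command and helper name here are illustrative, not an existing MarkUs API:

```python
import subprocess

def passes_acceptance_test(command, workdir):
    """Run the instructor-specified acceptance command (e.g. a
    compile step) in the submission's checkout directory.

    A zero exit status means the submission may be graded; anything
    else would flag the submission in the Submissions/Grader views.
    """
    result = subprocess.run(command, cwd=workdir,
                            capture_output=True, text=True)
    return result.returncode == 0

# e.g. passes_acceptance_test(["make", "all"], "/path/to/checkout")
```

Captured stdout/stderr could also be stored so graders see why a submission failed the gate.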
I’ll stop for now. Here are some mockups:
If you have thoughts, please feel free to chime in. As I mentioned previously, I think this whole topic would be enough material for a master’s thesis. The big question remains: how would you make sure that test runs can be integrated into the Grader View and other places? It would be nice if test tools had a standard output format. I can imagine some built-in integration points in MarkUs, though. That’s it for now, stay tuned 🙂