# Assignments

Assignments are programming tasks that can be tested by a worker after a user
submits their solution.

## Configuration format

An assignment is described by a YAML file that specifies how to build, run, and
test it. The testing process is divided into stages.

### Tests

A test is a pair of files: the first specifies the input and the second the
expected output. Tests can be organized into groups, which makes them easier to
manage and supports more complex grading scenarios (e.g. "An assignment passes
only if it passes all tests in group X").

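
The group-based grading rule quoted above can be sketched as follows. This is
only an illustration, not the actual data model: the `AssignmentTest` class,
its field names, and the `group_passes` helper are assumptions made for this
example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssignmentTest:
    """One test: an input file paired with its expected-output file
    (hypothetical model, the field names are assumptions)."""
    name: str
    input_file: str
    expected_file: str
    group: str


def group_passes(tests, results, group):
    """Return True only if every test in `group` passed.

    `results` maps a test name to True (passed) or False (failed).
    A test without a recorded result counts as failed, and so does
    an empty or unknown group.
    """
    members = [t for t in tests if t.group == group]
    return bool(members) and all(results.get(t.name, False) for t in members)


tests = [
    AssignmentTest("t1", "t1.in", "t1.out", group="basic"),
    AssignmentTest("t2", "t2.in", "t2.out", group="basic"),
    AssignmentTest("t3", "t3.in", "t3.out", group="bonus"),
]
results = {"t1": True, "t2": True, "t3": False}
```

With these sample results, the "basic" group passes and the "bonus" group does
not, so a rule such as "passes only if it passes all tests in group basic"
would accept the submission.
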
Tests are run by a special program called a *judge*, which compares the output
of the program to the expected output. The choice of judge determines how
strict the testing is. For example, some assignments require the solution to
output exactly the same bytes as expected, while others permit any amount of
whitespace between the words of the output.

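
To make the difference between a strict and a lenient judge concrete, here is
a minimal sketch of both comparisons. The function names `exact_judge` and
`token_judge` are made up for this example; real judges would typically be
standalone programs rather than Python functions.

```python
def exact_judge(actual: bytes, expected: bytes) -> bool:
    """Strict judge: the output must match the expected bytes exactly."""
    return actual == expected


def token_judge(actual: bytes, expected: bytes) -> bool:
    """Lenient judge: compare the whitespace-separated words only, ignoring
    how much whitespace separates them."""
    return actual.split() == expected.split()


expected = b"1 2 3\n"
output = b"1  2\t3\n\n"   # same words, different whitespace
```

Here `output` would be rejected by `exact_judge` but accepted by
`token_judge`, because both byte strings split into the same three words.
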
### Stages

A stage is a logical unit of the testing process. It specifies how to perform
one step of the build process and how to test whether the result behaves
correctly. Its configuration contains the following:

- **Name** - a unique string identifier of the stage
- **Build command** (optional) - used to prepare the submitted files for this
  test stage
- **Test list** - specifies the tests (or test groups) to be run during this
  stage
- **Test command** - used to run one specific test
- **Test input policy** - how to pass the test input to the program?
    - redirect it to its standard input
    - pass the name of the input file as an argument
- **Judge** - which judge should be used to evaluate the solution's output?
  Custom judges can be supplied with the assignment.
- **Error policy** (optional) - what should we do when a test fails?
    - **interrupt** the stage (default)
    - **continue** with the next test
    - **jump** to another stage (TODO cycle detection?)
- **Success policy** (optional) - what to do when all tests pass?
    - **jump** to another stage (the next one by default)
    - **end** the evaluation, even if there are still unprocessed stages

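
The interplay of the error and success policies listed above can be sketched
as a small evaluation loop. Everything here is hypothetical: the `Stage` field
names, the `evaluate` function, and two behaviours the list leaves open. The
sketch assumes that evaluation stops after a stage with a failed test unless
its error policy is "jump", and that a stage is never visited twice, which
sidesteps jump cycles.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stage:
    """Hypothetical in-memory form of one stage's configuration."""
    name: str
    tests: List[str]
    test_command: str
    build_command: Optional[str] = None
    input_policy: str = "stdin"            # "stdin" or "argument"
    judge: str = "exact"
    error_policy: str = "interrupt"        # "interrupt", "continue" or "jump"
    error_target: Optional[str] = None     # used when error_policy == "jump"
    success_policy: str = "jump"           # "jump" or "end"
    success_target: Optional[str] = None   # None means "the next stage"


def evaluate(stages: List[Stage],
             run_test: Callable[[Stage, str], bool]) -> List[str]:
    """Walk the stages and return the names of those visited, in order.

    `run_test(stage, test)` is supplied by the caller and reports whether
    one test passed.
    """
    index = {s.name: i for i, s in enumerate(stages)}
    visited: List[str] = []
    pos: Optional[int] = 0
    while pos is not None and pos < len(stages):
        stage = stages[pos]
        if stage.name in visited:
            break  # assumption: never revisit a stage, so jumps cannot loop
        visited.append(stage.name)
        failed = False
        for test in stage.tests:
            if not run_test(stage, test):
                failed = True
                if stage.error_policy != "continue":
                    break  # "interrupt" and "jump" stop this stage's tests
        if failed:
            # Assumption: only "jump" lets evaluation go on after a failure.
            pos = index.get(stage.error_target) if stage.error_policy == "jump" else None
        elif stage.success_policy == "end":
            pos = None
        elif stage.success_target is not None:
            pos = index.get(stage.success_target)
        else:
            pos = pos + 1
    return visited


stages = [
    Stage("build", ["t1"], "./a.out", build_command="gcc source.c"),
    Stage("extra", ["t2", "t3"], "./a.out",
          error_policy="continue", success_policy="end"),
    Stage("final", ["t4"], "./a.out"),
]
```

With these sample stages, a run in which every test passes visits "build" and
"extra" and then ends, because the success policy of "extra" is "end".
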
## Case study

We present some courses that might use ReCodEx to evaluate homework
assignments and outline how their evaluation could be set up in terms of
stages.

### Simple programming exercises

This category covers introductory programming courses such as Programming I or
Programming in Java.

In the simplest case, we only need one stage that builds the program and passes
the test inputs to its standard input. We will use the C language for this
example. The build command is `gcc source.c` and the test command is `./a.out`.

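
As a rough sketch of how a worker might run one test of such a stage, the
helper below supports both test input policies (standard input or file name
argument). The function name `run_test`, its parameters, and the timeout are
assumptions for illustration; a real worker would also sandbox the program.

```python
import subprocess
import sys


def run_test(command, input_path, policy="stdin", timeout=10):
    """Run `command` on one test input and return its standard output.

    policy "stdin"    -> the input file is redirected to standard input
    policy "argument" -> the input file's name is appended to the command
    """
    if policy == "stdin":
        with open(input_path, "rb") as stdin:
            done = subprocess.run(command, stdin=stdin,
                                  stdout=subprocess.PIPE, timeout=timeout)
    elif policy == "argument":
        done = subprocess.run(command + [input_path],
                              stdin=subprocess.DEVNULL,
                              stdout=subprocess.PIPE, timeout=timeout)
    else:
        raise ValueError(f"unknown input policy: {policy}")
    return done.stdout


# For the C example, the stage would first build the submission and then
# run each test (the input file name "01.in" is a placeholder):
#
#   subprocess.run(["gcc", "source.c"], check=True)
#   output = run_test(["./a.out"], "01.in")
```

The returned bytes would then be handed to the judge together with the
expected output.
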
### Compiler principles

This course uses multiple tools in a pipeline-like fashion, for example `flex`
and `bison`.

We create a stage for each step of this pipeline: we run `flex` and test its
output, then we run `bison` and do the same.

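
A possible shape of these two stages is sketched below as plain Python data.
The key names, source file names, test-group names, and the `check_stages`
helper are placeholders rather than a real schema. The commands rely only on
the default output names of `flex` (`lex.yy.c`) and `bison -d`
(`parser.tab.c`, `parser.tab.h`).

```python
# Hypothetical stage descriptions for the two pipeline steps.
PIPELINE = [
    {
        "name": "lexer",
        "build_command": "flex lexer.l && gcc lex.yy.c -lfl -o lexer",
        "test_list": ["lexer-tests"],
        "test_command": "./lexer",
        "on_success": "parser",   # continue with the next pipeline step
    },
    {
        "name": "parser",
        "build_command": ("bison -d parser.y && flex lexer.l && "
                          "gcc parser.tab.c lex.yy.c -lfl -o parser"),
        "test_list": ["parser-tests"],
        "test_command": "./parser",
        "on_success": None,       # last stage: nothing left to jump to
    },
]


def check_stages(stages):
    """Return a list of problems: duplicate names or jumps to unknown stages."""
    problems = []
    names = [s["name"] for s in stages]
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"duplicate stage name: {name}")
    for s in stages:
        target = s.get("on_success")
        if target is not None and target not in names:
            problems.append(f"stage {s['name']} jumps to unknown stage {target}")
    return problems
```

Calling `check_stages(PIPELINE)` returns an empty list here, since the stage
names are unique and the only jump target exists.
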
### XML technologies

In this course, students choose a topic they model using XML - for example a
library or a bulletin board. During the semester, they expand this project by
adding XSLT transformations, XQuery scripts, XPath queries, etc. These are
tested against fixed requirements (e.g. using some particular language
constructs).

This course already has a rather sophisticated application for testing homework
assignments, so we only include it for demonstration purposes.

Because every assignment focuses on a different technology, we would need a new
type of stage for each one. These stages would only run some checker programs
against the submitted sources (and possibly try to check their syntax etc.).

### Non-procedural programming

This course differs from other programming courses because it does not teach
input/output manipulation until the end of the semester. In their assignments,
students are mostly required to write a function/predicate that behaves
according to a specification (e.g. appends an item at the end of a list).

Because of this, we need to take the function submitted by a student and
combine it with a snippet of code that reads the standard input and calls the
submitted function. This could be achieved by setting the build command
accordingly.

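
One way such a build step could work is to concatenate the submission with a
driver supplied by the assignment author. The sketch below uses Python as a
stand-in language purely to stay runnable; the `combine_sources` helper, the
`DRIVER` snippet, and the `append_item` function name are all hypothetical.

```python
from pathlib import Path

# Hypothetical driver shipped with the assignment: it reads the standard
# input and calls the function the student was asked to write. The last
# word of the input is the item to append, the rest is the list.
DRIVER = '''
import sys

items = sys.stdin.read().split()
print(" ".join(append_item(items[:-1], items[-1])))
'''


def combine_sources(submission: Path, target: Path, driver: str = DRIVER) -> Path:
    """Build step: write the submission followed by the driver into `target`."""
    target.write_text(submission.read_text() + "\n" + driver)
    return target
```

The test command would then run the combined file in place of the raw
submission, so the tests can still use plain input and output files.
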
### Operating systems

The operating systems course requires students to work on a simple OS kernel
that is then run in a MIPS simulator called `msim`. There are various tests that
check if the student's implementation of core OS mechanisms is correct. These
tests are compiled into the kernel.

Each of these tests could be represented by a stage that compiles the kernel
with the test and then runs it against different configurations of `msim`.