# Assignments
Assignments are programming tasks that can be tested by a worker after a user
submits their solution.
## Configuration format
An assignment is described by a YAML file that contains information on how to
build, run and test it. The testing process is divided into stages.
### Tests
A test is a pair of files: the first specifies the input and the second the
expected output. Tests can be organized into groups to keep them manageable and
to support more complex grading scenarios (e.g. "An assignment passes only if
it passes all tests in group X").
Tests are run by a special program called a *judge*, which compares the output
of the program to the expected output. By setting a judge, we can specify how
strict the testing is - for example, some assignments require the solution to
output exactly the same bytes as expected. Others permit any number of
whitespace characters between words of the output.
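
The two levels of strictness mentioned above can be sketched as follows. This is only an illustration, assuming a judge is a function that receives the actual and the expected output as bytes; the function names are hypothetical and not part of any existing judge interface.

```python
def strict_judge(actual: bytes, expected: bytes) -> bool:
    """Accept only output that matches the expected bytes exactly."""
    return actual == expected


def whitespace_tolerant_judge(actual: bytes, expected: bytes) -> bool:
    """Accept output whose whitespace-separated words match the expected ones."""
    # bytes.split() without arguments splits on runs of ASCII whitespace and
    # ignores leading and trailing whitespace, so only the words are compared.
    return actual.split() == expected.split()
```

With these definitions, the output `b"1  2\n"` would be rejected by the strict judge when `b"1 2\n"` is expected, but accepted by the whitespace-tolerant one.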
### Stages
A stage is a logical unit of the testing process. It specifies how to perform
one step of the build process and how to test that the result behaves correctly.
A stage's configuration contains the following:
- **Name** - a unique string identifier of the stage
- **Build command** (optional) - used to prepare the submitted files for this
  test stage
- **Test list** - specifies the tests (or test groups) to be run during this
  stage
- **Test command** - used to run one specific test
- **Test input policy** - how to pass the test input to the program?
  - redirect it to its standard input
  - pass the name of the input file as an argument
- **Judge** - which judge should be used to evaluate the solution's output?
  Custom judges can be supplied with the assignment.
- **Error policy** (optional) - what should we do when a test fails?
  - **interrupt** the stage (default)
  - **continue** with the next test
  - **jump** to another stage (TODO cycle detection?)
- **Success policy** (optional) - what to do when all tests pass?
  - **jump** to another stage (the next one by default)
  - **end** the evaluation, even if there are still unprocessed stages
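
To make the interplay of the error and success policies more concrete, here is a minimal sketch in Python. The `Stage` class, its field names and the `evaluate` function are hypothetical and only mirror the list above. The sketch also makes assumptions the text leaves open: evaluation stops after a stage that had a failing test unless the error policy jumps elsewhere, and visiting a stage twice is reported as a cycle.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stage:
    name: str
    tests: List[str]
    test_command: str
    build_command: Optional[str] = None
    input_policy: str = "stdin"           # "stdin" or "argument"
    judge: str = "strict"
    on_error: str = "interrupt"           # "interrupt", "continue" or "jump"
    error_target: Optional[str] = None    # used when on_error == "jump"
    on_success: str = "jump"              # "jump" or "end"
    success_target: Optional[str] = None  # None means "the next stage"


def evaluate(stages: List[Stage],
             run_test: Callable[[Stage, str], bool]) -> List[str]:
    """Walk through the stages and return the names of those visited."""
    position = {stage.name: i for i, stage in enumerate(stages)}
    visited: List[str] = []
    current = stages[0].name if stages else None
    while current is not None:
        # Assumption: a stage may run at most once; a repeat means a cycle.
        if current in visited:
            raise ValueError(f"cycle detected at stage {current!r}")
        visited.append(current)
        stage = stages[position[current]]
        current = None  # stop unless a policy below selects another stage
        failed = False
        for test in stage.tests:
            if run_test(stage, test):
                continue
            failed = True
            if stage.on_error == "continue":
                continue  # keep running the remaining tests
            if stage.on_error == "jump":
                current = stage.error_target
            break  # "interrupt" and "jump" both leave the stage
        if not failed and stage.on_success == "jump":
            if stage.success_target is not None:
                current = stage.success_target
            else:
                following = position[stage.name] + 1
                if following < len(stages):
                    current = stages[following].name
    return visited
```

For two stages `build` and `run` with default policies, `evaluate` visits both when every test passes and stops after `build` as soon as one of its tests fails.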
## Case study
We present some of the courses that might use ReCodEx to evaluate homework
assignments and outline the setup of the evaluation with respect to the concept
of stages.
### Simple programming exercises
This category covers introductory programming courses such as Programming I or
Programming in Java.
In the simplest case we only need one stage that builds the program and passes
the test inputs to its standard input. We will use the C language for this
example. The build command is `gcc source.c`, the test command is `./a.out`.
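
Running a single test in such a stage could look roughly like the sketch below. It assumes that the "redirect to standard input" policy is used and that the output is compared byte for byte. The `run_test` helper, the timeout and the exit code check are illustrative assumptions, not part of the described configuration.

```python
import subprocess
from typing import List


def run_test(test_command: List[str], input_data: bytes, expected: bytes,
             timeout_seconds: float = 5.0) -> bool:
    """Feed input_data to the program's stdin and compare its stdout."""
    try:
        completed = subprocess.run(
            test_command,
            input=input_data,          # redirected to standard input
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_seconds,   # assumption: tests have a time limit
        )
    except subprocess.TimeoutExpired:
        return False
    # Assumption: a non-zero exit code counts as a failed test.
    return completed.returncode == 0 and completed.stdout == expected
```

After building with `gcc source.c`, a test would then be run as `run_test(["./a.out"], input_bytes, expected_bytes)`.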
### Compiler principles
This course uses multiple tools in a pipeline-like fashion - for example `flex`
and `bison`.
We create a stage for each step of this pipeline: we run `flex` and test its
output, then we run `bison` and do the same.
### XML technologies
In this course, students choose a topic they model using XML - for example a
library or a bulletin board. During the semester, they expand this project by
adding XSLT transformations, XQuery scripts, XPath queries, etc. These are
tested against fixed requirements (e.g. using some particular language
constructs).
This course already has a rather sophisticated application for testing homework
assignments, so we only include it for demonstration purposes.
Because every assignment focuses on a different technology, we would need a new
type of stage for each one. These stages would only run some checker programs
against the submitted sources (and possibly try to check their syntax etc.).
### Non-procedural programming
This course differs from other programming courses in that input/output
manipulation is taught only toward the end of the semester. In their
assignments, students are mostly required to write a function/predicate that
behaves according to a specification (e.g. appends an item at the end of a list).
Due to this, we need to take the function submitted by a student and combine it
with a snippet of code that reads the standard input and calls the submitted
function. This could be achieved by setting the build command.
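
One way a build command could do this is to concatenate the submitted source with a harness supplied by the assignment. The following is a minimal sketch of that idea. The `combine_with_harness` name is hypothetical, and it assumes that the target language allows the harness to simply follow the submitted definitions in the same file.

```python
def combine_with_harness(submission: str, harness: str) -> str:
    """Return one source text: the submitted code followed by the harness."""
    # Make sure the harness starts on a fresh line even if the submitted
    # file does not end with a newline.
    if not submission.endswith("\n"):
        submission += "\n"
    return submission + harness
```

The resulting text would be written to a file and passed to the interpreter or compiler by the rest of the build command.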
### Operating systems
The operating systems course requires students to work on a simple OS kernel
that is then run in a MIPS simulator called `msim`. There are various tests that
check if the student's implementation of core OS mechanisms is correct. These
tests are compiled into the kernel.
Each of these tests could be represented by a stage that compiles the kernel
with the test and then runs it against different configurations of `msim`.