# Assignments overview

Assignments are programming tasks that can be tested by a worker after a user
submits their solution. An assignment is described by a YAML file that contains
information on how to build, run and test it.

### Terminology
The following text requires knowledge of the basic terminology used by ReCodEx. Please see the [separate page](https://github.com/ReCodEx/GlobalWiki/wiki/Terminology).

### Basics
A job is a set of tasks (it is generally a set, but the order of tasks carries some meaning). Tasks may have an arbitrary number of dependencies, which must be respected. When isoeval processes a job, it builds a task graph in which tasks are vertices and dependencies are edges (A -> B means that task B is on the dependency list of task A) and computes its linear ordering. The graph must be acyclic (otherwise no linear ordering exists), and isoeval attempts to execute as many tasks as possible. Tasks without dependencies can be executed directly; other tasks are executed once all their dependencies have completed successfully.

Tasks are executed sequentially, following the linear ordering of the task graph. Parallel tasks (tasks that are not directly dependent on each other, so their relative order would otherwise be arbitrary) are ordered first by their priority and second by their order in the configuration file. Priority is important for specifying the evaluation flow. See the sample picture for a better understanding.

![Picture of task serialization](https://github.com/ReCodEx/GlobalWiki/raw/master/images/Assignment_overview.png)

Each task has a unique ID (an alphanumeric string such as _CompileA_, _RunAA_, or _JudgeAB_ in the picture). These IDs are used to identify tasks (for dependency references, in the log, ...). The numbers in the bottom right corners are the priorities of the tasks; a higher number means a greater priority. This means that once task _RunAA_ is done, the next task must be _JudgeAA_ and not _RunAB_ (that would also be a valid linear ordering, but _RunAB_ has a lower priority).

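The ordering rules above can be sketched as a priority-aware topological sort. This is only one possible reading of the description, not the actual isoeval implementation: the function name `order_tasks` and the priorities in the usage example are invented for illustration, while the keys `task-id`, `priority`, and `dependencies` follow the configuration example in this document.

```python
import heapq


def order_tasks(tasks):
    """Return task IDs in execution order.

    `tasks` is a list of dicts in configuration-file order, each with
    "task-id", "priority" and an optional "dependencies" list.
    """
    position = {t["task-id"]: i for i, t in enumerate(tasks)}
    priority = {t["task-id"]: t["priority"] for t in tasks}
    # Number of unfinished dependencies per task.
    pending = {t["task-id"]: len(t.get("dependencies", [])) for t in tasks}
    # Reverse edges: which tasks wait for a given task.
    dependents = {tid: [] for tid in position}
    for t in tasks:
        for dep in t.get("dependencies", []):
            dependents[dep].append(t["task-id"])

    def key(tid):
        # Higher priority first, then earlier position in the file.
        return (-priority[tid], position[tid], tid)

    ready = [key(tid) for tid, count in pending.items() if count == 0]
    heapq.heapify(ready)
    ordered = []
    while ready:
        _, _, tid = heapq.heappop(ready)
        ordered.append(tid)
        for nxt in dependents[tid]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                heapq.heappush(ready, key(nxt))
    if len(ordered) != len(tasks):
        raise ValueError("task graph contains a cycle")
    return ordered


# Hypothetical priorities; the real ones are shown only in the picture.
example = [
    {"task-id": "CompileA", "priority": 4},
    {"task-id": "RunAA", "priority": 3, "dependencies": ["CompileA"]},
    {"task-id": "RunAB", "priority": 2, "dependencies": ["CompileA"]},
    {"task-id": "JudgeAA", "priority": 3, "dependencies": ["RunAA"]},
    {"task-id": "JudgeAB", "priority": 2, "dependencies": ["RunAB"]},
]
print(order_tasks(example))
# ['CompileA', 'RunAA', 'JudgeAA', 'RunAB', 'JudgeAB']
```

With these assumed priorities, _JudgeAA_ is scheduled right after _RunAA_ because it outranks _RunAB_, which matches the behaviour described for the picture.
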
### Task
A task is an atomic piece of work executed by isoeval. There are two basic types of tasks (so far):
- **Execute external process** (optionally inside Isolate). On Linux, running inside Isolate will be mandatory by default; the option exists because of Windows.
- **Perform internal operation**. External processes are meant for compilation, testing, or execution of external judges. Internal operations comprise commands that are typically related to file/directory maintenance and other evaluation management. A few important examples:
  - Create/delete/move/rename file/directory
  - (un)zip/tar/gzip/bzip file(s)
  - fetch a file from the file repository (either from the worker cache, or by downloading it via HTTP GET or SFTP).

Even though the internal operations could be handled by external executables (`mv`, `tar`, `pkzip`, `wget`, ...), it might be better to keep them inside isoeval, as this would simplify these operations and improve their portability across platforms. Furthermore, they are quite easy to implement using common libraries (e.g., _zlib_, _curl_).

**External Tasks**
(some of the properties specified here may also apply to internal tasks -- needs to be determined later)

These tasks are typically executed in Isolate (with given parameters), and isoeval waits until they finish. The exit code determines whether the task succeeded (0) or failed (anything else). A task may be marked as essential; in that case, its failure immediately terminates the whole job.

- **stdin** - can be configured to read from an existing file or from /dev/null.
- **stdout** and **stderr** - can be individually redirected to a file or discarded. Optionally, a copy can be directed to a selected log (for example, a common log for compilations). In any case, the outputs of all tasks are saved in external files (inside the log directory) and optionally included in the job output (see Directories and Files).
- **limits** - a task has time and memory limits; if these limits are exceeded, the task also fails. Additionally, a second set of memory/time limits may be provided (for Isolate) -- then the first limits are "soft limits" (used only to determine whether the task succeeded) and the second limits are hard limits (which actually kill the process).

The task results (exit code, time, and memory consumption) are saved into the global parameter structure (see Parameters And Results).

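The success rule for external tasks can be illustrated with a short sketch. It assumes that the hard limits are enforced by the sandbox itself (a killed process shows up as a non-zero exit code), so only the exit code and the soft limits are checked here. The function and parameter names are hypothetical.

```python
def task_succeeded(exit_code, used_time, used_memory,
                   soft_time=None, soft_memory=None):
    """Decide whether an external task succeeded.

    A task fails on a non-zero exit code or when a soft limit
    (seconds / KB, if given) is exceeded.
    """
    if exit_code != 0:
        return False
    if soft_time is not None and used_time > soft_time:
        return False
    if soft_memory is not None and used_memory > soft_memory:
        return False
    return True


# Finished cleanly but used 5.5 s against a 5 s soft limit: counted as failed.
print(task_succeeded(0, used_time=5.5, used_memory=40000,
                     soft_time=5, soft_memory=50000))  # False
```
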
### Directories and Files
The isoeval job is restricted to operate on several subdirectories; each path used in a task configuration must start with the identifier of one of these directories, and no '..' components are allowed in paths.
- **input** - where the input files (source codes) are prepared
- **output** - anything that is moved/copied to this directory is taken as output of the job and sent back to the frontend (where it is stored)
- **box** - an empty directory that can be used for compilation/evaluation/... (an internal task for cleaning this box may exist). **Every test needs to have a separate subfolder here to avoid sharing data between tests.**
- **log** - the directory where log files are kept and where the outputs of all tasks are copied. Each task has a directory here, named after the task, containing its _stdout_ and _stderr_ files. Tasks cannot access the **log** directory, except that they can produce or redirect output to logs.

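The path restriction above could be checked along the following lines. This is a minimal sketch that assumes the directory identifier is simply the first component of a relative path; the document does not specify the exact syntax, and the special handling of **log** (output redirection only) is not modelled.

```python
from pathlib import PurePosixPath

# Directory identifiers taken from the list above.
ALLOWED_ROOTS = {"input", "output", "box", "log"}


def is_valid_task_path(path):
    """Accept only paths that start with a known directory and contain no '..'."""
    parts = PurePosixPath(path).parts
    if not parts or parts[0] not in ALLOWED_ROOTS:
        return False
    return ".." not in parts


print(is_valid_task_path("box/test01/a.out"))   # True
print(is_valid_task_path("box/../input/x.c"))   # False
print(is_valid_task_path("/etc/passwd"))        # False
```
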
### Configuration
The job configuration that is passed to the worker is generated from two parts:
- **template** - A common template for similar kinds of tasks. It contains almost all instructions: when to fetch, move, or rename files, run commands, judges, ..., as well as task dependencies and priorities. A template can be shared by several problem assignments, or every problem (probably in the compiler class) can have its own.
- **isoeval config** - includes the data for instantiating the template, e.g. input file names, ...

The final configuration for the worker is generated automatically from these two configs.

#### Configuration example
This configuration example is written in YAML and serves only for demonstration purposes. It is therefore not a working example that could be used in real operation. Some items can be omitted, in which case defaults are used.
```
--- # only one document which contains job, aka. list of tasks and some general infos
submission: # information about this particular submission
  job-id: eval_5
  language: "cpp"
  file-collector: "http://localhost:36587"
tasks:
  - task-id: "fetch_input"
    priority: 2
    fatal-failure: true
    cmd:
      bin: "fetch"
      args:
        - "94549d889ae96210ff2a73bd0a5bbe3185f05ff6"
        - "01.in"
  - task-id: "move_test01"
    priority: 3
    fatal-failure: true
    dependencies: # can be omitted if there is no dependencies
      - compile_test01
    cmd:
      bin: "mv"
      args:
        - "recodex.cpp"
        - "/tmp/isoeval/1/eval_5/recodex.cpp"
  - task-id: "eval_test01"
    priority: 4
    fatal-failure: false
    dependencies:
      - move_test01
    cmd:
      bin: "recodex"
      args:
        - "-v"
        - "-f 01.in"
    stdin: "01.in" # can be omitted if there is no binding to stdin
    stdout: "01.out" # can be omitted if there is no binding to stdout
    stderr: "01.err" # can be omitted if there is no binding to stderr
    sandbox: # if defined task is external and will be run in sandbox
      name: "isolate" # mandatory information
      limits: # if not defined, then worker default configuration of limits is loaded
        # anything of the specified limits can be omitted and will be loaded from worker defaults
        - hw-group-id: group1 # determines specific limits for specific machines
          time: 5 # seconds
          wall-time: 6 # seconds
          extra-time: 2 # seconds
          stack-size: 50000 # KB
          memory: 50000 # in KB
          parallel: false # time and memory limits are merged from all potential processes/threads
          disk-blocks: 50
          disk-inodes: 5
          environ-variable:
            ISOLATE_BOX: "/box"
            ISOLATE_TMP: "/tmp"
          chdir: /evaluate
          bound-directories:
            /tmp/isoeval/eval_5: /evaluate
        - hw-group-id: group2 # determines specific limits for specific machines
          time: 6 # seconds
          wall-time: 7 # seconds
          extra-time: 3 # seconds
          memory: 60000 # in KB
          parallel: false # time and memory limits are merged from all potential processes/threads
          disk-blocks: 50
          disk-inodes: 5
...
```

### Parameters And Results
The job may have some input parameters (e.g., the default config for Isolate, global parameters for the tested processes, ...). Similarly, the job has some structured results -- for each task (where applicable), it gathers the exit code and the consumed time and memory.

These parameters are stored in a global, structured parameter space. I would suggest something that maps easily onto JSON, for instance -- i.e., something that supports structures (named collections), arrays (ordered collections), and basic numeric and string values.

Input parameters have two sources: some defaults are present in the configuration of the worker, and another set is provided in the configuration of the job. These sets are merged, with the job config taking priority.

Parameters are only read by the tasks (they can be used in task parameters). Some simple syntax needs to be used for the evaluation of parameter expressions -- e.g., ("${params.tests[1].memoryLimit}"). _Parameters should be stored in the worker's global namespace. Task configuration can make references to this structure. Validity should be checked before executing the first task of the job. The only writable section of this structure is "results", where the time and memory consumed by each task are written. The whole structure is sent to the WebApp together with all logs._
_**TODO:** analysis required -- how complex expressions do we really need_

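The merging and lookup of parameters described above might look roughly like the sketch below. Several details are assumptions, since the document leaves them open: the merge is recursive on nested structures, array indices are zero-based, and parameter names are plain identifiers. The helper names `merge_params`, `lookup`, and `expand` are hypothetical.

```python
import re


def merge_params(worker_defaults, job_params):
    """Merge two parameter trees; values from the job config win."""
    merged = dict(worker_defaults)
    for key, value in job_params.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_params(merged[key], value)
        else:
            merged[key] = value
    return merged


# A path token is either a name or an [index].
_TOKEN = re.compile(r"([A-Za-z_]\w*)|\[(\d+)\]")
# Matches expressions such as ${params.tests[1].memoryLimit}.
_EXPR = re.compile(r"\$\{params\.([^}]+)\}")


def lookup(params, path):
    """Resolve a path like 'tests[1].memoryLimit' in the parameter tree."""
    node = params
    for name, index in _TOKEN.findall(path):
        node = node[name] if name else node[int(index)]
    return node


def expand(text, params):
    """Replace every ${params...} expression in `text` by its value."""
    return _EXPR.sub(lambda m: str(lookup(params, m.group(1))), text)


defaults = {"sandbox": {"name": "isolate", "time": 5},
            "tests": [{"memoryLimit": 1000}, {"memoryLimit": 2000}]}
job = {"sandbox": {"time": 6}}
params = merge_params(defaults, job)
print(params["sandbox"])  # {'name': 'isolate', 'time': 6}
print(expand("--mem=${params.tests[1].memoryLimit}", params))  # --mem=2000
```

A missing name or index raises `KeyError` or `IndexError`, which is one way to perform the validity check before the first task is executed.
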
#### Example result file

```
--- # only one document which contains job, aka. list of tasks and some general infos
job-id: 5
results:
  - task-id: compile1
    status: OK # OK, FAILED
    sandbox_results:
      exitcode: 0
      time: 5 # in seconds
      wall-time: 5 # in seconds
      memory: 50000 # in KB
      max-rss: 50000
      status: RE # two letter status code: OK, RE, SG, TO, XX
      exitsig: 1
      killed:
      message: "Time limit exceeded" # status message
  - task-id: eval1
    status: FAILED
    error_message: "Task failed, something very bad happened!"
    exitcode: 0
  .
  .
  .
...
```

### Logs
There is one general (mandatory) log, where the job progress is logged. Each row corresponds to one task and holds only the task name, the task exit code (or some other indication of whether the task ended OK or not), and optionally things like consumed memory and time.

Other logs (stored in the log directory) can be created. They do not have to be declared in advance; they are specified at each task (if its output is going to a log) and created once some task produces output that goes to the log.

## Case study

We present some of the courses that might use ReCodEx to evaluate homework
assignments and outline the setup of the evaluation with respect to the concept
of stages.

### Simple programming exercises

Examples are introductory programming courses such as Programming I or Java
programming.

In the simplest case we only need one stage that builds the program and passes
the test inputs to its standard input. We will use the C language for this
example. The build command is `gcc source.c`, the test command is `./a.out`.

### Compiler principles

This course uses multiple tools in a pipeline-like fashion - for example `flex`
and `bison`.

We create a stage for each of the steps of this pipeline - we run flex and test
the output, then we run bison and do the same.

### XML technologies

In this course, students choose a topic they model using XML - for example a
library or a bulletin board. During the semester, they expand this project by
adding XSLT transformations, XQuery scripts, XPath queries, etc. These are
tested against fixed requirements (e.g. using some particular language
constructs).

This course already has a rather sophisticated application for testing homework
assignments, so we only include it for demonstration purposes.

Because every assignment focuses on a different technology, we would need a new
type of stage for each one. These stages would only run some checker programs
against the submitted sources (and possibly try to check their syntax etc.).

### Non-procedural programming

This course is different from other programming courses, because it does not
teach input/output manipulation until the end of the semester. In their
assignments, students are mostly required to write a function/predicate that
behaves according to a specification (e.g. appends an item at the end of a list).

Due to this, we need to take the function submitted by a student and combine it
with a snippet of code that reads the standard input and calls the submitted
function. This could be achieved by setting the build command.

### Operating systems

The operating systems course requires students to work on a simple OS kernel
that is then run in a MIPS simulator called `msim`. There are various tests that
check if the student's implementation of core OS mechanisms is correct. These
tests are compiled into the kernel.

Each of these tests could be represented by a stage that compiles the kernel
with the test and then runs it against different configurations of `msim`.