# Overall Architecture
## Components
## Communication
This section gives a detailed overview of communication in the ReCodEx solution. The basic concept is captured in the following image:
![Communication Img](
Red connections go through ZeroMQ sockets, blue ones through WebSockets and green ones through HTTP. All ZeroMQ messages are sent as multipart with one string (command, option) per part, with no empty frames (unless explicitly specified otherwise).
### Internal worker communication
Communication between the two worker threads is split into two separate parts,
each with a dedicated connection line. These internal lines are realized by
ZeroMQ inproc PAIR sockets. In this section we assume that the worker thread
which communicates with the broker is called the _listening thread_ and the other
one, which evaluates incoming jobs, is called the _execution thread_. The _listening
thread_ is the server in both cases (the one that calls the `bind()` method), but
because of how ZeroMQ works, this is not very important (a `connect()` call in
clients can precede the server `bind()` call with no issue).
#### Main communication
Main communication is on `inproc://jobs` sockets. _Listening thread_ is waiting
for any messages (from broker, jobs and progress sockets) and passes incoming
requests to the _execution thread_, which handles them properly.
Commands from _listening thread_ to _execution thread_:
- **eval** - evaluate a job. Requires 3 message frames:
    - `job_id` - identifier of this job (in ASCII representation -- we avoid endianness issues and also support alphabetic ids)
    - `job_url` - URI location of archive with job configuration and submitted source code
    - `result_url` - remote URI where results will be pushed to
Commands from _execution thread_ to _listening thread_:
- **done** - notifying of finished job. Requires 2 message frames:
    - `job_id` - identifier of finished job
    - `result` - response result, possible values below
        - OK - everything ok
        - FAILED - execution failed and cannot be reassigned to another worker (due to error in configuration for example)
        - INTERNAL_ERROR - execution failed due to internal worker error, but another worker can possibly execute this without error
    - `message` - non-empty error description if result was not "OK"
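The frame layout of the two commands above can be sketched in a few lines of Python. This is only an illustration, not the worker's implementation: it assumes the command name travels as the first frame (one string per part, as stated earlier), the helper names `build_eval` and `parse_done` are hypothetical, and the URLs are made up.

```python
from typing import List

RESULTS = {"OK", "FAILED", "INTERNAL_ERROR"}


def build_eval(job_id: str, job_url: str, result_url: str) -> List[bytes]:
    """Return the frames of an eval command: command name plus three arguments."""
    return [b"eval", job_id.encode("ascii"), job_url.encode(), result_url.encode()]


def parse_done(frames: List[bytes]) -> dict:
    """Decode a done command and validate its result value."""
    command, job_id, result, *rest = (frame.decode() for frame in frames)
    if command != "done":
        raise ValueError(f"unexpected command: {command}")
    if result not in RESULTS:
        raise ValueError(f"unknown result: {result}")
    message = rest[0] if rest else ""
    if result != "OK" and not message:
        raise ValueError("a non-OK result needs an error description")
    return {"job_id": job_id, "result": result, "message": message}


# Hypothetical URLs, used only to show the frame order.
frames = build_eval("42", "http://fs.example/a.zip", "http://fs.example/r.zip")
```

With a ZeroMQ binding, such a list of frames would be passed to a multipart send call on the PAIR socket.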
#### Progress callback
Progress messages are sent through `inproc://progress` sockets. This is only one-way communication from the _execution thread_ to the _listening thread_.
- **progress** - notice about evaluation progress. Requires 2 or 4 arguments:
    - `job_id` - identifier of current job
    - `state` - what is happening now.
        - DOWNLOADED - submission successfully fetched from fileserver
        - FAILED - something bad happened and job was not executed at all
        - UPLOADED - results are uploaded to fileserver
        - STARTED - evaluation of tasks started
        - ENDED - evaluation of tasks is finished
        - ABORTED - evaluation of job encountered internal error, job will be rescheduled to another worker
        - FINISHED - whole execution is finished and worker is ready for another job execution
        - TASK - task state changed - see below
    - `task_id` - only present for "TASK" state - identifier of task in current job
    - `task_state` - only present for "TASK" state - result of task evaluation. One of "COMPLETED", "FAILED" and "SKIPPED".
        - COMPLETED - task was successfully executed without any error, subsequent task will be executed
        - FAILED - task ended up with some error, subsequent task will be skipped
        - SKIPPED - some of the previous dependencies failed to execute, so this task won't be executed at all
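The "2 or 4 arguments" rule can be made concrete with a small validator. The following is a sketch under the assumption that the arguments arrive as already decoded strings; the function name `parse_progress` is hypothetical and not part of ReCodEx.

```python
STATES = {"DOWNLOADED", "FAILED", "UPLOADED", "STARTED",
          "ENDED", "ABORTED", "FINISHED", "TASK"}
TASK_STATES = {"COMPLETED", "FAILED", "SKIPPED"}


def parse_progress(args):
    """Validate the arguments of a progress command (2, or 4 for TASK)."""
    if len(args) not in (2, 4):
        raise ValueError("progress takes 2 or 4 arguments")
    job_id, state = args[0], args[1]
    if state not in STATES:
        raise ValueError(f"unknown state: {state}")
    # task_id and task_state are present exactly when the state is TASK.
    if (state == "TASK") != (len(args) == 4):
        raise ValueError("task_id and task_state accompany the TASK state only")
    parsed = {"job_id": job_id, "state": state}
    if state == "TASK":
        if args[3] not in TASK_STATES:
            raise ValueError(f"unknown task state: {args[3]}")
        parsed["task_id"], parsed["task_state"] = args[2], args[3]
    return parsed
```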
### Broker - Worker communication
The broker is the server when communicating with a worker. IP address and port are configurable, the protocol is TCP. The worker socket is of DEALER type, the broker one of ROUTER type. Because of that, the very first part of every (multipart) message from broker to worker must be the target worker's socket identity (which is saved on its **init** command).
Commands from broker to worker:
- **eval** - evaluate a job. See **eval** command in [[Communication#main-communication]]
- **intro** - introduce yourself to the broker (with **init** command) - this is
required when the broker loses track of the worker who sent the command.
Possible reasons for such event are e.g. that one of the communicating sides
shut down and restarted without the other side noticing.
- **pong** - reply to **ping** command, no arguments
Commands from worker to broker:
- **init** - introduce yourself to the broker. Useful on startup or after reestablishing a lost connection. Requires at least 2 arguments:
    - `hwgroup` - hardware group of this worker
    - `header` - additional header describing worker capabilities. Format must be `header_name=value`, every header shall be in a separate message frame. There is no maximum limit on number of headers.
- **done** - job evaluation finished, see **done** command in [[Communication#main-communication]].
- **progress** - evaluation progress report, see **progress** command in [[Communication#progress-callback]]
- **ping** - tell the broker I'm alive, no arguments
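As an illustration of the **init** argument layout, the sketch below splits the arguments into the hardware group and the `header_name=value` pairs. The function name and the header names `env` and `threads` are assumptions made for the example, and collecting repeated header names into lists is a design choice of this sketch, not documented broker behavior.

```python
def parse_init(args):
    """Split init arguments into the hardware group and a header dictionary."""
    if len(args) < 2:
        raise ValueError("init requires a hwgroup and at least one header")
    hwgroup, *header_frames = args
    headers = {}
    for frame in header_frames:
        name, separator, value = frame.partition("=")
        if not separator or not name:
            raise ValueError(f"malformed header: {frame!r}")
        # The same header name may appear in several frames; keep all values.
        headers.setdefault(name, []).append(value)
    return hwgroup, headers


hwgroup, headers = parse_init(["group1", "env=c", "env=python", "threads=4"])
```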
#### Heartbeating
It is important for the broker and workers to know if the other side is still
working (and connected). This is achieved with a simple heartbeating protocol.
The protocol requires the workers to send a **ping** command regularly (the
interval is configurable on both sides - future releases might let the worker
send its ping interval with the **init** command). Upon receiving a **ping**
command, the broker responds with **pong**.
Both sides keep track of missing heartbeating messages since the last one was
received. When this number reaches a threshold (called maximum liveness), the
other side is considered dead.
When the broker decides a worker died, it tries to reschedule its jobs to other workers.
If a worker thinks the broker is dead, it tries to reconnect with a bounded,
exponentially increasing delay.
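The liveness counter and the bounded, exponentially increasing reconnect delay can be sketched as follows. This is a minimal model, not the actual broker or worker code: the class and function names are hypothetical, and doubling the delay on each attempt is an assumption (the text only says the delay grows exponentially and is bounded).

```python
from itertools import islice


class Liveness:
    """Counts heartbeat intervals that passed without a message from the peer."""

    def __init__(self, max_liveness):
        self.max_liveness = max_liveness
        self.missed = 0

    def message_received(self):
        """Any message from the peer resets the counter."""
        self.missed = 0

    def interval_elapsed(self):
        """Record one silent interval; return True once the peer is considered dead."""
        self.missed += 1
        return self.missed >= self.max_liveness


def reconnect_delays(initial, maximum):
    """Yield delays that double after each attempt and never exceed maximum."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, maximum)


first_delays = list(islice(reconnect_delays(1, 8), 5))
```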
### Worker - File Server communication
The worker communicates with the file server only from the _execution thread_. The supported protocol is HTTP, optionally with SSL encryption (**recommended**; you can get a free certificate from Let's Encrypt if you do not have one yet). If supported by the server and the used version of libcurl, the HTTP/2 standard is also available. The file server should be set up to require basic HTTP authentication, and the worker is capable of sending the corresponding credentials with each request.
#### Worker point of view
The worker is capable of 2 things: downloading a file and uploading a file. Internally, the worker uses the libcurl C library with a very similar setup for both. In both cases it can verify the HTTPS certificate (on Linux against the system cert list, on Windows against one downloaded from their website during installation), support basic HTTP authentication, offer HTTP/2 with fallback to HTTP/1.1 and fail on error (returned HTTP status code is >= 400). The worker has a list of credentials to all available file servers in its config file.
- download file - standard HTTP GET request to given URL expecting content as response
- upload file - standard HTTP PUT request to given URL with file data as body - same as command line tool `curl` with option `--upload-file`
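To show what the upload request looks like on the wire, here is a sketch using Python's standard `urllib` instead of libcurl (which the worker actually uses). It only prepares the request and does not send it; the function name, URL and credentials are hypothetical.

```python
import base64
import urllib.request


def build_upload_request(url, data, username, password):
    """Prepare (but do not send) an HTTP PUT with basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode("ascii")
    return urllib.request.Request(
        url,
        data=data,  # file content becomes the request body
        method="PUT",
        headers={"Authorization": f"Basic {token}"},
    )


# Hypothetical server and credentials, for illustration only.
request = build_upload_request(
    "https://files.example.org/results/42.zip", b"zip-bytes", "worker", "secret"
)
```

Passing `request` to `urllib.request.urlopen` would perform the upload; that call raises `urllib.error.HTTPError` for status codes >= 400, which mirrors the fail-on-error behavior described above.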
#### File server point of view
The file server has its own internal directory structure, where all the files are stored. It provides a REST API to get them or create new ones. The file server does not provide authentication or a secured connection by itself; it is supposed to run as a WSGI script inside a web server (like Apache) with proper configuration. The following commands are relevant for communication with the worker:
- **GET /submission_archives/\<id\>.\<ext\>** - gets an archive with submitted source code and corresponding configuration of this job evaluation
- **GET /tasks/\<hash\>** - gets a file, common usage is for input files or reference result files
- **PUT /results/\<id\>.\<ext\>** - upload archive with evaluation results under specified name (should be same _id_ as name of submission archive). On successful upload returns JSON `{ "result": "OK" }` as body of returned page.
If not specified otherwise, `zip` format of archives is used. Symbol `/` in API description is root of file server's domain. If the domain is for example `` with SSL support, getting input file for one task could look as GET request to ``.
### Broker - Monitor communication
Broker communicates with monitor also through ZeroMQ over TCP protocol. Type of
socket is same on both sides, ROUTER. Monitor is set as server in this
communication, its IP address and port are configurable in monitor's config
file. ZeroMQ socket ID (set on monitor's side) is "recodex-monitor" and must be
sent as first frame of every multipart message - see ZeroMQ ROUTER socket
documentation for more info.
Note that the monitor is designed so that it can receive data both from the
broker and workers. The current architecture prefers the broker to do all the
communication so that the workers don't have to know too many network services.
Monitor is treated as a somewhat optional part of the whole solution, so no special
effort was made on communication reliability.
Commands from monitor to broker:
Because there is no need for the monitor to communicate with the broker, there
are no commands so far. Any message from monitor to broker is logged and
discarded.
Commands from broker to monitor:
- **progress** - notification about progress with job evaluation. See [[Communication#progress-callback]] for more info.
### Broker - Frontend communication
Broker communicates with frontend through ZeroMQ connection over TCP. Socket
type on broker side is ROUTER, on frontend part it's REQ. Broker acts as a
server; its IP address and port are configurable in the frontend.
Commands from frontend to broker:
- **eval** - evaluate a job. Requires at least 4 frames:
    - `job_id` - identifier of this job (in ASCII representation -- we avoid endianness issues and also support alphabetic ids)
    - `header` - additional header describing worker capabilities. Format must be `header_name=value`, every header shall be in a separate message frame. There is no maximum limit on number of headers. There may also be no headers at all.
    - empty frame (with empty string)
    - `job_url` - URI location of archive with job configuration and submitted source code
    - `result_url` - remote URI where results will be pushed to
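Because the number of headers is variable, the empty frame is what tells the receiver where the headers end. A possible way to split the arguments is sketched below; the function name, header value and URLs are hypothetical, and the arguments are assumed to be already decoded strings.

```python
def parse_frontend_eval(args):
    """Split eval arguments into job id, headers, job URL and result URL."""
    if len(args) < 4:
        raise ValueError("eval requires at least 4 frames")
    job_id = args[0]
    try:
        # The first empty frame after job_id terminates the header list.
        delimiter = args.index("", 1)
    except ValueError:
        raise ValueError("missing empty frame after headers") from None
    urls = args[delimiter + 1:]
    if len(urls) != 2:
        raise ValueError("expected job_url and result_url after the empty frame")
    return {
        "job_id": job_id,
        "headers": args[1:delimiter],
        "job_url": urls[0],
        "result_url": urls[1],
    }


request = parse_frontend_eval(
    ["42", "env=c", "", "http://fs.example/a.zip", "http://fs.example/r.zip"]
)
```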
Commands from broker to frontend (all are responses to **eval** command):
- **accept** - broker is capable of routing request to a worker
- **reject** - broker can't handle this job (for example when the requirements
specified by the headers cannot be met). There are (rare) cases when the
broker finds that it cannot handle the job after it's been confirmed. In such
cases it uses the frontend REST API to mark the job as failed.
### File Server - Frontend communication
The file server has a REST API for interaction with other parts of ReCodEx. Communication with workers is described in [[Communication#file-server-point-of-view]]. On top of that, there are other commands for interaction with the frontend:
- **GET /results/\<id\>.\<ext\>** - download archive with evaluated results of job _id_
- **POST /submissions/\<id\>** - upload new submission with identifier _id_. Expects that the body of the POST request uses file paths as keys and the content of the files as values. On successful upload returns JSON `{ "archive_path": <archive_url>, "result_path": <result_url> }` in response body. The submission can be downloaded (by a worker) from _archive_path_ and the corresponding evaluation results should be uploaded to _result_path_.
- **POST /tasks** - upload new files, which will be available under names equal to the `sha1sum` of their content. More files can be uploaded at once. On successful upload returns JSON `{ "result": "OK", "files": <file_list> }` in response body, where _file_list_ is a dictionary with the original file name as key and the new URL with the already hashed name as value.
There are no plans yet to support deleting files from this API. This may change in time.
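The content-addressed naming used by **POST /tasks** can be illustrated with Python's `hashlib`. This is a sketch only: the helper names are hypothetical, and returning paths relative to the server root (rather than full URLs) is an assumption of the example.

```python
import hashlib


def task_file_name(content: bytes) -> str:
    """Name of a stored task file: the SHA-1 hex digest of its content."""
    return hashlib.sha1(content).hexdigest()


def describe_upload(files):
    """Build a response body shaped like the one described for POST /tasks.

    `files` maps original file names to their content.
    """
    return {
        "result": "OK",
        "files": {name: "/tasks/" + task_file_name(content)
                  for name, content in files.items()},
    }


response = describe_upload({"1.in": b"abc"})
```

A consequence of this scheme is that uploading the same content twice yields the same name, so identical input files are stored only once.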
### Monitor - Browser communication
The monitor interacts with the browser through a WebSocket connection. The monitor acts as a server and browsers connect to it. IP address and port are also configurable. When a client connects to the monitor, it sends a message with the string representation of the channel id (the channel whose messages it is interested in, usually the id of the evaluated job). There can be at most one listener per channel; a later connection replaces the previous one.
When the monitor receives a "progress" message from the broker, there are two options:
- there is no WebSocket connection for the listed channel (job id) - the message is dropped
- there is an active WebSocket connection for the listed channel - the message is parsed into JSON format (see below) and sent as a string to the browser. Messages for active connections are queued, so no messages are discarded even under heavy workload.
Message JSON format is dictionary with keys:
- **command** - type of progress.
    - DOWNLOADED - submission successfully fetched from fileserver
    - FAILED - something bad happened and job was not executed at all
    - UPLOADED - results are uploaded to fileserver
    - STARTED - evaluation of tasks started
    - ENDED - evaluation of tasks is finished
    - ABORTED - evaluation of job encountered internal error, job will be rescheduled to another worker
    - FINISHED - whole execution is finished and worker is ready for another job execution
    - TASK - task state changed - see below
- **task_id** - id of currently evaluated task. Present only if **command** is "TASK".
- **task_state** - state of task with id **task_id**. Present only if **command** is "TASK". Value is one of "COMPLETED", "FAILED" and "SKIPPED".
    - COMPLETED - task was successfully executed without any error, subsequent task will be executed
    - FAILED - task ended up with some error, subsequent task will be skipped
    - SKIPPED - some of the previous dependencies failed to execute, so this task won't be executed at all
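The routing rule (at most one listener per channel, messages without a listener are dropped) and the JSON message format can be sketched together. This is a simplified model, not the monitor's code: a plain list stands in for a WebSocket connection, and the names `Channels` and `progress_to_json` are hypothetical.

```python
import json


def progress_to_json(state, task_id=None, task_state=None):
    """Build the JSON text sent to the browser for one progress message."""
    message = {"command": state}
    if state == "TASK":
        message["task_id"] = task_id
        message["task_state"] = task_state
    return json.dumps(message)


class Channels:
    """Keeps at most one listener per channel; a later subscriber replaces the earlier one."""

    def __init__(self):
        self._listeners = {}

    def subscribe(self, channel_id, listener):
        self._listeners[channel_id] = listener

    def publish(self, channel_id, text):
        """Deliver text to the channel's listener; return False if it was dropped."""
        listener = self._listeners.get(channel_id)
        if listener is None:
            return False
        listener.append(text)  # a real monitor would write to the WebSocket here
        return True
```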
## Assignments
Assignments are programming tasks that can be tested by a worker after a user
submits their solution. An assignment is described by a YAML file that contains information on how to
build, run and test it. The following text requires knowledge of the basic terminology used by ReCodEx; please check the [separate page](Terminology).
### Basics
A job is a set/list of tasks (it is generally a set, but the order of tasks has some meaning). These tasks may have an arbitrary number of dependencies, which need to be observed. When recodex-worker processes a job, it creates a task graph, where tasks are vertices and dependencies are edges (A -> B means that task A is on the dependency list of task B), and creates its linear ordering. The graph must be acyclic (otherwise a linear ordering does not exist) and the recodex-worker attempts to execute the maximal possible number of tasks. Tasks without dependencies can be executed directly; other tasks are executed when all their dependencies have been successfully completed.
Tasks are executed sequentially, following the linear ordering of the task graph. Parallel tasks (tasks which are not directly dependent, so their relative order may be arbitrary) are ordered first by their priority (higher number => higher priority) and second by their order in the configuration file. Priority is important for specifying evaluation flow. See the sample picture for better understanding.
![Picture of task serialization](
Each task has a unique ID (an alphanumeric string like _CompileA_, _RunAA_, or _JudgeAB_ in the picture). These IDs are used to identify tasks (for dependency references, in the log, ...). The numbers in the bottom right corners are the priorities of the tasks. A higher number means greater priority. It means that when task _RunAA_ is done, the next one must be _JudgeAA_ and not _RunAB_ (that would also be a valid linear ordering, but _RunAB_ has lower priority).
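One way to obtain such an ordering is a topological sort that always picks, among the tasks whose dependencies are already satisfied, the one with the highest priority, breaking ties by position in the configuration. The sketch below illustrates the idea; it is not necessarily the algorithm recodex-worker uses, the function name is hypothetical, and the priorities in the example are invented since the picture is not reproduced here.

```python
import heapq


def order_tasks(tasks):
    """Linearize tasks given as (task_id, priority, dependencies) in config order."""
    index = {task_id: i for i, (task_id, _, _) in enumerate(tasks)}
    priority = {task_id: p for task_id, p, _ in tasks}
    waiting = {task_id: len(deps) for task_id, _, deps in tasks}
    dependents = {task_id: [] for task_id in index}
    for task_id, _, deps in tasks:
        for dep in deps:
            dependents[dep].append(task_id)
    # Heap entries sort by highest priority first, then by config order.
    ready = [(-priority[t], index[t], t) for t in index if waiting[t] == 0]
    heapq.heapify(ready)
    ordered = []
    while ready:
        _, _, task_id = heapq.heappop(ready)
        ordered.append(task_id)
        for successor in dependents[task_id]:
            waiting[successor] -= 1
            if waiting[successor] == 0:
                heapq.heappush(ready, (-priority[successor], index[successor], successor))
    if len(ordered) != len(tasks):
        raise ValueError("task graph contains a dependency cycle")
    return ordered


# Hypothetical priorities: judging outranks running, which outranks compiling.
order = order_tasks([
    ("CompileA", 1, []),
    ("RunAA", 2, ["CompileA"]),
    ("RunAB", 2, ["CompileA"]),
    ("JudgeAA", 3, ["RunAA"]),
    ("JudgeAB", 3, ["RunAB"]),
])
```

With these priorities, _JudgeAA_ is scheduled right after _RunAA_ and before _RunAB_, matching the behavior described above.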
### Task
A task is an atomic piece of work executed by recodex-worker. There are two basic types of tasks:
- **Execute external process** (optionally inside Isolate). On Linux, usage of isolate is mandatory by default; the option exists because of Windows, where no sandbox is currently available.
- **Perform internal operation**. External processes are meant for compilation, testing, or execution of external judges. Internal operations comprise commands which are typically related to file/directory maintenance and other evaluation management. A few important examples:
    - Create/delete/move/rename file/directory
    - (un)zip/tar/gzip/bzip file(s)
    - fetch a file from the file repository (either from worker cache or download it by HTTP GET or through SFTP).

Even though the internal operations may be handled by external executables (`mv`, `tar`, `pkzip`, `wget`, ...), it might be better to keep them inside the recodex-worker, as it would simplify these operations and their portability among platforms. Furthermore, it is quite easy to implement them using common libraries (e.g., _zlib_, _curl_).
#### Internal tasks
**Archivate task** can be used to pack and compress a directory. The calling command is `archivate`. It requires two arguments:
- path and name of the directory to be archived
- path and name of the target archive. Only `.zip` format is supported.
**Extract task** is the opposite of the archivate task. It can extract different types of archives. Supported formats are the same as those supported by the `libarchive` library (see the libarchive wiki), mainly `zip`, `tar`, `tar.gz`, `tar.bz2` and `7zip`. Please note that the system administrator may not have installed all the needed packages, so some formats may not work. Please consult your system administrator for more information. Archives may contain only regular files or directories (i.e. no symlinks, block and character devices, sockets or pipes are allowed). The calling command is `extract` and it requires two arguments:
- path and name of the archive to extract
- directory where the archive will be extracted
**Fetch task** will give you a file. It can be downloaded from a remote file server or just copied from the local cache if available. The calling command is `fetch` with two arguments:
- name of the requested file without path
- path and name of the destination. Providing a different destination name can be used for easy renaming.
**Copy task** can copy files and directories. Detailed info can be found on the reference page of `boost::filesystem::copy`. The calling command is `cp` and it requires two arguments:
- path and name of source target
- path and name of destination target

**Make directory task** can create an arbitrary number of directories. The calling command is `mkdir` and it requires at least one argument. For each provided argument, `boost::filesystem::create_directories` is called.

**Rename task** will rename files and directories. Detailed behavior can be found on the reference page of `boost::filesystem::rename`. The calling command is `rename` and it requires two arguments:
- path and name of source target
- path and name of destination target

**Remove task** is for deleting files and directories. The calling command is `rm` and it requires at least one argument. For each provided argument, `boost::filesystem::remove_all` is called.
#### External tasks
These tasks are typically executed in isolate (with given parameters) and the `recodex-worker` waits until they finish. The exit code determines whether the task succeeded (0) or failed (anything else). A task may be marked as essential; in such a case, its failure will immediately cause termination of the whole job.
- **stdin** - can be configured to read from an existing file or from `/dev/null`.
- **stdout** and **stderr** - can be individually redirected to a file or discarded. If these output options are specified, then it is possible to upload output files with results by copying them into the result directory.
- **limits** - a task has time and memory limits; if these limits are exceeded, the task also fails.

The task results (exit code, time, memory consumption, etc.) are saved into the result YAML file and sent back to the frontend application, to the address which was specified on input.
#### Judges
Judges are treated as normal external commands, so there is no special task for them. They should be used for comparison of output files from execution tasks with sample outputs. The result of this comparison should be at least the information whether the files are the same or not. An extension of this is a percentage result based on the similarity of the given files.
All packaged judges are adopted from the old Codex with only very small modifications. The ReCodEx judges base directory is in the `${JUDGES_DIR}` variable, which can be used in the job config file.
##### Judges interface
For future extensibility it is **critical** that judges share a common **interface** of calling and return values.
- Parameters: there are two mandatory positional parameters, which have to be the files for comparison
- Results:
    - _everything OK_
        - exitcode: 0
        - stdout: there is one line with a double value, which should be the percentage of similarity of the two given files
    - _error during execution_
        - exitcode: 1
        - stderr: there should be a description of the error
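A judge conforming to this interface can be very small. The sketch below is a hypothetical example, not one of the packaged judges: it reports similarity as 1.0 when the whitespace-separated tokens of both files are identical and 0.0 otherwise (the exact scale of the similarity value is an assumption here), and it returns the exit code from `main` instead of exiting so that it can be called from other code.

```python
import sys


def similarity(expected_text, actual_text):
    """Return 1.0 when the whitespace-separated tokens match, otherwise 0.0."""
    return 1.0 if expected_text.split() == actual_text.split() else 0.0


def main(argv):
    """Compare the two files named in argv[1:3]; return the process exit code."""
    if len(argv) != 3:
        print("usage: judge <file1> <file2>", file=sys.stderr)
        return 1
    try:
        with open(argv[1]) as first, open(argv[2]) as second:
            print(similarity(first.read(), second.read()))
    except OSError as error:
        print(error, file=sys.stderr)
        return 1
    return 0
```

As a standalone script, it would end with `sys.exit(main(sys.argv))`.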
##### ReCodEx judges
Below is a list of judges which are packaged with the ReCodEx project and comply with the above requirements.

**recodex-judge-normal** is the base judge used by most exercises. This judge compares two text files. It compares only text tokens, regardless of the amount of whitespace between them.

Usage: `recodex-judge-normal [-r | -n | -rn] <file1> <file2>`
- file1 and file2 are paths to the files that will be compared
- switch options `-r` and `-n` can be specified as an optional first argument.
    - `-n` judge will treat newlines as ordinary whitespace (it will ignore line breaks)
    - `-r` judge will treat tokens as real numbers and compare them accordingly (with some amount of error)

**recodex-judge-filter** can be used to preprocess output files before real judging. This judge filters C-like comments from a text file. A comment starts with a double slash sequence (`//`) and finishes with a newline. If the comment takes a whole line, then the whole line is filtered.

Usage: `recodex-judge-filter [inputFile [outputFile]]`
- if `outputFile` is omitted, standard output is used instead.
- if both files are omitted, the application uses standard input and output.

**recodex-judge-shuffle** is for judging shuffled files. This judge compares two text files and returns 0 if they match (and 1 otherwise). The two files are compared with no regard for whitespace (whitespace acts just as a token delimiter).

Usage: `recodex-judge-shuffle [-[n][i][r]] <file1> <file2>`
- `-n` ignore newlines (newline is considered only a whitespace)
- `-i` ignore order of items on the row (tokens on each row may be permuted)
- `-r` ignore order of rows (rows may be permuted); this option has no effect when `-n` is used
### Job configuration
The configuration of a job which is passed to the worker is generated on demand by the web API. Each job has its own unique configuration.
#### Configuration items
Mandatory items are bold, optional italic.
- **submission** - information about this particular submission
    - **job-id** - textual ID which should be unique in the whole ReCodEx
    - **language** - no specific function, just for debugging and clarity
    - **file-collector** - address from which fetch tasks will download data
    - _log_ - default is false, can be omitted, determines whether job execution will be logged into one shared log
- **tasks** - list (not map) of individual tasks
    - **task-id** - unique identifier of task in scope of one submission
    - **priority** - higher number, higher priority
    - **fatal-failure** - if true, then execution of the whole job will be stopped after failure of this task
    - **dependencies** - list of dependencies which have to be fulfilled before this task, can be omitted if there are no dependencies
    - **cmd** - description of command which will be executed
        - **bin** - the binary itself (full path of external command or name of internal task)
        - _args_ - list of arguments which will be sent into execution unit
    - _test-id_ - ID of the test this task is part of - must be specified for tasks which the particular test's result depends on
    - _type_ - type of the task, can be omitted, default value is _inner_ - possible values are: _inner_, _initialisation_, _execution_, _evaluation_
    - _sandbox_ - wrapper for external tasks which will run in sandbox, if defined the task is automatically external
        - **name** - name of used sandbox
        - _stdin_ - file from which standard input will be redirected, can be omitted
        - _stdout_ - file to which standard output will be redirected, can be omitted
        - _stderr_ - file to which error output will be redirected, can be omitted
        - **limits** - list of limits which can be passed to sandbox
            - **hw-group-id** - determines specific limits for specific machines
            - _time_ - time of execution in seconds
            - _wall-time_ - wall time in seconds
            - _extra-time_ - extra time which will be added to execution
            - _stack-size_ - size of stack of executed program in kilobytes
            - _memory_ - overall memory limit for application in kilobytes
            - _parallel_ - integral number of processes which can run simultaneously, time and memory limits are merged from all potential processes/threads
            - _disk-size_ - size of all io operations from/to files in kilobytes
            - _disk-files_ - number of files which can be opened
            - _environ-variable_ - wrapper for map of environmental variables, union with default worker configuration
            - _chdir_ - this will be working directory of executed application
            - _bound-directories_ - list of structures representing directories which will be visible inside sandbox, union with default worker configuration
                - **src** - source pointing to actual system directory
                - **dst** - destination inside sandbox which can have its own filesystem binding
                - **mode** - determines connection mode of specified directory, one of values: RW, NOEXEC, FS, MAYBE, DEV
#### Configuration example
This configuration example is written in YAML and serves only for demonstration purposes. Therefore it is not a working example which can be used in real traffic. Some items can be omitted and defaults will be used.

    --- # only one document which contains job, aka. list of tasks and some general infos
    submission: # happy hippoes fence
        job-id: hippoes
        language: c
        file-collector: http://localhost:9999/tasks
        log: true
    tasks:
        - task-id: "compilation"
          priority: 2
          fatal-failure: true
          cmd:
              bin: "/usr/bin/gcc"
              args:
                  - "solution.c"
                  - "-o"
                  - "a.out"
          sandbox:
              name: "isolate"
              limits:
                  - hw-group-id: group1
                    parallel: 0
                    chdir: ${EVAL_DIR}
                    bound-directories:
                        - src: ${SOURCE_DIR}
                          dst: ${EVAL_DIR}
                          mode: RW
        - task-id: "fetch_test_1"
          priority: 4
          fatal-failure: false
          dependencies:
              - compilation
          cmd:
              bin: "fetch"
              args:
                  - ""
                  - "${SOURCE_DIR}/"
        - task-id: "evaluation_test_1"
          priority: 5
          fatal-failure: false
          dependencies:
              - fetch_test_1
          cmd:
              bin: "a.out"
          sandbox:
              name: "isolate"
              limits:
                  - hw-group-id: group1
                    time: 0.5
                    memory: 8192
                    chdir: ${EVAL_DIR}
                    bound-directories:
                        - src: ${SOURCE_DIR}
                          dst: ${EVAL_DIR}
                          mode: RW
        - task-id: "fetch_test_solution_1"
          priority: 6
          fatal-failure: false
          dependencies:
              - evaluation_test_1
          cmd:
              bin: "fetch"
              args:
                  - "1.out"
                  - "${SOURCE_DIR}/1.out"
        - task-id: "judging_test_1"
          priority: 7
          fatal-failure: false
          dependencies:
              - fetch_test_solution_1
          cmd:
              bin: "${JUDGES_DIR}/recodex-judge-normal"
              args:
                  - "1.out"
                  - "plot.out"
          sandbox:
              name: "isolate"
              limits:
                  - hw-group-id: group1
                    parallel: 0
                    chdir: ${EVAL_DIR}
                    bound-directories:
                        - src: ${SOURCE_DIR}
                          dst: ${EVAL_DIR}
                          mode: RW
        - task-id: "rm_junk_test_1"
          priority: 8
          fatal-failure: false
          dependencies:
              - judging_test_1
          cmd:
              bin: "rm"
              args:
                  - "${SOURCE_DIR}/"
                  - "${SOURCE_DIR}/plot.out"
                  - "${SOURCE_DIR}/1.out"
### Job variables
Because the frontend does not know which worker gets the job, it is necessary to be a little general in the configuration file. This means that some worker-specific things have to be transparent. A good example is directories, which can be placed wherever the worker wants. Variables were established for this purpose. There are of course some restrictions on where variables can be used: basically, variables can be used wherever filesystem paths can be used.
Usage of variables in the configuration is simple and shell-like. The name of the variable is put inside braces which are preceded by a dollar sign, so real usage looks like this: `${VAR}`. There should be no quotes or apostrophes around the variable name, just simple text in braces. Parsing is simple: whenever there is a dollar sign followed by braces, the job execution unit automatically assumes that this is a variable, so it is not possible to use such a substring literally.
List of usable variables in job configuration:
- **WORKER_ID** - integral identification of worker, unique on server
- **JOB_ID** - identification of this job
- **SOURCE_DIR** - directory where source codes of job are stored
- **EVAL_DIR** - evaluation directory which should point inside the sandbox. Note that some existing directory must be bound inside the sandbox under the **EVAL_DIR** name using the _bound-directories_ directive inside the limits section.
- **RESULT_DIR** - results from job can be copied here, but only with internal task
- **TEMP_DIR** - general temp directory which is not dependent on operating system
- **JUDGES_DIR** - directory in which judges are stored (outside sandbox)
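The substitution rule can be sketched with a regular expression. This is an illustration only: the function name is hypothetical, the directory values are invented, and raising `KeyError` for an unknown variable is a choice of this sketch rather than documented worker behavior.

```python
import re

# A dollar sign followed by a name in braces, e.g. ${SOURCE_DIR}
VARIABLE = re.compile(r"\$\{([^}]*)\}")


def expand(text, variables):
    """Replace every ${NAME} in text; an unknown name raises KeyError."""
    return VARIABLE.sub(lambda match: variables[match.group(1)], text)


# Hypothetical values; real ones are chosen by each worker.
variables = {
    "SOURCE_DIR": "/var/recodex/submission/1/hippoes",
    "EVAL_DIR": "/evaluate",
}
path = expand("${SOURCE_DIR}/1.out", variables)
```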
### Directories and Files
For each job execution a unique directory structure is created. A job is not restricted to the specified directories (tasks can do whatever is allowed on the system), but it is advised to use them inside the job. The DEFAULT variable below represents the worker's working directory specified in each worker's configuration. No variable of this name is defined for use in the job YAML configuration.
Inside this directory temporary files for job execution are created:
- **${DEFAULT}/downloads/${WORKER_ID}/${JOB_ID}** - where the downloaded archive is saved
- **${DEFAULT}/submission/${WORKER_ID}/${JOB_ID}** - decompressed submission is stored here
- **${DEFAULT}/eval/${WORKER_ID}/${JOB_ID}** - this directory is accessible in job configuration using variables and all execution should happen here
- **${DEFAULT}/temp/${WORKER_ID}/${JOB_ID}** - directory where all sort of temporary files can be stored
- **${DEFAULT}/results/${WORKER_ID}/${JOB_ID}** - again a directory accessible from job configuration, which is used to store all files which will be uploaded to the fileserver; usually there will be only the YAML result file and optionally the log, every other file has to be copied here explicitly from the job
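The layout above follows one pattern, which the short sketch below spells out. The function name and the example values are hypothetical; the worker itself builds these paths internally.

```python
from pathlib import PurePosixPath

PURPOSES = ("downloads", "submission", "eval", "temp", "results")


def job_directories(default, worker_id, job_id):
    """Map each purpose to <default>/<purpose>/<worker_id>/<job_id>."""
    return {
        purpose: PurePosixPath(default, purpose, str(worker_id), job_id)
        for purpose in PURPOSES
    }


# Hypothetical working directory, worker id and job id.
directories = job_directories("/var/recodex", 1, "hippoes")
```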
### Results
Results of tasks are sent back in YAML format compressed into archive. This archive can contain further files, such as job logging information and files which were explicitly copied into results directory.
Results file contains job identification and results of individual tasks.
#### Results items
Mandatory items are bold, optional italic.
- **job-id** - identification of the job to which these results belong
- _error_message_ - present only if the whole execution failed and none of the tasks were executed
- **results** - list of tasks results
    - **task-id** - unique identification of task in scope of this job
    - **status** - three states: OK, FAILED, SKIPPED
    - _error_message_ - defined only in internal tasks on failure
    - _sandbox_results_ - if defined, then this task was external and was run in sandbox
        - **exitcode** - integer which executed program gave on exit
        - **time** - time in seconds in which program exited
        - **wall-time** - wall time in seconds
        - **memory** - how much memory program used in kilobytes
        - **max-rss** - maximum resident set size used in kilobytes
        - **status** - two letter status code: OK, RE, SG, TO, XX
        - **exitsig** - description of exit signal
        - **killed** - boolean determining if program exited correctly or was killed
        - **message** - status message on failure
#### Example result file
    --- # only one document which contains list of results
    job-id: 5
    results:
        - task-id: compile1
          sandbox_results:
              exitcode: 0
              time: 5 # in seconds
              wall-time: 5 # in seconds
              memory: 50000 # in KB
              max-rss: 50000
              status: RE # two letter status code: OK, RE, SG, TO, XX
              exitsig: 1
              killed: true
              message: "Time limit exceeded" # status message
        - task-id: eval1
          status: FAILED
          error_message: "Task failed, something very bad happened!"
### Scoring
Every assignment consists of tasks. However, only some tasks are part of the evaluation. Those evaluated tasks are grouped into **tests**. Each task might be assigned a _test-id_ parameter, as described above. Every test must consist of at least two tasks: execution and evaluation by a judge. The former retrieves information about the execution, such as elapsed time and memory consumed; the latter results in a score - a float between 0 and 1.
The total resulting score of the assignment submission is then calculated according to a supplied score config (described below). The total score is also a float between 0 and 1. This number is then multiplied by the maximum number of points awarded for the assignment by the teacher assigning the exercise - not the assignment author.
#### Simple score calculation
In the first stage of development, a simple score calculation is used. It will most probably be replaced by a more advanced algorithm in the near future.
The simple score calculation only looks at the score of each test. In the score config, the author of the assignment must specify the weight of each test. The resulting score is the sum of the products of score and weight over all tests, divided by the sum of all weights. The algorithm in Python would look something like this:
```python
# 'tests' is a collection of objects with 'score' and 'weight' attributes
total = weight_sum = 0
for t in tests:
    total += t.score * t.weight
    weight_sum += t.weight
score = total / weight_sum
```
Sample score config in YAML format:
```yaml
a: 300 # test with id 'a' has a weight of 300
b: 200
c: 100
d: 100
```
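
To make the calculation concrete, here is a small self-contained sketch that applies the weights from the sample config above. The function names `simple_score` and `awarded_points`, the per-test scores and the maximum of 10 points are assumptions chosen for this example only.

```python
def simple_score(test_scores, weights):
    """Weighted average of test scores; both dicts are keyed by test id."""
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("sum of weights must be positive")
    total = sum(test_scores[test_id] * weight
                for test_id, weight in weights.items())
    return total / weight_sum


def awarded_points(score, max_points):
    """Scale the total score (0 to 1) by the teacher's maximum of points."""
    return score * max_points


weights = {"a": 300, "b": 200, "c": 100, "d": 100}      # sample score config
test_scores = {"a": 1.0, "b": 0.5, "c": 0.0, "d": 1.0}  # hypothetical judge output

score = simple_score(test_scores, weights)  # (300 + 100 + 0 + 100) / 700 = 5/7
print(round(score, 3))                      # prints 0.714
print(round(awarded_points(score, 10), 2))  # prints 7.14
```

Because the weights are normalized by their sum, only their ratios matter: weights of 3, 2, 1 and 1 would give the same result.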
#### Logs
During execution, all tasks share a single log. Separate logs would bring no benefit, because only a small amount of information is logged. Logging is disabled by default and can be enabled in the job configuration; all actions logged by the tasks are then visible in this log.
After execution, the log is packed and sent back to the fileserver, where it can be processed further.
### Case study
We present some of the courses that might use ReCodEx to evaluate homework
assignments and outline the setup of the evaluation with respect to the concept
of stages.
#### Simple programming exercises
Examples are introductory programming courses such as Programming I or Java.
In the simplest case we only need one stage that builds the program and passes
the test inputs to its standard input. We will use the C language for this
example. The build command is `gcc source.c`, the test command is `./a.out`.
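
The single-stage setup can be sketched in a few lines of Python. This is only an illustration of the build-then-test idea: the `Stage` class and the `run_stage` function are hypothetical names, and the sketch runs the commands directly instead of inside a sandbox as a real evaluation would.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class Stage:
    """Hypothetical description of one stage: how to build and how to test."""
    build_command: list
    test_command: list


def run_stage(stage, test_input, workdir="."):
    """Build the program, then pass test_input to its standard input."""
    subprocess.run(stage.build_command, cwd=workdir, check=True)
    completed = subprocess.run(
        stage.test_command,
        cwd=workdir,
        input=test_input,
        capture_output=True,
        text=True,
    )
    return completed.returncode, completed.stdout


# The C example from the text: build with gcc, then run the produced binary.
c_stage = Stage(build_command=["gcc", "source.c"], test_command=["./a.out"])
```

Calling `run_stage(c_stage, "1 2\n", workdir=...)` in a directory containing `source.c` would return the exit code and the captured standard output, which a judge could then compare with the expected output.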
#### Compiler principles
This course uses multiple tools in a pipeline-like fashion - for example `flex`
and `bison`.
We create a stage for each of the steps of this pipeline - we run flex and test
the output, then we run bison and do the same.
#### XML technologies
In this course, students choose a topic they model using XML - for example a
library or a bulletin board. During the semester, they expand this project by
adding XSLT transformations, XQuery scripts, XPath queries, etc. These are
tested against fixed requirements (e.g. using some particular language).
This course already has a rather sophisticated application for testing homework
assignments, so we only include it for demonstration purposes.
Because every assignment focuses on a different technology, we would need a new
type of stage for each one. These stages would only run some checker programs
against the submitted sources (and possibly try to check their syntax etc.).
#### Non-procedural programming
This course is different from other programming courses, because it only teaches
input/output manipulation by the end of the semester. In their assignments,
students are mostly required to write a function/predicate that behaves
according to a specification (e.g. appends an item at the end of a list).
Due to this, we need to take the function submitted by a student and combine it
with a snippet of code that reads the standard input and calls the submitted
function. This could be achieved by setting the build command.
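
A minimal sketch of this idea follows. Python stands in for the language taught in the course, and `DRIVER`, `build_program`, `run_program` and the `append_item` function are hypothetical names invented for the example: the submitted function is concatenated with a driver snippet that reads standard input and calls it.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Driver snippet supplied by the exercise author (assumed for this example):
# the last word on standard input is appended to the list of preceding words.
DRIVER = '''
import sys
items = sys.stdin.read().split()
print(" ".join(append_item(items[:-1], items[-1])))
'''

# What a student might submit: only the required function, no input/output.
SUBMISSION = '''
def append_item(items, item):
    return items + [item]
'''


def build_program(submission, driver=DRIVER):
    """The 'build' step: put the driver right after the submitted code."""
    return submission.rstrip() + "\n" + driver


def run_program(source, test_input):
    """Run the combined program with test_input on its standard input."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "program.py"
        path.write_text(source)
        done = subprocess.run(
            [sys.executable, str(path)],
            input=test_input, capture_output=True, text=True,
        )
    return done.stdout


print(run_program(build_program(SUBMISSION), "a b c"), end="")  # prints a b c
```

The same approach applies to other languages: only the driver snippet and the command that runs the combined source change.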
#### Operating systems
The operating systems course requires students to work on a simple OS kernel
that is then run in a MIPS simulator called `msim`. There are various tests that
check if the student's implementation of core OS mechanisms is correct. These
tests are compiled into the kernel.
Each of these tests could be represented by a stage that compiles the kernel
with the test and then runs it against different configurations of `msim`.
## Coding style
Every project should have a consistent coding style which all contributors follow. Below you can find the conventions we agreed on and try to keep.
### C++
**NOTE: C++ projects have a code formatter (`clang-format`) set up with a custom format. To reformat the code, run `make format` inside the `build` directory of the project (this probably does not work on Windows).** For a quick introduction to our format, see the following paragraphs.
The worker and the queue manager are written in C++. Generally, the underscore style with all lowercase letters is used. The style is inspired by the [Google C++ style guide]( If something is not defined here, naming and formatting can be arbitrary, but should be similar to the conventions defined below.
#### Naming convention
* Source file names are all lower case with underscores, not dashes. Header files should end with `.h` and C++ files with `.cpp`.
* Type names are all lower case with underscores between words. This applies to classes, structs, typedefs, enums and type template parameters.
* Variable names are divided into local variables and class members. Local variables are all lower case with underscores between words. Class members additionally have a trailing underscore (struct data members do not have the trailing underscore).
* Constants are named just like any other variables and have no special rules.
* All function names are again all lower case with underscores between words.
* Namespaces, if there are any, should be lower case with underscores.
* Macros follow the classical convention: all capitals with underscores.
* Comments are of two types: documentation comments and ordinary comments in code. Documentation comments start with `/**` and end with `*/`, and use the Javadoc format inside. Ordinary comments in code are one-liners which start with `//` and end with the end of the line.
#### Formatting convention
* Line length is not explicitly defined, but should be reasonable.
* All files should use UTF-8 character set.
* For code indentation tabs (`\t`) are used.
* Function declaration/definition: the return type should be on the same line as the rest of the declaration; if the line is too long, the parameters are placed on a new line. The opening brace of the function body should be placed on a new line below the declaration. Small functions can be written on a single line. There should be one space after each comma separating the parameters.
```cpp
int run(int id, string msg);

void print_hello_world()
{
	std::cout << "Hello world" << std::endl;
}

int get_five() { return 5; }
```
* Lambda expressions: same formatting as classical functions
```cpp
auto hello = [](int x) { std::cout << "hello_" << x << std::endl; };
```
* Function calls: basically same as function header definition.
* Condition: after `if` or `else` there always has to be one space in front of the opening parenthesis of the condition, and again one space after its closing parenthesis (in front of the opening brace). `if` and `else` should always be on separate lines. Inside the condition there should not be any pointless spaces.
```cpp
if (x == 5) {
	std::cout << "Exactly five!" << std::endl;
} else if (x < 5 && y > 5) {
	std::cout << "Whoa, that is weird format!" << std::endl;
} else {
	std::cout << "I dont know what is this!" << std::endl;
}
```
* `for` and `while` loops: basically the same rules as for `if` conditions.
* Try-catch blocks: again the same rules as for `if` conditions. The closing brace of the `try` block should be on the same line as the `catch` block.
```cpp
try {
	int a = 5 / 0;
} catch (...) {
	std::cout << "Division by zero" << std::endl;
}
```
* Switch: again the basics are the same as for `if` conditions. Case statements should not be indented and the case body should be indented with 1 tab.
```cpp
switch (switched) {
case 0: // no tab indent
	... // 1 tab indent
	break;
case 1:
	...
}
```
* Pointers and references: no spaces around the period or arrow when accessing a member. No spaces after the asterisk or ampersand. In a pointer or reference declaration, the asterisk or ampersand is adjacent to the name of the variable, not to the type.
```cpp
number = *ptr;
ptr = &val;
number = ptr->number;
number = val_ref.number;

int *i;
int &j;

// bad format below
int* i;
int * i;
```
* Boolean expression: a long boolean expression should be divided into several lines. The division point should always be after a logical operator.
```cpp
if (i > 10 &&
	j < 10 &&
	k > 20) {
	std::cout << "We're here!" << std::endl;
}
```
* Return values should generally not be wrapped in parentheses, only if needed.
* Preprocessor directives start with `#` and should always start at the beginning of the line.
* Classes: access sections (`public`, `protected`, `private`) should have the same indentation as the class declaration itself. The opening brace of the class should be on the same line as the class name.
```cpp
class my_class {
public:
	void class_function();

private:
	int class_member_;
};
```
* Operators: there should always be spaces around all binary operators.
```cpp
int x = 5;
x = x * 5 / 5;
x = x + 5 * (10 - 5);
```
### Python
Python code should correspond to [PEP 8]( style.
### PHP
