Job configuration update

master
Petr Stefan 8 years ago
parent bf77af9fad
commit 167422a9b3

# Job configuration

The following description covers the configuration as seen from the API point
of view; the worker may treat some optional items as mandatory (they are filled
in by the API automatically). Bold items in the lists describing the values are
mandatory, italic ones are optional.

## Variables

Because the frontend does not know which worker gets the job, it is necessary
to keep the configuration file somewhat general. This means that some
worker-specific details have to be transparent. A good example is that some
(evaluation) directories may be placed differently across workers. To provide
a solution, variables were established. There are some restrictions on where
variables can be used: basically, they may appear wherever filesystem paths
are used.

Usage of variables in the configuration is simple and shell-like. The name of
a variable is put inside braces preceded by a dollar sign, so real usage looks
like `${VAR}`. There should be no quotes or apostrophes around the variable
name, just plain text in braces. Parsing is simple: whenever a dollar sign
with braces appears, the job execution unit automatically assumes it is a
variable, so this kind of substring cannot occur anywhere else.
List of usable variables in a job configuration:

- **WORKER_ID** -- integral identification of the worker, unique on a server
- **JOB_ID** -- identification of this job
- **SOURCE_DIR** -- directory where the source codes of the job are stored
- **EVAL_DIR** -- evaluation directory which should point inside the sandbox.
  Note that some existing directory must be bound inside the sandbox under the
  **EVAL_DIR** name using the _bound-directories_ directive inside the limits
  section.
- **RESULT_DIR** -- results from the job can be copied here, but only with the
  internal copy task
- **TEMP_DIR** -- general temporary directory which is not dependent on the
  operating system
- **JUDGES_DIR** -- directory in which judges are stored (outside the sandbox)
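For illustration, a variable can stand in for a path in a task definition. This
is only a hypothetical fragment -- the task ID and file names are made up, and
the directive names follow the configuration items described later in this
document:

```yaml
# Hypothetical fragment of a job configuration; the worker expands
# ${SOURCE_DIR} and ${EVAL_DIR} before executing the task.
- task-id: "copy_solution"
  cmd:
    bin: "cp"
    args:
      - "${SOURCE_DIR}/solution.c"
      - "${EVAL_DIR}/solution.c"
```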

## Tasks

A task is an atomic piece of work executed by the worker. There are two basic
types of tasks:

- **Execute external process** (optionally inside isolate). External processes
  are meant for compilation, testing, or execution of judges. On Linux the
  usage of the isolate sandbox is mandatory; the non-sandboxed option is
  present because of Windows, where no sandbox is currently available.
- **Perform internal operation**. Internal operations comprise commands which
  are typically related to file/directory maintenance and other evaluation
  management. A few important examples:
A task may be marked as essential; in such a case, its failure will
immediately cause termination of the whole job. A nice example is a task with
program compilation: without its success, it is obvious that the job is broken
and every test would fail anyway.

### Internal tasks

- **Archivate task** can be used for packing and compressing a directory. The
  calling command is `archivate`. It requires two arguments:
  - path and name of the directory to be archived
  - path and name of the target archive. Only the `.zip` format is supported.
- **Extract task** is the opposite of the archivate task. It can extract
  different types of archives. The supported formats are the same as those
  supported by the `libarchive` library (see the
  [libarchive wiki](https://github.com/libarchive/libarchive/wiki)), mainly
  `zip`, `tar`, `tar.gz`, `tar.bz2` and `7zip`. Please note that the system
  administrator may not have installed all the needed packages, so some
  formats may not be accessible; consult your system administrator for more
  information. Archives may contain only regular files or directories (no
  symlinks, block and character devices, sockets or pipes are allowed). The
  calling command is `extract` and it requires two arguments:
  - path and name of the archive to extract
  - directory where the archive will be extracted
- **Fetch task** will get a file. It can be downloaded from a remote
  fileserver or just copied from the local cache if available. The calling
  command is `fetch` with two arguments:
  - name of the requested file without a path (file sources are set up in the
    worker configuration file)
  - path and name at the destination. Providing a different destination name
    can be used for an easy rename.
- **Copy task** can copy files and directories. Detailed info can be found on
  the reference page of
  [boost::filesystem::copy](http://www.boost.org/doc/libs/1_60_0/libs/filesystem/doc/reference.html#copy).
  The calling command is `cp` and it requires two arguments:
  - path and name of the source target
  - path and name of the destination target
- **Make directory task** can create an arbitrary number of directories. The
  calling command is `mkdir` and it requires at least one argument. For each
  provided argument, the
  [boost::filesystem::create_directories](http://www.boost.org/doc/libs/1_60_0/libs/filesystem/doc/reference.html#create_directories)
  command will be called.
- **Rename task** will rename files and directories. Detailed behavior can be
  found on the reference page of
  [boost::filesystem::rename](http://www.boost.org/doc/libs/1_60_0/libs/filesystem/doc/reference.html#rename).
  The calling command is `rename` and it requires two arguments:
  - path and name of the source target
  - path and name of the destination target
- **Remove task** is for deleting files and directories. The calling command
  is `rm` and it requires at least one argument. For each provided argument,
  the
  [boost::filesystem::remove_all](http://www.boost.org/doc/libs/1_60_0/libs/filesystem/doc/reference.html#remove_all)
  command will be called.
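As a hypothetical sketch of how internal tasks chain together (the file names
and task IDs are made up; the directive names follow the configuration items
described later), a fetch followed by an extract might look like this:

```yaml
# Hypothetical fragment: download an archive with test data,
# then unpack it; the extract task depends on the fetch task.
- task-id: "fetch_tests"
  cmd:
    bin: "fetch"
    args:
      - "tests.zip"                # file name known to the file-collector
      - "${SOURCE_DIR}/tests.zip"  # destination path and name
- task-id: "extract_tests"
  dependencies:
    - "fetch_tests"
  cmd:
    bin: "extract"
    args:
      - "${SOURCE_DIR}/tests.zip"
      - "${SOURCE_DIR}/tests"
```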

### External tasks

External tasks are arbitrary executables, typically run inside isolate (with
the given parameters), and the worker waits until they finish. The exit code
determines whether the task succeeded (0) or failed (anything else).
There are several additional options:

- **stdin** -- the task can be configured to read from an existing file or
  from `/dev/null`.
- **stdout** and **stderr** -- the task output can be individually redirected
  to a file or discarded. If these output options are specified, then it is
  possible to upload the output files with the results by copying them into
  the result directory.
- **limits** -- the task has time and memory limits; if these limits are
  exceeded, the task fails.
The task results (exit code, time and memory consumption, etc.) are saved into
the result YAML file and sent back to the frontend application, to the address
which was specified at the initiation of the job evaluation.
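A hypothetical external task entry might therefore look like the following
sketch (the identifiers `run_01`, `group1` and the file names are made up, and
the `time` limit key is an assumption; the directive names follow the
configuration items described below):

```yaml
# Hypothetical fragment: run a compiled solution inside the sandbox
# with redirected stdio and explicit limits.
- task-id: "run_01"
  test-id: "test_01"
  cmd:
    bin: "${EVAL_DIR}/solution"
  sandbox:
    name: "isolate"
    stdin: "${EVAL_DIR}/01.in"
    stdout: "${EVAL_DIR}/01.out"
    limits:
      - hw-group-id: "group1"
        time: 2        # seconds (assumed limit name)
        memory: 65536  # kilobytes
```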

## Judges

Judges are treated as normal external commands, so there is no special task
type for them. The binaries are installed alongside the worker executable in
standard directories (on both Linux and Windows systems).
Judges should be used for comparison of output files from execution tasks with
sample outputs fetched from the fileserver. The result of this comparison
should be at least the information whether the files are the same or not. An
extension of this is a percentual result based on the correctness of the given
files. All judge results have to be printed to standard output.
All packed judges are adopted from the old CodEx with only very small
modifications. The ReCodEx judges base directory is in the `${JUDGES_DIR}`
variable, which can be used in the job config file.
- **recodex-judge-normal** is a base judge used by most of the exercises. This
  judge compares two text files. It compares only text tokens, regardless of
  the amount of whitespace between them.
  ```
  Usage: recodex-judge-normal [-r | -n | -rn] <file1> <file2>
  ```
  - `file1` and `file2` are paths to the files that will be compared
  - switch options `-r` and `-n` can be specified as optional arguments.
    - `-n` judge will treat newlines as ordinary whitespace (it will ignore
      line breaks)
    - `-r` judge will treat tokens as real numbers and compare them
      accordingly (with some tolerance)
- **recodex-judge-filter** can be used for preprocessing output files before
  the real judging. This judge filters C-like comments from a text file. A
  comment starts with a double slash sequence (`//`) and finishes with a
  newline. If the comment takes a whole line, then the whole line is filtered.
  ```
  Usage: recodex-judge-filter [inputFile [outputFile]]
  ```
  - if `outputFile` is omitted, standard output is used instead.
  - if both files are omitted, the application uses standard input and output.
- **recodex-judge-shuffle** is for judging results with the semantics of a
  set, where ordering is not important. Two files are compared with no regard
  for whitespace (whitespace acts just like a token delimiter).
  ```
  Usage: recodex-judge-shuffle [-[n][i][r]] <file1> <file2>
  ```

## Configuration items

- **submission** -- general information about this particular submission
  - **job-id** -- textual ID which should be unique in the whole ReCodEx
  - _file-collector_ -- URL address from which the fetch tasks will download
    data (the API will fill this in)
  - _log_ -- defaults to false, can be omitted; determines whether the job
    execution will be logged into one shared log file
  - **hw-groups** -- list of hardware groups for which limits are specified in
    this configuration; multiple values are separated by the `|` symbol
- **tasks** -- list (not map) of individual tasks
  - **task-id** -- unique identifier of the task in the scope of one
    submission
  - _priority_ -- a higher number means a higher priority; defaults to 1
  - _fatal-failure_ -- if true, then execution of the whole job will be
    stopped after this task fails; defaults to false
  - _dependencies_ -- list of dependencies which have to be fulfilled before
    this task; can be omitted if there are no dependencies; a YAML list of
    values
  - **cmd** -- description of the command which will be executed
    - **bin** -- the binary itself (absolute path of an external command or
      name of an internal task; job variables can be used)
    - _args_ -- list of arguments which will be sent to the execution unit
  - _test-id_ -- ID of the test this task is part of -- must be specified for
    tasks on which the particular test's result depends
    defined task is automatically external
    - **name** -- name of the used sandbox
    - _stdin_ -- file to which the standard input will be redirected; can be
      omitted; job variables can be used, usually `${EVAL_DIR}`
    - _stdout_ -- file to which the standard output will be redirected; can be
      omitted; job variables can be used, usually `${EVAL_DIR}`
    - _stderr_ -- file to which the error output will be redirected; can be
      omitted; job variables can be used, usually `${EVAL_DIR}`
    - _limits_ -- list of limits which can be passed to the sandbox; can be
      omitted, in which case the defaults will be used
      - **hw-group-id** -- determines specific limits for specific
      - _memory_ -- overall memory limit for the application in kilobytes
      - _parallel_ -- integral number of processes which can run
        simultaneously; time and memory limits are merged from all potential
        processes/threads; 0 for unlimited
      - _disk-size_ -- size of all IO operations from/to files in kilobytes
      - _disk-files_ -- number of files which can be opened
      - _bound-directories_ -- list of structures representing directories
        which will be visible inside the sandbox, a union with the default
        worker configuration. Contains 3 suboptions: **src** -- source
        pointing to an actual system directory (absolute path), **dst** --
        destination inside the sandbox which can have its own filesystem
        binding (absolute path inside the sandboxed directory structure) and
        **mode** -- determines the connection mode of the specified directory,
        one of the values: RW (allow read-write access), NOEXEC (disallow
        execution of binaries), FS (mount a device-less filesystem like
        `/proc`), MAYBE

## Results

Results of tasks are sent back as a YAML file in a compressed results archive.
This archive can contain additional files, such as job logging information and
files which were explicitly copied into the results directory. The results
file contains the job identification and the results of the individual tasks.

### Results items

- **job-id** -- identification of the job
- **hw-group** -- hardware group identifier of the worker which performed the
  evaluation
- _error_message_ -- present only if the whole execution failed and none of
  the tasks were executed
- **results** -- list of task results
  - **task-id** -- unique identification of a task in the scope of this job
  - **status** -- three states: OK (execution of the task was successful; the
    sandboxed program could have been killed, but the sandbox exited
    normally), FAILED (error while executing the task), SKIPPED (execution of
    the task was skipped)
  - _error_message_ -- defined only in internal tasks on failure
  - _sandbox_results_ -- if defined, then this task was external and was run
    in the sandbox
    - **exitcode** -- exit code integer
    - **time** -- time in which the program exited, in seconds
    - **wall-time** -- wall time in seconds
    - **memory** -- how much memory the program used, in kilobytes
    - _max-rss_ -- maximum resident set size used, in kilobytes (see the
      manual page of isolate)
    - **status** -- two-letter status code: OK (success), RE (runtime error),
      SG (program died on a signal), TO (timed out), XX (internal error of the
      sandbox)
    - _exitsig_ -- description of the exit signal
    - **killed** -- boolean determining whether the program exited correctly
      or was killed
    - _message_ -- status message on failure

### Results example
