diff --git a/Assignments-overview.md b/Assignments-overview.md index 89543c9..48c0ebc 100644 --- a/Assignments-overview.md +++ b/Assignments-overview.md @@ -4,10 +4,10 @@ Assignments are programming tasks that can be tested by a worker after a user submits their solution. An assignment is described by a YAML file that contains information on how to build, run and test it. -### Terminology +## Terminology Following text requires knowledge of basic terminology used by ReCodEx. Please, check [separate page](Terminology). -### Basics +## Basics Job is a set/list of tasks (it is generally a set, but order of tasks have some meaning). These tasks may have dependencies (arbitrary number), which needs to be observed. When recodex-worker processes job, it creates a task graph, where tasks are vertices and dependencies are edges (A -> B means that the task A is on the dependency list of task B) and creates its linear ordering. The graph must be acyclic (otherwise linear ordering will not exist) and the recodex-worker attempts to execute maximal number of tasks possible. Tasks without dependencies can be executed directly, other tasks are executed when all their dependencies have been successfully completed. Tasks are executed sequentially -- by the linear ordering of the task graph. Parallel tasks (tasks, which are not directly dependent and thus their linear ordering may be arbitrary) are ordered first by their priority (higher number => higher priority) and second by their order in the configuration file. Priority is important for specifying evaluation flow. See sample picture for better understanding. @@ -16,7 +16,7 @@ Tasks are executed sequentially -- by the linear ordering of the task graph. Par Each task has a unique ID (alphanum string like _CompileA_, _RunAA_, or _JudgeAB_ in the picture). These IDs are used to identify tasks (for dependency references, in the log, ...). Numbers in bottom right corner are priorities of each task. Higher number is greater priority. 
It means, that if task _RunAA_ is done, next must be _JudgeAA_ and not _RunAB_ (that will be also valid linear ordering, but _RunAB_ has lower priority). -### Task +## Task Task is an atomic piece of work executed by recodex-worker. There are two basic types of tasks: - **Execute external process** (optionally inside Isolate). Linux default is mandatory usage of isolate, this option is here because of Windows, where is currently no sandbox available. - **Perform internal operation**. External processes are meant for compilation, testing, or execution of external judges. Internal operations comprise commands, which are typically related to file/directory maintenance and other evaluation management stuff. Few important examples: @@ -26,7 +26,38 @@ Task is an atomic piece of work executed by recodex-worker. There are two basic Even though the internal operations may be handled by external executables (`mv`, `tar`, `pkzip`, `wget`, ...), it might be better to keep them inside the recodex-worker as it would simplify these operations and their portability among platforms. Furthermore, it is quite easy to implement them using common libraries (e.g., _zlib_, _curl_). -**External Tasks** +### Internal tasks + +**Archivate task** can be used to pack and compress a directory. The calling command is `archivate`. It requires two arguments: + +- path and name of the directory to be archived +- path and name of the target archive. Only the `.zip` format is supported. + +**Extract task** is the opposite of the archivate task. It can extract different types of archives. The supported formats are those supported by the `libarchive` library (see [libarchive wiki](https://github.com/libarchive/libarchive/wiki)), mainly `zip`, `tar`, `tar.gz`, `tar.bz2` and `7zip`. Please note that the system administrator may not have installed all the needed packages, so some formats may not work. Please consult your system administrator for more information. Archives may contain only regular files or directories (i.e. 
no symlinks, block and character devices, sockets or pipes allowed). The calling command is `extract` and it requires two arguments: + +- path and name of the archive to extract +- directory where the archive will be extracted + +**Fetch task** will give you a file. It can be downloaded from a remote file server or just copied from the local cache if available. The calling command is `fetch` with two arguments: + +- name of the requested file without path +- path and name of the destination. Providing a different destination name is an easy way to rename the file. + +**Copy task** can copy files and directories. Detailed info can be found on the reference page of [boost::filesystem::copy](http://www.boost.org/doc/libs/1_60_0/libs/filesystem/doc/reference.html#copy). The calling command is `cp` and it requires two arguments: + +- path and name of source target +- path and name of destination target + +**Make directory task** can create an arbitrary number of directories. The calling command is `mkdir` and it requires at least one argument. For each provided argument, [boost::filesystem::create_directories](http://www.boost.org/doc/libs/1_60_0/libs/filesystem/doc/reference.html#create_directories) is called. + +**Rename task** will rename files and directories. Detailed behavior can be found on the reference page of [boost::filesystem::rename](http://www.boost.org/doc/libs/1_60_0/libs/filesystem/doc/reference.html#rename). The calling command is `rename` and it requires two arguments: + +- path and name of source target +- path and name of destination target + +**Remove task** is for deleting files and directories. The calling command is `rm` and it requires at least one argument. For each provided argument, [boost::filesystem::remove_all](http://www.boost.org/doc/libs/1_60_0/libs/filesystem/doc/reference.html#remove_all) is called. + +### External tasks These tasks are typically executed in isolate (with given parameters) and the `recodex-worker` waits until they finish. 
The exit code determines, whether the task succeeded (0) or failed (anything else). A task may be marked as essential; in such case, failure will immediately cause termination of the whole job. - **stdin** - can be configured to read from existing file or from `/dev/null`. @@ -35,24 +66,57 @@ These tasks are typically executed in isolate (with given parameters) and the `r The task results (exit code, time, and memory consumption, etc.) are saved into result yaml file and sent back to frontend application to address which was specified on input. -### Directories and Files -For each job execution unique directory structure is created. Job is not restricted to specified directories (tasks can do whatever is allowed on system), but it is advised to use them inside job. In recodex-worker configuration one can specify worker default directory, this is base of every file which is produced by recodex-worker. +### Judges -Inside this directory temporary files for job execution are created: -- **${DEFAULT}/downloads/${WORKER_ID}/${JOB_ID}** - where the downloaded archive is saved -- **${DEFAULT}/submission/${WORKER_ID}/${JOB_ID}** - decompressed submission is stored here -- **${DEFAULT}/eval/${WORKER_ID}/${JOB_ID}** - this directory is accessible in job configuration using variables and all execution should happen here -- **${DEFAULT}/temp/${WORKER_ID}/${JOB_ID}** - directory where all sort of temporary files can be stored -- **${DEFAULT}/results/${WORKER_ID}/${JOB_ID}** - again accessible directory from job configuration which is used to store all files which will be upload on fileserver, usually there will be only yaml result file and optionally log, every other file has to be copied here explicitly from job +Judges are treated as normal external commands, so there is no special task for them. They should be used to compare the output files of execution tasks with sample outputs. The result of this comparison should at least indicate whether the files are the same or not. 
An extension of this is a percentage result based on the similarity of the given files. + +All bundled judges are adopted from old Codex with only very small modifications. The ReCodEx judges base directory is stored in the `${JUDGES_DIR}` variable, which can be used in the job config file. + +#### Judges interface -### Configuration -Configuration of the job which is passed to worker is generated from two parts: -- **template** - Common template for similar kinds of tasks. Contains allmost all instructions - when fetch, move, rename files, run commands, judges, ..., task dependencies and priorities. This template can be shared by more problem assignments or every problem (probably in compiller class) can have different one. -- **recodex-worker config** - includes data for instancioning the template, e.q. input file names, ... -Final configuration for worker is computer generated from those two configs. -Job configuration consist of some general information and then from list of tasks (one or more) +For future extensibility it is **critical** that judges share a common **interface** for calling and return values. +- Parameters: There are two mandatory positional parameters, which have to be the files to compare +- Results: + - _everything OK_ + - exitcode: 0 + - stdout: one line with a double value, which should be the percentage of similarity of the two given files + - _error during execution_ + - exitcode: 1 + - stderr: there should be a description of the error -#### Configuration items +#### ReCodEx judges + +Below is a list of judges which are bundled with the ReCodEx project and comply with the above requirements. + +**recodex-judge-normal** is the base judge used by most exercises. This judge compares two text files. It compares only text tokens, regardless of the amount of whitespace between them. +``` +Usage: recodex-judge-normal [-r | -n | -rn] <file1> <file2> +``` +- file1 and file2 are paths to files that will be compared +- switch options `-r` and `-n` can be specified as an optional first argument. 
+ - `-n` judge will treat newlines as ordinary whitespace (it will ignore line breaks) + - `-r` judge will treat tokens as real numbers and compare them accordingly (with some amount of error) + +**recodex-judge-filter** can be used to preprocess output files before real judging. This judge filters C-like comments from a text file. The comment starts with a double slash sequence (`//`) and ends with a newline. If the comment takes a whole line, then the whole line is filtered. +``` +Usage: recodex-judge-filter [inputFile [outputFile]] +``` +- if `outputFile` is omitted, std. output is used instead. +- if both files are omitted, the application uses std. input and output. + +**recodex-judge-shuffle** is for judging shuffled files. This judge compares two text files and returns 0 if they match (and 1 otherwise). The two files are compared with no regard for whitespace (whitespace acts just as a token delimiter). +``` +Usage: recodex-judge-shuffle [-[n][i][r]] +``` +- `-n` ignore newlines (newline is considered only a whitespace) +- `-i` ignore items order on the row (tokens on each row may be permuted) +- `-r` ignore order of rows (rows may be permuted); this option has no effect when `-n` is used + + +## Job configuration +The configuration of the job which is passed to the worker is generated on demand by the web API. Each job has a unique one. + +### Configuration items Mandatory items are bold, optional italic. - **submission** - information about this particular submission - **job-id** - textual ID which should be unique in whole recodex @@ -91,7 +155,7 @@ Mandatory items are bold, optional italic. - **dst** - destination inside sandbox which can have its own filesystem binding - **mode** - determines connection mode of specified directory, one of values: RW, NOEXEC, FS, MAYBE, DEV -#### Configuration example +### Configuration example This configuration example is written in YAML and serves only for demostration purposes. 
Therefore it is not working example which can be used in real traffic. Some items can be omitted and defaults will be used. ```{.yml} @@ -193,7 +257,7 @@ tasks: ... ``` -### Job variables +## Job variables Because frontend does not know which worker gets the job, its necessary to be a little general in configuration file. This means that some worker specific things has to be transparent. Good example of this is directories, which can be placed whenever worker wants. In case of this variables were established. There are of course some restrictions where variables can be used. Basically whenever filesystem paths can be used, variables can be used. Usage of variables in configuration is then simple and kind of shell-like. Name of variable is put inside braces which are preceded with dollar sign. Real usage is than something like this: ${VAR}. There should be no quotes or apostrophies around variable name, just simple text in braces. Parsing is simple and whenever there is dollar sign with braces job execution unit automatically assumes that this is a variable, so there is no chance to have this kind of substring. @@ -207,6 +271,16 @@ List of usable variables in job configuration: - **TEMP_DIR** - general temp directory which is not dependent on operating system - **JUDGES_DIR** - directory in which judges are stored (outside sandbox) +## Directories and Files +For each job execution a unique directory structure is created. A job is not restricted to the specified directories (tasks can do whatever is allowed on the system), but it is advised to use them inside the job. The DEFAULT variable represents the worker's working directory, as specified in each worker's configuration. No variable of this name is defined for use in the job YAML configuration. 
+ +Inside this directory temporary files for job execution are created: +- **${DEFAULT}/downloads/${WORKER_ID}/${JOB_ID}** - where the downloaded archive is saved +- **${DEFAULT}/submission/${WORKER_ID}/${JOB_ID}** - decompressed submission is stored here +- **${DEFAULT}/eval/${WORKER_ID}/${JOB_ID}** - this directory is accessible in job configuration using variables and all execution should happen here +- **${DEFAULT}/temp/${WORKER_ID}/${JOB_ID}** - directory where all sorts of temporary files can be stored +- **${DEFAULT}/results/${WORKER_ID}/${JOB_ID}** - again a directory accessible from the job configuration, used to store all files which will be uploaded to the fileserver; usually there will be only the YAML result file and optionally a log, and every other file has to be copied here explicitly from the job + ## Results Results of tasks are sent back in YAML format compressed into archive. This archive can contain further files, such as job logging information and files which were explicitly copied into results directory. Results file contains job identification and results of individual tasks.
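
The ordering rule from the Basics section (a task runs only after all its dependencies; among tasks that are ready, the higher priority wins, then the earlier position in the configuration file) can be sketched as a priority-aware topological sort. This is only an illustration in Python, not the recodex-worker implementation: the function name `linear_order`, the dictionary layout, and the priority values are assumptions made up for this example, and only the task IDs are taken from the text.

```python
import heapq


def linear_order(tasks):
    """Return task IDs in execution order.

    `tasks` is a list of dicts with keys 'id', 'priority' and 'deps',
    given in configuration-file order (hypothetical layout).
    """
    index = {t["id"]: i for i, t in enumerate(tasks)}
    remaining = {t["id"]: len(t["deps"]) for t in tasks}
    dependents = {t["id"]: [] for t in tasks}
    for t in tasks:
        for dep in t["deps"]:
            dependents[dep].append(t["id"])

    # Ready tasks are keyed by (-priority, config index): a higher
    # priority comes out first, ties are broken by configuration order.
    ready = [(-t["priority"], index[t["id"]], t["id"])
             for t in tasks if not t["deps"]]
    heapq.heapify(ready)

    order = []
    while ready:
        _, _, task_id = heapq.heappop(ready)
        order.append(task_id)
        for nxt in dependents[task_id]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                heapq.heappush(
                    ready,
                    (-tasks[index[nxt]]["priority"], index[nxt], nxt),
                )

    # If some tasks were never released, the graph has a cycle and no
    # linear ordering exists.
    if len(order) != len(tasks):
        raise ValueError("task graph contains a cycle")
    return order


# Priorities below are invented for the example.
EXAMPLE_TASKS = [
    {"id": "CompileA", "priority": 3, "deps": []},
    {"id": "RunAA", "priority": 2, "deps": ["CompileA"]},
    {"id": "RunAB", "priority": 1, "deps": ["CompileA"]},
    {"id": "JudgeAA", "priority": 2, "deps": ["RunAA"]},
    {"id": "JudgeAB", "priority": 1, "deps": ["RunAB"]},
]

print(linear_order(EXAMPLE_TASKS))
```

With these assumed priorities, _JudgeAA_ is scheduled right after _RunAA_ and before _RunAB_, which matches the behavior described for the sample picture.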