# Exercise Configuration
ReCodEx has two kinds of exercise configuration: High Level Configuration (HiLC) and Low Level Configuration (LoLC). LoLC is used in the backend, for instance by workers, and should be general enough to express all kinds of worker tasks. HiLC, on the other hand, should be simple enough to be written or composed by ordinary application users, preferably in a graphical editor. Either way, the configuration has to be stored somewhere, and this document describes how.
HiLC is divided into several parts, each of which takes care of a different concern: **ExerciseConfig**, **Pipelines**, **Limits** and **EnvironmentConfig**. The exercise configuration is composed from these components, and a new LoLC is compiled from it on every submission.
## Compilation
Compilation is the process which generates a [[Job Configuration]] from the _Exercise Configuration_ described below.
### Steps
TODO
### Priorities of Tasks
* Compilation - 100
* Execution - 90
* Judge - 80
* {default} - 42
* Dump Results - 1
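
The priority table above can be read as a lookup with a fallback. The following is a minimal sketch only: the dictionary keys and function names are hypothetical, and it assumes that a larger number means the task is scheduled earlier, which this page does not state explicitly.

```python
# Priorities as listed above; the keys are hypothetical names chosen for this sketch.
TASK_PRIORITIES = {
    "compilation": 100,
    "execution": 90,
    "judge": 80,
    "dump-results": 1,
}
DEFAULT_PRIORITY = 42  # the "{default}" entry


def task_priority(task_kind: str) -> int:
    """Return the priority of a task kind, falling back to the default."""
    return TASK_PRIORITIES.get(task_kind, DEFAULT_PRIORITY)


def order_by_priority(task_kinds):
    """Sort task kinds from highest to lowest priority (assumption: higher runs first)."""
    return sorted(task_kinds, key=task_priority, reverse=True)
```

With this reading, any task kind not listed in the table lands between Judge and Dump Results.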
## Variables and Ports
Variables and ports are used throughout the exercise configuration and the related structures. Each of them must have a type. There are six types, which should be sufficient for every use case:
* **string** - textual value
* **string[]** - array of strings
* **file** - corresponds to file created during evaluation of submission
* **file[]** - array of files
* **remote-file** - corresponds to external file which has to be downloaded during evaluation of submission
* **remote-file[]** - array of remote files
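
A variable can be checked against these six types with a few lines of code. This is a minimal sketch, not the API's actual validation: the function name is hypothetical, and it assumes that scalar values are represented as strings and array values as lists of strings, as in the examples on this page.

```python
SCALAR_TYPES = {"string", "file", "remote-file"}
ARRAY_TYPES = {scalar + "[]" for scalar in SCALAR_TYPES}


def is_valid_variable(variable: dict) -> bool:
    """Check that the type is supported and the value shape matches it."""
    var_type = variable.get("type")
    value = variable.get("value")
    if var_type in SCALAR_TYPES:
        # Assumption: scalar values are plain strings (text, file name, remote file id).
        return isinstance(value, str)
    if var_type in ARRAY_TYPES:
        return isinstance(value, list) and all(isinstance(item, str) for item in value)
    return False
```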
## ExerciseConfig
Represents the basic exercise configuration, which ties everything together. This configuration exists in two formats: one is stored in the database and the other is sent to the web application. Both formats are described below.
### Frontend Format
Returned as JSON. There is one generated "environment" called "default"; it holds the default pipelines for all tests.
Mandatory items are bold, optional italic, description of items follows:
* **${list of environments}** - root element is list of environments
    * **name** - identifier of the environment from database
    * **tests** - list of tests
        * **name** - identifier of the test which serves as unique identifier
        * **pipelines** - list of pipelines contained in test
            * **name** - identifier of pipeline database entity
            * _variables_ - list of variables for this pipeline
                * **name** - unique identifier of variable
                * **type** - one of the supported types
                * **value** - either a single scalar value or an array if the variable is of an array type
Example:
```
[
  {
    "name":"default",
    "tests":[
      {
        "name":"Test 1",
        "pipelines":[
          {
            "name":"pipeline1",
            "variables":[
              {
                "name":"varA",
                "type":"string",
                "value":"valA"
              }
            ]
          }
        ]
      },
      {
        "name":"Test 2",
        "pipelines":[
          {
            "name":"pipeline2",
            "variables":[
              {
                "name":"varB",
                "type":"file",
                "value":"valB"
              }
            ]
          }
        ]
      }
    ]
  },
  {
    "name":"java8",
    "tests":[
      {
        "name":"Test 1",
        "pipelines":[
          {
            "name":"pipelineJava",
            "variables":[
              {
                "name":"varJava",
                "type":"string",
                "value":"valJava"
              }
            ]
          }
        ]
      },
      {
        "name":"Test 2",
        "pipelines":[
          {
            "name":"pipeline2",
            "variables":[
              {
                "name":"varB",
                "type":"file",
                "value":"valB"
              }
            ]
          }
        ]
      }
    ]
  },
  {
    "name":"cpp11",
    "tests":[
      {
        "name":"Test 1",
        "pipelines":[
          {
            "name":"pipeline1",
            "variables":[
              {
                "name":"varA",
                "type":"string",
                "value":"valA"
              }
            ]
          }
        ]
      },
      {
        "name":"Test 2",
        "pipelines":[
          {
            "name":"pipeline2",
            "variables":[
              {
                "name":"varCpp",
                "type":"file",
                "value":"valCpp"
              }
            ]
          }
        ]
      }
    ]
  }
]
```
### Backend Format
The whole configuration consists of tests, each of which defines the pipelines it is composed of. There are default pipelines and also pipelines specific to runtime environments. If the pipeline definition for some environment is missing, the default pipelines of the corresponding test are used instead.
Stored as YAML.
Mandatory items are bold, optional italic, description of items follows:
* **environments** - list of environment identifiers which belong to the exercise
* **tests** - map of tests indexed by test unique identifier
    * **${test identification}** - test unique identifier from database
        * **environments** - map of environments which redefine the default pipelines of this test
            * **${environment identification}** - unique environment identifier from database
                * **pipelines** - list of redefined pipelines; if this list is empty, the default pipelines are used instead
                    * **name** - name of the pipeline to which the following variables belong
                    * **variables** - list of variables
                        * **name** - unique identifier of the variable
                        * **type** - one of the supported variable types
                        * **value** - either a single scalar value or an array if the variable is of an array type
Example:
```
environments:
  - java8
  - cpp11
tests:
  "Test 1":
    pipelines:
      - name: pipeline1
        variables:
          - name: varA
            type: string
            value: valA
    environments:
      java8:
        pipelines:
          - name: pipelineJava
            variables:
              - name: varJava
                type: string
                value: valJava
      cpp11: []
  "Test 2":
    pipelines:
      - name: pipeline2
        variables:
          - name: varB
            type: file
            value: valB
    environments:
      cpp11:
        pipelines:
          - name: pipeline2
            variables:
              - name: varCpp
                type: file
                value: valCpp
```
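
The fallback to default pipelines described in this section can be sketched as follows. This is an illustration only: `pipelines_for` is a hypothetical helper, and `config` is assumed to be the backend YAML already parsed into plain dictionaries and lists.

```python
def pipelines_for(config: dict, test_id: str, environment_id: str) -> list:
    """Return the pipelines of a test for one environment, falling back to defaults."""
    test = config["tests"][test_id]
    environment = (test.get("environments") or {}).get(environment_id)
    # An environment may be missing entirely or, like `cpp11: []` in the
    # example, carry no pipelines at all; both cases fall back to the defaults.
    redefined = environment.get("pipelines") if isinstance(environment, dict) else None
    return redefined if redefined else test["pipelines"]


# The "Test 1" entry of the example, as it would look after YAML parsing.
config = {
    "environments": ["java8", "cpp11"],
    "tests": {
        "Test 1": {
            "pipelines": [{"name": "pipeline1", "variables": [
                {"name": "varA", "type": "string", "value": "valA"}]}],
            "environments": {
                "java8": {"pipelines": [{"name": "pipelineJava", "variables": [
                    {"name": "varJava", "type": "string", "value": "valJava"}]}]},
                "cpp11": [],
            },
        },
    },
}

java_pipelines = pipelines_for(config, "Test 1", "java8")
cpp_pipelines = pipelines_for(config, "Test 1", "cpp11")
```

Here `java_pipelines` holds the redefined `pipelineJava`, while `cpp_pipelines` falls back to the default `pipeline1`. This matches the frontend example, where the `cpp11` environment shows `pipeline1` for "Test 1".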
## Pipeline
Pipelines are sent to clients in JSON format and are stored in the API as YAML with the same structure.
Important features:
* Every port must either reference a variable or be left blank. An actual value (for example a string) is not allowed in a port. If a variable name is declared in a port, it has to exist in the variables table.
* A connection between ports can be **one-to-one** or **one-to-many** from the perspective of the output port. That means one output port can feed its variable to two or more input ports. There is one exception: a variable may be used only in an input port, in which case its value has to be defined in the pipeline variables table.
* The variables table of a pipeline can contain **references** to external variables; these references can point to variables from the environment configuration or the exercise configuration. A variable is a reference if its value starts with the character **'$'**. A reference cannot be embedded inside a variable value (the textual value "hello $world", where world is meant to be a reference, is not allowed). If a variable value needs to start with a dollar sign, it has to be escaped with a backslash, so "\$1 million" is an actual value and not a reference.
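
The reference rule from the last item can be illustrated with a small classifier. It is a sketch under stated assumptions: `parse_variable_value` is a hypothetical name, and the returned tuples are a representation invented for this example, not something the API is known to use.

```python
def parse_variable_value(value: str):
    """Classify a variable value as a reference or a literal.

    Returns ("reference", name) or ("literal", text).
    """
    if value.startswith("$"):
        # The whole value is a reference to an external variable.
        return ("reference", value[1:])
    if value.startswith("\\$"):
        # An escaped dollar sign: drop the backslash and keep the literal text.
        return ("literal", value[1:])
    # A dollar sign anywhere else is not interpreted; no substitution happens.
    return ("literal", value)
```

For instance, `parse_variable_value("$source_file")` yields a reference to `source_file`, whereas `"hello $world"` stays a literal because references inside a value are not supported.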
### Boxes
* DataInBox and DataOutBox are special boxes which are treated differently from the others. This means that deleting them, or making breaking changes to them, may have unforeseen consequences. They are used for importing files into and exporting files from a pipeline. For importing a string or an array of strings, variable references have to be used. Inputs or outputs of a pipeline may be connected to another pipeline or to supervisor/student inputs.
* Data boxes always have to be used for importing files into or exporting files from pipelines. Variable references are not usable here, since references are only substitutions. For example, files uploaded by the supervisor (inputs and outputs) need input boxes in order to be properly downloaded from the fileserver during execution.
* Every box (except data boxes) is referenced directly only in BoxService, where it is created; everywhere else it is used through the abstract Box interface, which the concrete boxes inherit from. Thanks to this, adding new boxes is simple and straightforward.
### Configuration
Mandatory items are bold, optional italic, description of items follows:
* **variables** - list of variables for this pipeline
    * **name** - unique identifier of the variable
    * **type** - one of the supported variable types
    * **value** - either a single scalar value or an array if the variable is of an array type
* **boxes** - list of boxes which are defined in this pipeline
    * **name** - unique identifier of the box
    * **type** - one of the supported box types
    * **portsIn** - map of input ports
        * **${port identification}** - unique identifier of the port
            * **type** - one of the supported port types
            * **value** - reference to a variable which has to be defined in the pipeline variables table; the port also has to match it
    * **portsOut** - map of output ports
        * **${port identification}** - unique identifier of the port
            * **type** - one of the supported port types
            * **value** - reference to a variable which has to be defined in the pipeline variables table; the port also has to match it
Example:
```
{
  "variables":[
    {
      "name":"source_file",
      "type":"file",
      "value":"source.cpp"
    }
  ],
  "boxes": [
    {
      "name":"source",
      "portsIn":[],
      "portsOut":[{ "source_file":[{"type":"file", "value":"source_file"}] }],
      "type":"data"
    },
    {
      "name":"test",
      "portsIn":[],
      "portsOut":[{
        "test_file":[{"type":"file", "value":"test_file"}],
        "expected_output":[{"type":"file", "value":"expected_output"}]
      }],
      "type":"data"
    },
    {
      "name":"compilation",
      "portsIn":[{ "input_file":[{"type":"file", "value":"source_file"}] }],
      "portsOut":[{ "output_file":[{"type":"file", "value":"binary_file"}] }],
      "type":"compilation"
    },
    {
      "name":"run",
      "portsIn":[{ "binary_file":[{"type":"file", "value":"binary_file"}] }],
      "portsOut":[{ "output_file":[{"type":"file", "value":"actual_output"}] }],
      "type":"execution"
    },
    {
      "name":"judge",
      "portsIn":[{
        "actual_output":[{"type":"file", "value":"actual_output"}],
        "expected_output":[{"type":"file", "value":"expected_output"}]
      }],
      "portsOut":[{ "score":[{"type":"file", "value":"score"}] }],
      "type":"evaluation"
    }
  ]
}
```
## Limits
Limits apply to the whole test, which means that if a test has multiple execution tasks, all of them get the same limits. The limits for a test have to contain at least one time limit and a memory limit.
Mandatory items are bold, optional italic, description of items follows:
* **${test identification}** - identifier of test from database
    * _wall-time_ - elapsed real-time in seconds, defined as float
    * _cpu-time_ - elapsed cpu-time in seconds, defined as float
    * **memory** - maximal memory usage in kilobytes
    * _parallel_ - maximal number of threads/processes used
Example:
```
test-id-1:
  wall-time: 5
  cpu-time: 6.4
  memory: 50
  parallel: 500
test-id-2:
  wall-time: 6
  memory: 60
```
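
The rule that every test needs at least one time limit and a memory limit can be checked mechanically. The following is a minimal sketch, assuming the limits YAML has already been parsed into a dictionary; `validate_limits` and its error messages are hypothetical.

```python
def validate_limits(limits: dict) -> list:
    """Return a list of human-readable problems; an empty list means the limits are valid."""
    errors = []
    for test_id, test_limits in limits.items():
        # At least one of the two time limits has to be present.
        if "wall-time" not in test_limits and "cpu-time" not in test_limits:
            errors.append(f"{test_id}: wall-time or cpu-time is required")
        # The memory limit is mandatory.
        if "memory" not in test_limits:
            errors.append(f"{test_id}: memory is required")
    return errors


# The example above, as parsed data.
example_limits = {
    "test-id-1": {"wall-time": 5, "cpu-time": 6.4, "memory": 50, "parallel": 500},
    "test-id-2": {"wall-time": 6, "memory": 60},
}
problems = validate_limits(example_limits)
```

For the example above `problems` is empty, since both tests define a time limit and a memory limit.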
## ExerciseEnvironmentConfig
The configuration for particular environments is stored here. It exists in two formats: the one returned to the web-app and the one in which it is stored. Environment configuration is stored in individual database entities, but it is desirable to return it as a whole for the entire exercise. Hence there are two formats; both are described below.
Important features:
* A variable of type `file` or `file[]` in the environment config can contain **wildcards**. These wildcards are matched against the files submitted in the solution. For every wildcard variable there has to be at least one file which matches it.
* The variables table in the exercise environment config can contain **references** to variables which should be given when a solution is submitted. A variable is a reference if its value starts with the character **'$'**. A reference cannot be embedded inside a variable value (the textual value "hello $world", where world is meant to be a reference, is not allowed). If a variable value needs to start with a dollar sign, it has to be escaped with a backslash, so "\$1 million" is an actual value and not a reference.
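
Wildcard matching against submitted files could look roughly like this. The sketch assumes shell-style wildcards (`*`, `?`) as implemented by Python's `fnmatch` module; this page does not specify the exact wildcard syntax, and `match_wildcards` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def match_wildcards(patterns, submitted_files):
    """Map every wildcard pattern to the submitted files it matches.

    Raises ValueError if some pattern matches no file at all.
    """
    matches = {}
    for pattern in patterns:
        found = [name for name in submitted_files if fnmatchcase(name, pattern)]
        if not found:
            raise ValueError(f"no submitted file matches '{pattern}'")
        matches[pattern] = found
    return matches
```

Calling `match_wildcards(["*.cpp"], ["main.cpp", "util.h"])` pairs `*.cpp` with `main.cpp`, while a pattern such as `*.java` would be rejected for the same submission.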
### Frontend Format
Mandatory items are bold, optional italic, description of items follows:
* **${list of environments}** - root element is list of exercise environment configurations
    * **runtimeEnvironmentId** - identifier of the environment taken from database
    * **variablesTable** - list of variables
        * **name** - unique identifier of the variable
        * **type** - one of the supported variable types
        * **value** - either a single scalar value or an array if the variable is of an array type
Example:
```
[
  {
    "runtimeEnvironmentId":"CRuntime",
    "variablesTable":[
      {
        "name":"varA",
        "type":"string",
        "value":"valA"
      },
      {
        "name":"varB",
        "type":"file",
        "value":"valB"
      }
    ]
  },
  {
    "runtimeEnvironmentId":"JavaRuntime",
    "variablesTable":[
      {
        "name":"varA",
        "type":"file",
        "value":"javaA"
      },
      {
        "name":"varB",
        "type":"string",
        "value":"javaB"
      }
    ]
  }
]
```
### Backend Format
In the API, environment configurations are stored differently from how they are returned to the web-app. For every runtime environment there is an individual database entity which holds its environment configuration. Therefore only the variables table needs to be stored.
Mandatory items are bold, optional italic, description of items follows:
* **variablesTable** - list of variables
    * **name** - unique identifier of the variable
    * **type** - one of the supported variable types
    * **value** - either a single scalar value or an array if the variable is of an array type
Example:
```
variablesTable:
  - name: varName
    type: string
    value: varValue
  - name: source_file
    type: file
    value: source.cpp
```
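
The relation between the two formats can be sketched as a simple assembly step. This is an illustration only: `to_frontend_format` is a hypothetical function, and `stored_configs` is assumed to map each runtime environment identifier to its parsed backend configuration.

```python
def to_frontend_format(stored_configs: dict) -> list:
    """Combine per-environment stored configs into the single list returned to the web-app."""
    return [
        {
            "runtimeEnvironmentId": environment_id,
            # A stored entity holds only the variables table.
            "variablesTable": config.get("variablesTable", []),
        }
        for environment_id, config in stored_configs.items()
    ]


# Two stored entities, keyed by a runtime environment id (assumed layout).
stored = {
    "CRuntime": {"variablesTable": [
        {"name": "source_file", "type": "file", "value": "source.cpp"}]},
    "JavaRuntime": {"variablesTable": []},
}
frontend = to_frontend_format(stored)
```

The result has one entry per runtime environment, in the same shape as the frontend example in this section.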