# Overall Architecture
![Overall Architecture](https://github.com/ReCodEx/GlobalWiki/blob/master/images/Overall_Architecture.png)

**ReCodEx** is designed to be highly modular. The **WebApp** and the **File Server** together form one instance of the application. They contain almost all of the application logic, including _user management and authentication_, _storing and versioning tasks_, and _counting and assigning points_ to users. One instance of the application can be connected to one or more **Workers**, and one **Worker** can be connected to several instances of the **WebApp**. A **Worker** is connected to the **WebApp** through a messaging queue.

## Worker
![Worker Architecture](https://github.com/ReCodEx/GlobalWiki/blob/master/images/Worker_Architecture.png)

The **Worker's** main role is to securely _compile_, _run_ and _evaluate_ a given submission against model solutions provided by the author of each task. It is logically divided into three objects:
- **Message Frontend** communicates with the **WebApp** using the [ZeroMQ](http://zeromq.org/) messaging queue. It receives new submissions, drives the evaluation through the **Work API** and reports progress back.
- **Worker Core** performs all evaluation steps and is responsible for their security. The [Isolate](https://github.com/ioi/isolate) sandbox is used.
- **File Server Frontend** provides access, via the **File API**, to files on the **File Server**, which stores the test inputs and corresponding outputs for each task along with other required files. Uploading files is also possible.

### Default worker configuration
The worker should have a default configuration that is applied to the worker itself and may also be used in jobs (implicitly when a value is missing, or explicitly through special variables). This configuration should be hardcoded and can be overridden by an explicitly specified configuration file. The format is YAML, the same as in the job configuration.
```
---  # only one document with all configuration needed
job-collector:
    hostname: "localhost"
    port: 36587
file-manager:
    file-collector:
        hostname: "localhost"
        port: 80  # can be ignored in specific modules
        username: "654321"  # can be ignored in specific modules
        password: "123456"  # can be ignored in specific modules
    cache:  # only in case that there is cache module
        cache-dir: "/tmp/cache"
logger:
    file: "log.txt"
limits:
    time: 5  # seconds
    wall-time: 6  # seconds
    extra-time: 2  # seconds
    stack-size: 50000  # KB
    memory: 50000  # KB
    parallel: false  # time and memory limits are merged
    disk-usage: 5  # MB
sandboxes-wrap-limits:
    - name: "isolate"
      time: 10  # in seconds
    - name: "csharp"
      time: 20  # in seconds
...
```
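
The override behaviour described above could be sketched as a recursive merge of a user-supplied configuration over the hardcoded defaults. This is only an illustration, not the worker's actual implementation: the function name `merge_config`, the trimmed-down `DEFAULTS` dictionary and the override values are assumptions made for this example, and plain dictionaries stand in for parsed YAML documents.

```python
import copy

# Hypothetical hardcoded defaults (a subset of the configuration above).
DEFAULTS = {
    "job-collector": {"hostname": "localhost", "port": 36587},
    "logger": {"file": "log.txt"},
    "limits": {"time": 5, "wall-time": 6, "memory": 50000},
}


def merge_config(defaults, overrides):
    """Return a new dict in which values from `overrides` replace `defaults`.

    Nested mappings are merged key by key; any other value (including
    lists) is replaced as a whole.
    """
    result = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_config(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Values that would come from an explicitly declared configuration file
# (hypothetical example values).
user_config = {"limits": {"time": 10}, "logger": {"file": "worker.log"}}
config = merge_config(DEFAULTS, user_config)
print(config["limits"])
```

Keys missing from the override file keep their default values, which matches the "implicitly if something is missing" case; the defaults themselves are never modified.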

### Internal Worker architecture
![Internal Worker architecture](https://github.com/ReCodEx/GlobalWiki/blob/master/images/Worker_Internal_Architecture.png)

## File Server
![File Server Infrastructure](https://github.com/ReCodEx/GlobalWiki/blob/master/images/File_Server.png)

The **File Server** stores data that should be kept outside of the **WebApp's** 
database (both because storing files in a database is inefficient and because 
the workers need to access the files in the simplest possible way). It should 
meet the following requirements:
- store files without duplicates
- keep consistent state with main database
- serve files to workers on demand
- allow versioning of tasks, with the ability to revert to an earlier version

To meet these requirements, **Storage** and **Database** must be set up as described below.

### Storage
**Storage** means disk space with a commonly used filesystem. We'll use `ext4`, but other filesystems should work too. The **Storage** file structure is:
```
.
├── submits
│   └── user_id
│       └── advanced_dot_net_1
│           ├── bf216fa9274261628f4d952a103c6cfd1cbbc587
│           └── e6ae49bbfda4a8bb57aceeb64fb117990b226ca5
├── tasks
│   ├── a
│   │   ├── a014ed2abb56371bfaf2b4298a85d5dfb56509ed
│   │   └── a5edbd8b12e670ed1e3110d6c0524000cd4c3c7a
│   └── b
│       └── b1696358b8540923eb79b68f95c0f94c13a83fa7
└── temp
    └── 1795184136b8bdddabe50453cc2cc2d46f0f7c5e
```
- **submits** holds the files submitted by users to ReCodEx. 
  The subdirectories _user_id_ and _advanced_dot_net_1_ group 
  submissions by user and by the course they belong to. This structure is easy 
  to maintain as users are added and deleted.
- **tasks** contains all files for all existing tasks in ReCodEx. To avoid too 
  many files in one directory, files are split into subfolders by the first 
  character of their name.
- **temp** is dedicated to temporarily storing outputs of programs at teachers' request. This directory will be erased by a cron job on a daily basis.
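
The layout above can be illustrated with a few path helpers. This is a sketch under assumptions: it takes the 40-character names in the tree to be SHA-1 digests of the file content, and the function names are hypothetical, chosen only for this example.

```python
import hashlib
from pathlib import PurePosixPath


def content_hash(data: bytes) -> str:
    """Return the SHA-1 hex digest used as the stored file name."""
    return hashlib.sha1(data).hexdigest()


def task_file_path(file_hash: str) -> PurePosixPath:
    """Place a task file in a subfolder named after the first hash character."""
    return PurePosixPath("tasks") / file_hash[0] / file_hash


def submit_file_path(user_id: str, course: str, file_hash: str) -> PurePosixPath:
    """Group submitted files by user and then by course."""
    return PurePosixPath("submits") / user_id / course / file_hash


digest = content_hash(b"example test input\n")
print(task_file_path(digest))
print(submit_file_path("user_id", "advanced_dot_net_1", digest))
```

Because the name depends only on the content, storing the same file twice yields the same path, so duplicates never take extra space.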

### Database
For user-friendly access to tasks and for modifying them, the following information should be stored in the database:
- a list of tasks with their newest version number
- for every task and version, a list of the files used (their hashed names)
- for every hashed name, one human-readable filename
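
One possible shape for these three pieces of information is sketched below using an in-memory SQLite database. The table names, column names, the task `sorting` and the file names are all hypothetical; the source does not prescribe a schema or a database engine.

```python
import sqlite3

# Hypothetical schema: one table per item in the list above.
SCHEMA = """
CREATE TABLE tasks (
    task_id        TEXT PRIMARY KEY,
    newest_version INTEGER NOT NULL
);
CREATE TABLE task_files (
    task_id   TEXT NOT NULL REFERENCES tasks(task_id),
    version   INTEGER NOT NULL,
    file_hash TEXT NOT NULL,
    PRIMARY KEY (task_id, version, file_hash)
);
CREATE TABLE file_names (
    file_hash TEXT PRIMARY KEY,
    name      TEXT NOT NULL
);
"""


def files_for(db, task_id, version=None):
    """List (name, hash) pairs for a task version; default is the newest."""
    if version is None:
        row = db.execute(
            "SELECT newest_version FROM tasks WHERE task_id = ?", (task_id,)
        ).fetchone()
        version = row[0]
    return db.execute(
        "SELECT n.name, f.file_hash FROM task_files f "
        "JOIN file_names n ON n.file_hash = f.file_hash "
        "WHERE f.task_id = ? AND f.version = ? ORDER BY n.name",
        (task_id, version),
    ).fetchall()


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO tasks VALUES ('sorting', 2)")
db.executemany("INSERT INTO file_names VALUES (?, ?)", [
    ("a014ed2abb56371bfaf2b4298a85d5dfb56509ed", "input.txt"),
    ("a5edbd8b12e670ed1e3110d6c0524000cd4c3c7a", "output_v1.txt"),
    ("b1696358b8540923eb79b68f95c0f94c13a83fa7", "output_v2.txt"),
])
db.executemany("INSERT INTO task_files VALUES ('sorting', ?, ?)", [
    (1, "a014ed2abb56371bfaf2b4298a85d5dfb56509ed"),
    (1, "a5edbd8b12e670ed1e3110d6c0524000cd4c3c7a"),
    (2, "a014ed2abb56371bfaf2b4298a85d5dfb56509ed"),  # unchanged file is shared
    (2, "b1696358b8540923eb79b68f95c0f94c13a83fa7"),
])

print(files_for(db, "sorting"))     # newest version
print(files_for(db, "sorting", 1))  # an earlier version is still available
```

In this sketch, an older version stays readable because its file list is never overwritten, and a file that did not change between versions is referenced twice but stored once.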

### Conclusion
Files are internally stored under their SHA-1 hashes (as produced by 
`sha1sum`), so it's easy to implement versioning and to avoid storing duplicate 
content (multiple files can have the same content, which is stored only once). 
The **Worker** also refers to files by their hashes, which makes local caching 
straightforward, with no need to track the version number of a given file. The 
**Database**, on the other hand, stores the human-readable names, so that files 
are presented in a friendly way to users (teachers) in the **WebApp**.

## Frontend - broker communication

The communication between the frontend and the workers is mediated by a broker 
that passes jobs to workers capable of processing them. 

### Assignment evaluation request

The frontend must send a multipart message that contains the following frames:

- The `eval` command
- The job ID (ASCII or network byte order - to be specified)
- A frame for each header (e.g. `hwgroup=group_1`)
- A hash code of the assignment's configuration file
- Hash codes of files submitted by the user, each in a separate frame

If the broker is capable of routing the request to a worker, it responds with 
`ack`. Otherwise (for example when the requirements specified by the headers 
cannot be met), it responds with `nack`.
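
The frame order above could be assembled as shown in this sketch. Several details are assumptions, since the protocol is not yet fully specified: the job ID is encoded as ASCII (one of the two options mentioned), headers use the `key=value` form from the example, the reply is taken to be a single-command message, and the function names and the job ID `job_42` are hypothetical. The frames are plain `bytes` objects; with the pyzmq bindings such a list could be passed to `socket.send_multipart(frames)`.

```python
def build_eval_request(job_id, headers, config_hash, file_hashes):
    """Assemble the frames of an `eval` request in the order listed above."""
    frames = [b"eval", job_id.encode("ascii")]
    # One frame per header, formatted as key=value (e.g. hwgroup=group_1).
    frames += [f"{key}={value}".encode("ascii") for key, value in headers.items()]
    frames.append(config_hash.encode("ascii"))
    # Each submitted file hash travels in its own frame.
    frames += [file_hash.encode("ascii") for file_hash in file_hashes]
    return frames


def is_accepted(reply_frames):
    """Interpret the broker's reply: True for `ack`, False for `nack`."""
    command = reply_frames[0]
    if command == b"ack":
        return True
    if command == b"nack":
        return False
    raise ValueError(f"unexpected reply: {command!r}")


request = build_eval_request(
    job_id="job_42",  # hypothetical job ID
    headers={"hwgroup": "group_1"},
    config_hash="a014ed2abb56371bfaf2b4298a85d5dfb56509ed",
    file_hashes=["bf216fa9274261628f4d952a103c6cfd1cbbc587"],
)
print(request)
```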

Note that we will need to store the job ID and the assignment configuration 
somewhere close to the submitted files so it's possible to check how a 
submission was evaluated. The job ID will likely be a part of the submission's 
path. The configuration could be linked there under some well-known name.

### Notifying the frontend about evaluation progress

The script that requested the evaluation will have exited by the time a worker 
processes the request. This issue remains to be resolved.

## Broker - worker communication

When a worker is started, it registers itself with the broker by sending the 
`init` command followed by headers that describe its capabilities (such as the 
number of threads it can run simultaneously, its hardware group, the languages 
it can work with, and so on).
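
Registration and the broker's matching of requests to workers might look roughly like the sketch below. It assumes that `init` headers use the same `key=value` form as the request headers and that a worker is suitable when every requested header matches exactly; neither is stated in the source, and the function names, worker IDs and header keys other than `hwgroup` are hypothetical.

```python
def build_init_message(capabilities):
    """Frames a worker sends on startup: `init` followed by its headers."""
    return [b"init"] + [
        f"{key}={value}".encode("ascii") for key, value in capabilities.items()
    ]


def parse_headers(frames):
    """Turn key=value frames back into a dictionary."""
    return dict(frame.decode("ascii").split("=", 1) for frame in frames)


def find_worker(workers, required):
    """Return the ID of the first worker matching all required headers.

    Returns None when no registered worker qualifies, in which case the
    broker would answer the frontend with `nack`.
    """
    for worker_id, capabilities in workers.items():
        if all(capabilities.get(key) == value for key, value in required.items()):
            return worker_id
    return None


# The broker keeps the capabilities announced by each worker.
workers = {}
message = build_init_message({"hwgroup": "group_1", "threads": "4"})
workers["worker_a"] = parse_headers(message[1:])
message = build_init_message({"hwgroup": "group_2", "threads": "2"})
workers["worker_b"] = parse_headers(message[1:])

print(find_worker(workers, {"hwgroup": "group_2"}))
print(find_worker(workers, {"hwgroup": "group_9"}))
```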

Whenever the broker receives an assignment suitable for the worker, it just 
forwards the evaluation request message it originally received from the 
frontend. The worker has to:

- Download the assignment configuration file
- Download any supplementary files based on the configuration file, such as test 
  inputs or helper programs
- Download the source code of the student's submission
- Evaluate the submission according to the assignment's configuration
- Upload the results of the evaluation to the file server

Thanks to this message structure, it's possible to cache the configuration file 
and only download the student's submissions when the same assignment is 
evaluated repeatedly for different students (a common case for homework and 
classroom assignments).
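
The caching idea can be sketched with a small cache keyed by file hash. This is an in-memory illustration only: a real worker would more likely keep files on disk, the class name `HashCache` is hypothetical, and a dictionary stands in for the file server.

```python
class HashCache:
    """Fetch each file at most once, keyed by its content hash."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that downloads a file by hash
        self._store = {}

    def get(self, file_hash):
        if file_hash not in self._store:
            self._store[file_hash] = self._fetch(file_hash)
        return self._store[file_hash]


# Stand-in for the file server, with a log of what was downloaded.
remote_files = {
    "a014ed2abb56371bfaf2b4298a85d5dfb56509ed": b"assignment configuration",
    "bf216fa9274261628f4d952a103c6cfd1cbbc587": b"submission of student 1",
    "e6ae49bbfda4a8bb57aceeb64fb117990b226ca5": b"submission of student 2",
}
downloads = []


def download(file_hash):
    downloads.append(file_hash)
    return remote_files[file_hash]


cache = HashCache(download)
config_hash = "a014ed2abb56371bfaf2b4298a85d5dfb56509ed"
submissions = [
    "bf216fa9274261628f4d952a103c6cfd1cbbc587",
    "e6ae49bbfda4a8bb57aceeb64fb117990b226ca5",
]
# Two jobs for the same assignment but different students.
for submission_hash in submissions:
    cache.get(config_hash)
    cache.get(submission_hash)

print(len(downloads))
```

Since a hash identifies the content, a cached entry can never be stale, so no invalidation logic is needed in this sketch.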

After finishing the evaluation, the worker notifies the broker by 
sending:

- The `done` command
- The job ID

This allows the broker to reliably distribute messages - if a worker doesn't 
succeed in processing a request (it doesn't respond within a time limit), the 
request can be sent to another worker.
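
A broker could track outstanding jobs along the lines of the sketch below. The class name `JobTracker`, its methods and the timeout value are hypothetical; the source does not say how the time limit is chosen or how a replacement worker is selected. The current time is passed in explicitly to keep the example deterministic.

```python
class JobTracker:
    """Remember which jobs are in progress and when they are overdue."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._pending = {}  # job_id -> (worker_id, deadline)

    def assigned(self, job_id, worker_id, now):
        """Record that a request was forwarded to a worker."""
        self._pending[job_id] = (worker_id, now + self.timeout)

    def done(self, job_id):
        """Handle a `done` message: the job needs no further tracking."""
        self._pending.pop(job_id, None)

    def expired(self, now):
        """Return IDs of jobs whose worker has not answered in time."""
        return [
            job_id
            for job_id, (_, deadline) in self._pending.items()
            if now >= deadline
        ]


tracker = JobTracker(timeout=30)
tracker.assigned("job_1", "worker_a", now=0)
tracker.assigned("job_2", "worker_b", now=0)
tracker.done("job_1")  # worker_a reported `done` for job_1

# At time 31 only job_2 is overdue and can be sent to another worker.
overdue = tracker.expired(now=31)
print(overdue)
```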