
# Overall Architecture

**ReCodEx** is designed to be highly modular. The following picture shows the main components arranged in one possible configuration. Note that the connections between components are simplified and not fully accurate.

![Overall Architecture](https://github.com/ReCodEx/GlobalWiki/blob/master/images/Overall_Architecture.png)

**Web app** is the main part of the whole project from the users' point of view. It provides the user interface and is the only part that interacts with the outside world directly. **Web API** contains almost all of the application logic, including _user management and authentication_, _storing and versioning files_ (with the help of **File server**), _counting and assigning points_ to users, etc. **Broker** is an essential part of the whole architecture and can be considered a single point of failure. It maintains a list of available **Workers**, receives submissions from the **Web API** and routes them further, and reports the progress of evaluations back to the **Web app**. **Worker** securely runs each received job and evaluates its results. **Monitor** forwards evaluation progress messages to the **Web app** so that they can be presented to users.

Almost all communication goes through **Broker** and the ZeroMQ messaging middleware. When **Web app** wants to execute a submission, all data is handed over to **Worker** through **Broker**. Progress state travels similarly in the opposite direction: it starts in **Worker**, goes through **Broker**, passes through **Monitor**, and ends up in **Web app** (over WebSockets). The only communication that does not involve **Broker** is communication with **File server**, which is done over HTTP. It can be initiated by **Web API** or by **Worker**; other services have no access to **File server**. A detailed description of the communication is on a separate page, [[Communication]].

## Web app

- TODO

## Web API

- TODO

## Broker

- TODO

## Worker

**Worker's** main role is to securely execute a given submission and possibly _evaluate_ the results against model solutions provided by the submitter. **Worker** is logically divided into two parts:
- **Listener** - listens to and communicates with **Broker** through [ZeroMQ](http://zeromq.org/). It receives new jobs, communicates with the **Evaluator** part, and sends back results or progress.
- **Evaluator** - gets jobs to evaluate from the **Listener** part, evaluates them (possibly in a sandbox), and notifies the other part when the evaluation has ended. This part also communicates with **File server**, downloads the needed files, and uploads detailed results.

After receiving an evaluation request, **Worker** has to:

- Download the archive containing the submitted source files and the configuration file
- Download any supplementary files based on the configuration file, such as test
  inputs or helper programs (this is done on demand, using a `fetch` command
  in the assignment configuration)
- Evaluate the submission according to the job configuration
- Optionally send progress states back to **Broker** during the evaluation
- Upload the results of the evaluation to the **File server**
- Notify **Broker** that the evaluation has finished
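
The steps above can be sketched in Python as a single function. This is only an illustration under stated assumptions, not the actual worker code: the `Job` class, the `run_job` function, the injected callables, and the message strings are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Job:
    """Hypothetical in-memory view of one downloaded submission archive."""
    job_id: str
    sources: Dict[str, bytes]  # submitted source files
    fetch_names: List[str] = field(default_factory=list)  # files named by `fetch` commands


def run_job(
    job_id: str,
    download_archive: Callable[[str], Job],
    fetch_file: Callable[[str], bytes],
    evaluate: Callable[[Job, Dict[str, bytes]], str],
    upload_results: Callable[[str, str], None],
    send_to_broker: Callable[[str, str], None],
) -> str:
    """Run one job through the steps listed above (all I/O is injected)."""
    # Download the archive with the sources and the configuration.
    job = download_archive(job_id)
    send_to_broker(job_id, "progress: archive downloaded")

    # Fetch supplementary files on demand, as named in the configuration.
    extras = {name: fetch_file(name) for name in job.fetch_names}

    # Evaluate the submission according to the job configuration.
    send_to_broker(job_id, "progress: evaluating")
    result = evaluate(job, extras)

    # Upload the results first, then tell the broker that the job is done.
    upload_results(job_id, result)
    send_to_broker(job_id, "done")
    return result
```

Passing the download, upload, and messaging operations in as callables keeps the sketch independent of any concrete ZeroMQ or HTTP client, so the ordering of the steps can be checked with simple fakes.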

### Internal Worker architecture
The picture below shows the overall internal architecture of the worker, including its classes with their private variables and public functions. A vector version of this picture is available [here](https://github.com/ReCodEx/GlobalWiki/raw/master/images/Worker_Internal_Architecture.pdf).
![Internal Worker architecture](https://github.com/ReCodEx/GlobalWiki/blob/master/images/Worker_Internal_Architecture.png)

## File Server

**File Server** stores data that should be kept outside of the **WebApp's**
database (both because storing files in a database is inefficient and because
the workers need to access the files in the simplest possible way). It should
meet the following requirements:
- store files without duplicates
- keep a consistent state with the main database
- serve files to workers on demand
- allow versioning of tasks, with the ability to revert to an earlier version

To meet these requirements, **Storage** and **Database** must be set up as described below.

### Storage
**Storage** is meant as disk space with some commonly used filesystem. We'll use `ext4`, but other filesystems should work too. The **Storage** file structure is:
```
.
├── submits
│   └── user_id
│       └── advanced_dot_net_1
│           └── submit_id
│               ├── eval.yml
│               └── source.cs
├── submit_archives
│   └── submit_id.tar.gz
├── tasks
│   ├── a
│   │   ├── a014ed2abb56371bfaf2b4298a85d5dfb56509ed
│   │   └── a5edbd8b12e670ed1e3110d6c0524000cd4c3c7a
│   └── b
│       └── b1696358b8540923eb79b68f95c0f94c13a83fa7
└── temp
    └── 1795184136b8bdddabe50453cc2cc2d46f0f7c5e
```
- **submits** keeps all files submitted by users to ReCodEx.
  The subdirectories _user_id_ and _advanced_dot_net_1_ group
  submits by user and by the course they were submitted for. This structure is
  easy to maintain when users are added or deleted.
- **submit_archives** contains the student submissions in compressed archives so
  that they can be easily downloaded by workers.
- **tasks** contains supplementary files (such as test inputs or helper
  programs) for all existing tasks in ReCodEx. To avoid too many files in one
  directory, files are split into subfolders by the first character of their name.
- **temp** is dedicated to temporarily storing outputs of programs at teachers' request. This directory will be erased by a cron job on a daily basis.
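
As a minimal sketch of this layout, the following Python helpers compute where a file would live in **Storage**. The function names and the `root` parameter are assumptions made for this example; only the directory names and the naming by `sha1` hash come from the structure above.

```python
import hashlib
from pathlib import PurePosixPath


def task_file_path(content: bytes, root: str) -> PurePosixPath:
    """Path of a supplementary file: tasks/<first hash character>/<sha1 hash>."""
    digest = hashlib.sha1(content).hexdigest()
    return PurePosixPath(root) / "tasks" / digest[0] / digest


def submit_archive_path(submit_id: str, root: str) -> PurePosixPath:
    """Path of the compressed archive that workers download for one submit."""
    return PurePosixPath(root) / "submit_archives" / f"{submit_id}.tar.gz"
```

Because the path depends only on the file content, two uploads with identical content map to the same path and are stored once.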

### Database
For user-friendly access to tasks and their modification, the following information should be stored in the database:
- a list of tasks with their newest version number
- for every task and version, a list of the files used (their hashed names)
- for every hashed name, one human-readable filename
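
One possible way to hold this information is sketched below using Python's built-in `sqlite3` module. The table and column names, and the `files_for` helper, are hypothetical and chosen only for illustration; the real database schema may differ.

```python
import sqlite3
from typing import List, Optional, Tuple

# Hypothetical schema: one table per item in the list above.
SCHEMA = """
CREATE TABLE tasks (
    task_id        TEXT PRIMARY KEY,
    newest_version INTEGER NOT NULL
);
CREATE TABLE task_files (
    task_id   TEXT NOT NULL REFERENCES tasks(task_id),
    version   INTEGER NOT NULL,
    file_hash TEXT NOT NULL,
    PRIMARY KEY (task_id, version, file_hash)
);
CREATE TABLE file_names (
    file_hash TEXT PRIMARY KEY,
    name      TEXT NOT NULL
);
"""


def files_for(
    conn: sqlite3.Connection, task_id: str, version: Optional[int] = None
) -> List[Tuple[str, str]]:
    """Return (human-readable name, hash) pairs for one version of a task.

    When no version is given, the newest version of the task is used.
    """
    if version is None:
        row = conn.execute(
            "SELECT newest_version FROM tasks WHERE task_id = ?", (task_id,)
        ).fetchone()
        if row is None:
            raise KeyError(task_id)
        version = row[0]
    return conn.execute(
        "SELECT n.name, f.file_hash FROM task_files AS f "
        "JOIN file_names AS n ON n.file_hash = f.file_hash "
        "WHERE f.task_id = ? AND f.version = ? ORDER BY n.name",
        (task_id, version),
    ).fetchall()
```

With this shape, reverting a task amounts to pointing `newest_version` at an earlier version, since the file lists of older versions are kept.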

### Conclusion
Files are internally stored under their `sha1sum` hashes, so it's easy to
implement versioning and to get rid of files with duplicate content (multiple
files can have the same content, which is stored only once). **Worker** also
refers to files by their hashes, which makes local caching straightforward: a
given hash always denotes the same content, regardless of the version number of
the file. **Database**, on the other hand, stores the human-readable names, so
that the files are presented in a friendly way to users (teachers) in **WebApp**.
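
The caching benefit can be illustrated with a small Python sketch. The `HashCache` class and its `download` callable are hypothetical and do not describe the actual worker implementation; the sketch only shows why content keyed by hash never needs invalidation.

```python
import hashlib
from typing import Callable, Dict


class HashCache:
    """Hypothetical worker-side cache keyed by the sha1 hash of file content."""

    def __init__(self, download: Callable[[str], bytes]) -> None:
        self._download = download          # fetches content by hash, e.g. over HTTP
        self._files: Dict[str, bytes] = {}
        self.downloads = 0                 # number of real downloads performed

    def get(self, file_hash: str) -> bytes:
        """Return the content for a hash, downloading it at most once."""
        if file_hash not in self._files:
            data = self._download(file_hash)
            # The hash doubles as an integrity check of the downloaded content.
            if hashlib.sha1(data).hexdigest() != file_hash:
                raise ValueError(f"content does not match hash {file_hash}")
            self._files[file_hash] = data
            self.downloads += 1
        return self._files[file_hash]
```

A new version of a file has different content and therefore a different hash, so a cached entry can never become stale; the cache only ever needs eviction for space reasons.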

## Monitor

- TODO