recodex-wiki/Implementation.md

# Implementation

## Broker

The broker is a central part of the ReCodEx backend that directs most of the
communication. It was designed to maintain a heavy load of messages by making
only small actions in the main communication thread and asynchronous execution
of other actions.

The responsibilities of broker are:

- allowing workers to register themselves and keep track of their capabilities
- tracking status of each worker and handle cases when they crash
- accepting assignment evaluation requests from the frontend and forwarding them
  to workers
- receiving a job status information from workers and forward them to the
  frontend either via monitor or REST API
- notifying the frontend on errors of the backend

### Internal Structure

The main work of the broker is to handle incoming messages. For that a _reactor_
subcomponent is written to bind events on sockets to handler classes.  There are
currently two handlers -- one that handles the main functionality and the other
that sends status reports to the REST API asynchronously. This prevents broker
freezes when synchronously waiting for responses of HTTP requests, especially
when some kind of error happens on the server.

Main handler takes care of requests from workers and API servers:

- *init* -- initial connection from worker to broker
- *done* -- currently processed job on worker was executed and is done
- *ping* -- worker proving that it is still alive
- *progress* -- job progress state from worker which is immediately forwarded
  to monitor
- *eval* -- request from API server to execute given job

Second handler is asynchronous status notifier which is able to execute HTTP
requests. This notifier is used on error reporting from backend to frontend API.

#### Worker Registry

The `worker_registry` class is used to store information about workers, their
status and the jobs in their queue. It can look up a worker using the headers
received with a request (a worker is considered suitable if and only if it
satisfies all the job headers). The headers are arbitrary key-value pairs, which
are checked for equality by the broker. However, some headers require special
handling, namely `threads`, for which we check if the value in the request is
lesser than or equal to the value advertised by the worker, and `hwgroup`, for
which we support requesting one of multiple hardware groups by listing multiple
names separated with a `|` symbol (e.g. `group_1|group_2|group_3`.

The registry also implements a basic load balancing algorithm -- the workers are
contained in a queue and whenever one of them receives a job, it is moved to its
end, which makes it less likely to receive another job soon.

When a worker is assigned a job, it will not be assigned another one until a
`done` message is received.

#### Error Reporting

Broker is the only backend component which is able to report errors directly to
the REST API. Other components have to notify the broker first and it forwards
the messages to the API. For HTTP communication a *libcurl* library is used. To
address security concerns there is a *HTTP Basic Auth* configured on particular
API endpoints and correct credentials have to be entered.

Following types of failures are distinguished:

**Job failure** -- there are two ways a job can fail, internal and external one.
 An internal failure is the fault of worker, for example when it
 cannot download a file needed for the evaluation. An external
 error is for example when the job configuration is malformed. Note that wrong
 student solution is not considered as a job failure.

 Jobs that failed internally are reassigned until a limit on the amount of
 reassignments (configurable with the `max_request_failures` option) is reached.
 External failures are reported to the frontend immediately.

**Worker failure** -- when a worker crash is detected, an attempt to reassign
 its current job and also all the jobs from its queue is made. Because the
 current job might be the reason of the crash, its reassignment is also counted
 towards the `max_request_failures` limit (the counter is shared). If there is
 no worker that could process a job available (i.e. it cannot be reassigned),
 the job is reported as failed to the frontend via REST API.

**Broker failure** -- when the broker itself crashed and is restarted, workers
 will reconnect automatically. However, all jobs in their queues are lost. If a
 worker manages to finish a job and notifies the "new" broker, the report is
 forwarded to the frontend. The same goes for external failures. Jobs that fail
 internally cannot be reassigned, because the "new" broker does not know their
 headers -- they are reported as failed immediately.

### Additional Libraries

Broker implementation depends on several open-source C and C++ libraries.

- **libcurl** -- Libcurl is used for notifying REST API on job finish event over
  HTTP protocol. Due to lack of documentation of all C++ bindings the plain C
  API is used.
- **cppzmq** -- Cppzmq is a simple C++ wrapper for core ZeroMQ C API. It
  basicaly contains only one header file, but its API fits into the object
  architecture of the broker.
- **spdlog** -- Spdlog is small, fast and modern logging library used for system
  logging. It is highly customizable and configurable from the
  configuration of the broker.
- **yaml-cpp** -- Yaml-cpp is used for parsing broker configuration text file in
  YAML format.
- **boost-filesystem** -- Boost filesystem is used for managing logging
  directory (create if necessary) and parsing filesystem paths from strings as
  written in the configuration of the broker. Filesystem operations will be
  included in future releases of C++ standard, so this dependency may be
  removed in the future.
- **boost-program_options** -- Boost program options is used for
  parsing of command line positional arguments. It is possible to use POSIX
  `getopt` C function, but we decided to use boost, which provides nicer API and
  is already used by worker component.


## Fileserver

The fileserver component provides a shared file storage between the frontend and
the backend. It is written in Python 3 using Flask web framework. Fileserver
stores files in configurable filesystem directory, provides file deduplication
and HTTP access. To keep the stored data safe, the fileserver should not be
visible from public internet. Instead, it should be accessed indirectly through
the REST API.

### File Deduplication

From our analysis of the requirements, it is certain we need to implement a
means of dealing with duplicate files.

File deduplication is implemented by storing files under the hashes of their
content. This procedure is done completely inside fileserver. Plain files are
uploaded into fileserver, hashed, saved and the new filename is returned back to
the uploader.

SHA1 is used as hashing function, because it is fast to compute and provides
reasonable collision safety for non-cryptographic purposes. Files with the same
hash are treated as the same, no additional checks for collisions are performed.
However, it is really unlikely to find one. If SHA1 proves insufficient, it is
possible to change the hash function to something else, because the naming
strategy is fully contained in the fileserver (special care must be taken to
maintain backward compatibility).

### Storage Structure

Fileserver stores its data in following structure:

- `./submissions/<id>/` -- folder that contains files submitted by users (the
  solutions to the assignments of the student). `<id>` is an identifier received
  from the REST API.
- `./submission_archives/<id>.zip` -- ZIP archives of all submissions. These are
  created automatically when a submission is uploaded. `<id>` is an identifier
  of the corresponding submission.
- `./exercises/<subkey>/<key>` -- supplementary exercise files (e.g. test inputs
  and outputs). `<key>` is a hash of the file content (`sha1` is used) and
  `<subkey>` is its first letter (this is an attempt to prevent creating a flat
  directory structure).
- `./results/<id>.zip` -- ZIP archives of results for submission with `<id>`
  identifier.


## Worker

The job of the worker is to securely execute a job according to its
configuration and upload results back for latter processing. After receiving an
evaluation request, worker has to do following:

- download the archive containing submitted source files and configuration file
- download any supplementary files based on the configuration file, such as test
  inputs or helper programs (this is done on demand, using a `fetch` command
  in the assignment configuration)
- evaluate the submission according to job configuration
- during evaluation progress messages can be sent back to broker
- upload the results of the evaluation to the fileserver
- notify broker that the evaluation finished

### Internal Structure

Worker is logically divided into two parts:

- **Listener** -- communicates with broker through ZeroMQ. On startup, it
  introduces itself to the broker. Then it receives new jobs, passes them to
  the evaluator part and sends back results and progress reports.
- **Evaluator** -- gets jobs from the listener part, evaluates them (possibly in
  sandbox) and notifies the other part when the evaluation ends. Evaluator also
  communicates with fileserver, downloads supplementary files and uploads
  detailed results.

These parts run in separate threads of the same process and communicate through
ZeroMQ in-process sockets. Alternative approach would be using shared memory
region with unique access, but messaging is generally considered safer. Shared
memory has to be used very carefully because of race condition issues when
reading and writing concurrently. Also, messages inside worker are small, so
there is no big overhead copying data between threads. This multi-threaded
design allows the worker to keep sending `ping` messages even when it is
processing a job.

### Capability Identification

There are possibly multiple worker instances in a ReCodEx installation and each
one can run on different hardware, operating system, or have different tools
installed. To identify the hardware capabilities of a worker, we use the concept
of **hardware groups**. Each worker belongs to exactly one group that specifies
the hardware and operating system on which the submitted programs will be run. A
worker also has a set of additional properties called **headers**. Together they
help the broker to decide which worker is suitable for processing a job
evaluation request. This information is sent to the broker on worker startup.

The hardware group is a string identifier of the hardware configuration, for
example "i7-4560-quad-ssd-linux" configured by the administrator for each worker
instance. If this is done correctly, performance measurements of a submission
should yield the same results on all computers from the same hardware group.
Thanks to this fact, we can use the same resource limits on every worker in a
hardware group.

The headers are a set of key-value pairs that describe the worker capabilities.
For example, they can show which runtime environments are installed or whether
this worker measures time precisely. Headers are also configured manually by an
administrator.

### Running Student Submissions

Student submissions are executed in a sandbox environment to prevent them from
damaging the host system and also to restrict the amount of used resources.
Currently, only the Isolate sandbox support is implemented, but it is possible
to add support for another sandbox.

Every sandbox, regardless of the concrete implementation, has to be a command
line application taking parameters with arguments, standard input or file.
Outputs should be written to a file or to the standard output. There are no
other requirements, the design of the worker is very versatile and can be
adapted to different needs.

The sandbox part of the worker is the only one which is not portable, so
conditional compilation is used to include only supported parts of the project.
Isolate does not work on Windows environment, so also its invocation is done
through native calls of Linux OS (`fork`, `exec`). To disable compilation of
this part on Windows, the `#ifndef _WIN32` guard is used around affected files.

Isolate in particular is executed in a separate Linux process created by `fork`
and `exec` system calls. Communication between processes is performed through an
unnamed pipe with standard input and output descriptors redirection. To prevent
Isolate failure there is another safety guard -- whole sandbox is killed when it
does not end in `(time + 300) * 1.2` seconds where `time` is the original
maximum time allowed for the task. This formula works well both for short and
long tasks, but the timeout should never be reached if Isolate works properly --
it should always end itself in time.

### Directories and Files

During a job execution the worker has to handle several files -- input archive
with submitted sources and job configuration, temporary files generated during
execution or fetched testing inputs and outputs. For each job is created a
separate directory structure which is removed after finishing the job.

The files are stored in local filesystem of the worker computer in a
configurable location. The job is not restricted to use only specified
directories (tasks can do anything that is allowed by the system), but it is
advised not to write outside them. In addition, sandboxed tasks are usually
restricted to use only a specific (evaluation) directory.

The following directory structure is used for execution. The working directory
of the worker (root of the following paths) is shared for multiple instances on
the same computer.

- `downloads/${WORKER_ID}/${JOB_ID}` -- place to store the downloaded archive
  with submitted sources and job configuration
- `submission/${WORKER_ID}/${JOB_ID}` -- place to store a decompressed
  submission archive
- `eval/${WORKER_ID}/${JOB_ID}` -- place where all the execution should happen
- `temp/${WORKER_ID}/${JOB_ID}` -- place for temporary files
- `results/${WORKER_ID}/${JOB_ID}` -- place to store all files which will be
  uploaded on the fileserver, usually only yaml result file and optionally log
  file, other files have to be explicitly copied here if requested

Some of the directories are accessible during job execution from within sandbox
through predefined variables. List of these is described in job configuration
appendix.

### Judges

ReCodEx provides a few initial judges programs. They are mostly adopted from
CodEx and installed automatically with the worker component. Judging programs
have to meet some requirements. Basic ones are inspired by standard `diff`
application -- two mandatory positional parameters which have to be the files
for comparison and exit code reflecting if the result is correct (0) or wrong
(1).

This interface lacks support for returning additional data by the judges, for
example similarity of the two files calculated as the Levenshtein edit distance.
To allow passing these additional values an extended judge interface can be
implemented:

- Parameters: There are two mandatory positional parameters which have to be
  files for comparison
- Results:
    - _comparison OK_
		- exitcode: 0
		- stdout: there is a single line with a double value which
		  should be quality percentage of the judged file
    - _comparison BAD_
		- exitcode: 1
		- stdout: can be empty
	- _error during execution_
		- exitcode: 2
		- stderr: there should be description of error

The additional double value is saved to the results file and can be used for
score calculation in the frontend. If just the basic judge is used, the values
are 1.0 for exit code 0 and 0.0 for exit code 1.

If more values are needed for score computation, multiple judges can be used in
sequence and the values used together. However, extended judge interface should
comply most of possible use cases.

### Additional Libraries

Worker implementation depends on several open-source C and C++ libraries. All of
them are multi-platform, so both Linux and Windows builds are possible.

- **libcurl** -- Libcurl is used for all HTTP communication, that is downloading
  and uploading files. Due to lack of documentation of all C++ bindings the
  plain C API is used.
- **libarchive** -- Libarchive is used for compressing and extracting archives.
  Actual supported formats depends on installed packages on target system, but
  at least ZIP and TAR.GZ should be available.
- **cppzmq** -- Cppzmq is a simple C++ wrapper for core ZeroMQ C API. It
  basicaly contains only one header file, but its API fits into the object
  architecture of the worker.
- **spdlog** -- Spdlog is small, fast and modern logging library. It is used for
  all of the logging, both system and job logs. It is highly customizable and
  configurable from the configuration of the worker.
- **yaml-cpp** -- Yaml-cpp is used for parsing and creating text files in YAML
  format. That includes the configuration of the worker, the configuration and
  the results of a job.
- **boost-filesystem** -- Boost filesystem is used for multi-platform
  manipulation with files and directories. However, these operations will be
  included in future releases of C++ standard, so this dependency may be removed
  in the future.
- **boost-program_options** -- Boost program options is used for multi-platform
  parsing of command line positional arguments. It is not necessary to use it,
  similar functionality can be implemented be ourselves, but this well known
  library is effortless to use.

## Monitor

Monitor is an optional part of the ReCodEx solution for reporting progress of
job evaluation back to users in the real time. It is written in Python, tested
versions are 3.4 and 3.5. Following dependencies are used:

- **zmq** -- binding to ZeroMQ message framework
- **websockets** -- framework for communication over WebSockets
- **asyncio** -- library for fast asynchronous operations
- **pyyaml** -- parsing YAML configuration files

There is just one monitor instance required per broker. Also, monitor has to be
publicly visible (has to have public IP address or be behind public proxy
server) and also needs a connection to the broker. If the web application is
using HTTPS, it is required to use a proxy for monitor to provide encryption
over WebSockets. If this is not done, browsers of the users will block
unencrypted connection and will not show the progress to the users.

### Message Flow

![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)

Monitor runs in 2 threads. _Thread 1_ is the main thread, which initializes all
components (logger for example), starts the other thread and runs the ZeroMQ
part of the application. This thread receives and parses incoming messages from
broker and forwards them to _thread 2_ sending logic.

_Thread 2_ is responsible for managing all of WebSocket connections
asynchronously. Whole thread is one big _asyncio_ event loop through which all
actions are processed. None of custom data types in Python are thread-safe, so
all events from other threads (actually only `send_message` method invocation)
must be called within the event loop (via `asyncio.loop.call_soon_threadsafe`
function). Please note, that most of the Python interpreters use [Global
Interpreter Lock](https://wiki.python.org/moin/GlobalInterpreterLock), so there
is actually no parallelism in the performance point of view, but proper
synchronization is still required.

### Handling of Incoming Messages

Incoming ZeroMQ progress message is received and parsed to JSON format (same as
our WebSocket communication format). JSON string is then passed to _thread 2_
for asynchronous sending. Each message has an identifier of channel where to
send it to.

There can be multiple receivers to one channel id. Each one has separate
_asyncio.Queue_ instance where new messages are added. In addition to that,
there is one list of all messages per channel. If a client connects a bit later
than the point when monitor starts to receive messages, it will receive all
messages from the beginning. Messages are stored 5 minutes after last progress
command (normally FINISHED) is received, then are permanently deleted. This
caching mechanism was implemented because early testing shows, that first couple
of messages are missed quite often.

Messages from the queue of the client are sent through corresponding WebSocket
connection via main event loop as soon as possible. This approach with separate
queue per connection is easy to implement and guarantees reliability and order
of message delivery.

## Cleaner

Cleaner component is tightly bound to the worker. It manages the cache folder of
the worker, mainly deletes outdated files. Every cleaner instance maintains one
cache folder, which can be used by multiple workers. This means on one server
there can be numerous instances of workers with the same cache folder, but there
should be only one cleaner instance.

Cleaner is written in Python 3 programming language, so it works well
multi-platform. It uses only `pyyaml` library for reading configuration file and
`argparse` library for processing command line arguments.

It is a simple script which checks the cache folder, possibly deletes old files
and then ends. This means that the cleaner has to be run repeatedly, for example
using cron, systemd timer or Windows task scheduler. For proper function of the
cleaner a suitable cronning interval has to be used. It is recommended to use
24 hour interval which is sufficient enough for intended usage. The value is set
in the configuration file of the cleaner.

## REST API

The REST API is a PHP application run in an HTTP server. Its purpose is
providing controlled access to the evaluation backend and storing the state of
the application.

### Used Technologies

We chose to use PHP in version 7.0, which was the most recent version at the
time of starting the project. The most notable new feature is optional static
typing of function parameters and return values. We use this as much as possible
to enable easy static analysis with tools like PHPStan. Using static analysis
leads to less error-prone code that does not need as many tests as code that
uses duck typing and relies on automatic type conversions. We aim to keep our
codebase compatible with new releases of PHP.

To speed up the development and to make it easier to follow best practices, we
decided to use the Nette framework. The framework itself is focused on creating
applications that render HTML output, but a lot of its features can be used in a
REST application, too.

Doctrine 2 ORM is used to provide a layer of abstraction over storing objects in
a database. This framework also makes it possible to change the database server.
The current implementation uses MariaDB, an open-source fork of MySQL.

To communicate with the evaluation backend, we need to use ZeroMQ. This
functionality is provided by the `php_zmq` plugin that is shipped with most PHP
distributions.

### Data model

We decided to use a code-first approach when designing our data model. This
approach is greatly aided by the Doctrine 2 ORM framework, which works with
entities -- PHP classes for which we specify which attributes should be
persisted in a database. The database schema is generated from the entity
classes. This way, the exact details of how our data is stored is a secondary
concern for us and we can focus on the implementation of the business logic
instead.

The rest of this section is a description of our data model and how it relates
to the real world. All entities are stored in the `App\Model\Entity` namespace.
There are repository classes that are used to work with entities without calling
the Doctrine `EntityManager` directly. These are in the `App\Model\Repository`
namespace.

#### User Account Management

The `User` entity class contains data about users registered in ReCodEx. To
allow extending the system with additional authentication methods, login details
are stored in separate entities. There is the `Login` entity class which
contains a user name and password for our internal authentication system, and
the `ExternalLogin` entity class, which contains an identifier for an external
login service such as LDAP. Currently, each user can only have a single
authentication method (account type). The entity with login information is
created along with the `User` entity when a user signs up. If a user requests a
password reset, a `ForgottenPassword` entity is created for the request.

A user needs a way to adjust settings such as their preferred language or theme.
This is the purpose of the `UserSettings` entity class. Each possible option has
its own attribute (database column). Current supported options are `darkTheme`,
`defaultLanguage` and `vimMode`

Every user has a role in the system. The basic ones are student, supervisor and
administrator, but new roles can be created by adding `Role` entities. Roles can
have permissions associated with them. These associations are represented by
`Permission` entities. Each permission consists of a role, resource, action and
an `isAllowed` flag. If the `isAllowed` flag is set to true, the permission is
positive (lets the role access the resource), and if it is false, it denies
access. The `Resource` entity contains just a string identifier of a resource
(e.g., group, user, exercise). Action is another string that describes what the
permission allows or denies for the role and resource (e.g., edit, delete,
view).

The `Role` entity can be associated with a parent entity. If this is the case,
the role inherits all the permissions of its parent.

All actions done by a user are logged using the `UserAction` entity for
debugging purposes.

#### Instances and Groups

Users of ReCodEx are divided into groups that correspond to school lab groups
for a single course. Each group has a textual name and description. It can have
a parent group so that it is possible to create tree hierarchies of groups.

Group membership is realized using the `GroupMembership` entity class. It is a
joining entity for the `Group` and `User` entities, but it also contains
additional information, most importantly `type`, which helps to distinguish
students from group supervisors.

Groups are organized into instances. Every `Instance` entity corresponds to an
organization that uses the ReCodEx installation, for example a university or a
company that organizes programming workshops. Every user and group belong to
exactly one instance (users choose an instance when they create their account).

Every instance can be associated with multiple `Licence` entities. Licences are
used to determine whether an instance can be currently used (access to those
without a valid instance will be denied). They can correspond to billing periods
if needed.

#### Exercises

The `Exercise` entity class is used to represent exercises -- programming tasks
that can be assigned to student groups. It contains data that does not relate to
this "concrete" assignment, such as the name, version and a private description.

Some exercise descriptions need to be translated into multiple languages.
Because of this, the `Exercise` entity is associated with the
`LocalizedText` entity, one for each translation of the text.

An exercise can support multiple programming runtime environments. These
environments are represented by `RuntimeEnvironment` entities. Apart from a name
and description, they contain details of the language and operating system that
is being used. There is also a list of extensions that is used for detecting
which environment should be used for student submissions.

`RuntimeEnvironment` entities are not linked directly to exercises. Instead,
the `Exercise` entity has an M:N relation with the `RuntimeConfig` entity,
which is associated with `RuntimeEnvironment`. It also contains a path to a job
configuration file template that will be used to create a job configuration file
for the worker that processes solutions of the exercise.

Resource limits are stored outside the database, in the job configuration file
template.

#### Reference Solutions

To make setting resource limits objectively possible for a potentially diverse
set of worker machines, there should be multiple reference solutions for every
exercise in all supported languages that can be used to measure resource usage
of different approaches to the problem on various hardware and platforms.

Reference solutions are contained in `ReferenceSolution` entities. These
entities can have multiple `ReferenceSolutionEvaluation` entities associated
with them that link to evaluation results (`SolutionEvaluation` entity). Details
of this structure will be described in the section about student solutions.

Source codes of the reference solutions can be accessed using the `Solution`
entity associated with `ReferenceSolution`. This entity is also used for student
submissions.

#### Assignments

The `Assignment` entity is created from an `Exercise` entity when an exercise is
assigned to a group. Most details of the exercise can be overwritten (see the
reference documentation for a detailed overview). Additional information such as
deadlines or point values for individual tests is also configured for the
assignment and not for an exercise.

Assignments can also have their own `LocalizedText` entities. If the
assignment texts are not changed, they are shared between the exercise and its
assignment.

Runtime configurations can be also changed for the assignment. This way, a
supervisor can for example alter the resource limits for the tests. They could
also alter the way submissions are evaluated, which is discouraged.

#### Student Solutions

Solutions submitted by students are represented by the `Submission` entity. It
contains data such as when and by whom was the solution submitted. There is also
a timestamp, a note for the supervisor and an url of the location where
evaluation results should be stored.

However, the most important part of a submission are the source files. These are
stored using the `SolutionFile` entity and they can be accessed through the
`Solution` entity, which is associated with `Submission`.

When the evaluation is finished, the results are stored using the
`SolutionEvaluation` entity. This entity can have multiple `TestResult` entities
associated with it, which describe the result of a test and also contain
additional information for failing tests (such as which limits were exceeded).
Every `TestResult` can contain multiple `TaskResult` entities that provide
details about the results of individual tasks. This reflects the fact that
"tests" are just logical groups of tasks.

#### Comment Threads

The `Comment` entity contains the author of the comment, a date and the text of
the comment. In addition to this, there is a `CommentThread` entity associated
with it that groups comments on a single entity (such as a student submission).
This enables easily adding support for comments to various entities -- it is
enough to add an association with the `CommentThread` entity. An even simpler
way is to just use the identifier of the commented entity as the identifier of
the comment thread, which is how submission comments are implemented.

#### Uploaded Files

Uploaded files are stored directly on the filesystem instead of in the database.
The `UploadedFile` entity is used to store their metadata. This entity is
extended by `SolutionFile` and `ExerciseFile` using the Single Table Inheritance
pattern provided by Doctrine. Thanks to this, we can access all files uploaded
on the API through the same repository while also having data related to e.g.,
supplementary exercise files present only in related objects.

### Request Handling

A typical scenario for handling an API request is matching the HTTP request with
a corresponding handler routine which creates a response object, that is then
sent back to the client, encoded with JSON. The `Nette\Application` package can
be used to achieve this with Nette, although it is meant to be used mainly in
MVP applications.

Matching HTTP requests with handlers can be done using standard Nette URL
routing -- we will create a Nette route for each API endpoint. Using the routing
mechanism from Nette logically leads to implementing handler routines as Nette
Presenter actions. Each presenter should serve logically related endpoints.

The last step is encoding the response as JSON. In `Nette\Application`, HTTP
responses are returned using the `Presenter::sendResponse()` method. We decided
to write a method that calls `sendResponse` internally and takes care of the
encoding. This method has to be called in every presenter action. An alternative
approach would be using the internal payload object of the presenter, which is
more convenient, but provides us with less control.

### Authentication

Instead of relying on PHP sessions, we decided to use an authentication flow
based on JWT tokens (RFC 7519). On successful login, the user is issued an
access token that they have to send with subsequent requests using the HTTP
Authorization header (Authorization: Bearer <token>). The token has a limited
validity period and has to be renewed periodically using a dedicated API
endpoint.

To implement this behavior in Nette framework, a new IUserStorage implementation
was created (`App\Security\UserStorage`), along with an IIdentity and
authenticators for both our internal login service and CAS. The authenticators
are not registered in the DI container, they are invoked directly instead. On
successful authentication, the returned `App\Security\Identity` object is stored
using the `Nette\Security\User::login()` method. The user storage service works
with the http request to extract the access token if possible.

The logic of issuing tokens is contained in the `App\Security\AccessManager`
class. Internally, it uses the Firebase JWT library.

The authentication flow is contained in the `LoginPresenter` class, which serves
the `/login` endpoint group.

An advantage of this approach is being able control the authentication process
completely instead of just receiving session data through a global variable.

### Accessing Endpoints

The REST API has a [generated documentation](https://recodex.github.io/api/)
describing detailed format of input values as well as response structures
including samples.

Knowing the exact format of the endpoints allows interacting directly with the
API using any REST client available, for example `curl` or `Postman` Chrome
extension. However, there is a generated [REST
client](https://recodex.github.io/api/ui.html) directly for the ReCodEx API
structure using Swagger UI tool. For each endpoint there is a form with boxes
for all the input parameters including description and data type. The responses
are shown as highlighted JSON. The authorization can be set for whole session at
once using "Authorize" button at the top of the page.

### Permissions

In a system storing user data has to be implemented some kind of permission
checking. Each user has a role, which corresponds to his/her privileges.
Our research showed, that three roles are sufficient -- student, supervisor
and administrator. The user role has to be
checked with every request. The good points is, that roles nicely match with
granularity of API endpoints, so the permission checking can be done at the
beginning of each request. That is implemented using PHP annotations, which
allows to specify allowed user roles for each request with very little of code,
but all the business logic is the same, together in one place.

However, roles cannot cover all cases. For example, if user is a supervisor, it
relates only to groups, where he/she is a supervisor. But using only roles
allows him/her to act as supervisor in all groups in the system. Unfortunately,
this cannot be easily fixed using some annotations, because there are many
different cases when this problem occurs. To fix that, some additional checks
can be performed at the beginning of request processing. Usually it is only one
or two simple conditions.

With this two concepts together it is possible to easily cover all cases of
permission checking with quite a small amount of code.

### Uploading Files

There are two cases when users need to upload files using the API -- submitting
solutions to an assignment and creating a new exercise. In both of these cases,
the final destination of the files is the fileserver. However, the fileserver is
not publicly accessible, so the files have to be uploaded through the API.

Each file is uploaded separately and is given a unique ID. The uploaded file
can then be attached to an exercise or a submitted solution of an exercise.
Storing and removing files from the server is done through the
`App\Helpers\UploadedFileStorage` class which maps the files to their records
in the database using the `App\Model\Entity\UploadedFile` entity.

### Forgotten Password

When user finds out that he/she does not remember a password, he/she requests a
password reset and fills in his/her unique email. A temporary access token is
generated for the user corresponding to the given email address and sent to this
address encoded in a URL leading to a client application. User then goes
to the URL and can choose a new password.

The temporary token is generated and emailed by the
`App\Helpers\ForgottenPasswordHelper` class which is registered as a service
and can be injected into any presenter.

This solution is quite safe and user can handle it on its own, so administrator
does not have to worry about it.

### Job Configuration Parsing and Modifying

Even in the API the job configuration file can be loaded in the corresponding
internal structures. This is necessary because there has to be possibility to
modify particular job details, such as the job identification or the fileserver
address, during the submission.

Whole codebase concerning the job configuration is present in the
`App\Helpers\JobConfig` namespace. Job configuration is represented by the
`JobConfig` class which directly contains structures like `SubmissionHeader` or
`Tasks\Task` and indirectly `SandboxConfig` or `JobId` and more. All these
classes have parameterless constructor which should set all values to their
defaults or construct appropriate classes.

Modifying of values in the configuration classes is possible through *fluent
interfaces* and *setters*. Getting of values is also possible and all setters
should have *get* counterparts. Job configuration is serialized through
`__toString()` methods.

For loading of the job configuration there is separate `Storage` class which can
be used for loading, saving or archiving of job configuration. For parsing the
storage uses the `Loader` class which does all the checks and loads the data
from given strings in the appropriate structures. In case of parser error
`App\Exceptions\JobConfigLoadingException` is thrown.

Worth mentioning is also `App\Helpers\UploadedJobConfigStorage` class which
takes care of where the uploaded job configuration files should be saved on the
API filesystem. It can also be used for copying all job configurations during
assignment of exercise.

### Solution Loading

When a solution evaluation is finished by the backend, the results are saved to
the fileserver and the API is notified by the broker. The results are parsed and
stored in the database.

For the results of the evaluations of the reference solutions and for the
asynchronously evaluated solutions of the students (e.g., resubmitted by
the administrator) the result is processed
right after the notification from backend is received and the author of the
solution will be notified by an email after the results are processed.

When a student submits his/her solution directly through the client application
we do not parse the results right away but we postpone this until the student
(or a supervisor) wants to display the results for the first time. This may save
save some resources when the solution results are not important (e.g., the
student finds a bug in his solution before the submission has been evaluated).

#### Parsing of The Results

The results are stored in a YAML file. We map the contents of the file to the
classes of the `App\Helpers\EvaluationResults` namespace. This process
validates the file and gives us access to all of the information through
an interface of a class and not only using associative arrays. This is very
similar to how the job configuration files are processed.


## Web Application

The whole project is written using the next generation of JavaScript referred to
as *ECMAScript 6* (also known as *ES6*, *ES.next*, or *Harmony*). Since not all
of the features introduced in this standard are implemented in the modern
web browsers of today (like classes and the spread operator) and hardly any are
implemented in the older versions of the web browsers which are currently still
in use, the source code is transpiled into the older standard *ES5* using
[Babel.js](https://babeljs.io/) transpiler and bundled into a single script file
using the [webpack](https://webpack.github.io/) moudle bundler. The need for a
transpiler also arises from the usage of the *JSX* syntax for declaring React
components. To read more about these these tools and their usage please refer to
the [installation documentation](#Installation). The whole bundling process
takes place at deployment and is not repeated afterwards when running in
production.

### State Management

Web application is a SPA (Single Page Application. When the user accesses the
page, the source codes are downloaded and are interpreted by the web browser.
The communication between the browser and the server then runs in the background
without reloading the page.

The application keeps its internal state which can be altered by the actions of
the user (e.g., clicking on links and buttons, filling input fields of forms)
and by the outcomes of HTTP requests to the API server. This internal state is
kept in memory of the web browser and is not persisted in any way -- when the
page is refreshed, the internal state is deleted and a new one is created from
scratch (i.e., all of the data is fetched from the API server again).

The only part of the state which is persisted is the token of the logged in
user. This token is kept in cookies and in the local storage. Keeping the token
in the cookies is necessary for server-side rendering.

#### Redux

The in-memory state is handled by the *redux* library. This library is strongly
inspired by the [Flux](https://facebook.github.io/flux/) architecture but it has
some specifics.  The whole state is in a single serializable tree structure
called the *store*. This store can be modified only by dispatching *actions*
which are Plain Old JavaScript Oblects (POJO) which are processed by *reducers*.
A reducer is a pure function which takes the state object and the action object
and it creates a new state. This process is very easy to reason about and is
also very easy to test using unit tests. Please read the [redux
documentation](http://redux.js.org/) for detailed information about the library.

![Redux state handling schema](https://github.com/ReCodEx/wiki/raw/master/images/redux.png)

The main difference between *Flux* and *redux* is the fact that there is only
one store with one reducer in redux. The single reducer might be composed from
several silmple reducers which might be composed from other simple reducers as
well, therefore the single reducer of the store is often refered to as the root
reducer. Each of the simple reducers receives all the dispatched actions and it
decide which actions it will process and which it will ignore based on the
*type* of the action. The simple reducers can change only a specific subtree of
 the whole state tree and these subtrees do not overlap.

##### Redux Middleware

A middleware in redux is a function which can process actions before they are
passed to the reducers to update the state.

The middleware used by the ReCodEx store is defined in the `src/redux/store.js`
script.  Several open source libraries are used:

- [redux-promise-middleware](https://github.com/pburtchaell/redux-promise-middleware)
- [redux-thunk](https://github.com/gaearon/redux-thunk)
- [react-router-redux](https://github.com/reactjs/react-router-redux)

We created two other custom middleware functions for our needs:

- **API middleware** -- The middleware filters out all actions with the *type*
  set to `recodex-api/CALL`, sends a real HTTP request according to the
  information in the action.
- **Access Token Middleware** -- This middleware persists the access token each
  time after the user signs into the application into the local storage and the
  cookies. The token is removed when the user decides to sign out. The
  middleware also attaches the token to each `recodex-api/CALL` action when it
  does not have an access token set explicitly.

##### Accessing The Store Using Selectors

The components of the application are connected to the redux store using a
higher order function `connect` from the *react-redux* binding library. This
connection ensures that the react components will re-render every time some of
the specified subtrees of the main state changes.

The specific subtrees of interest are defined for every connection. These
definitions called *selectors* and they are are simple pure functions which take
the state and return its subtree. To avoid unnecesary re-renders and selections
a small library called [reselect](https://github.com/reactjs/reselect) is used.
This library allows us to compose the selectors in a similar way the reducers
are composed and therefore simply reflect the structure of the whole state
tree. The selectors for each reducer are stored in a separate file in the
`src/redux/selectors` directory.

#### Routing

The page should not be reloaded after the initial render but the current
location of the user in the system must be reflected in the URL. This is
achieved through the
[react-router](https://github.com/ReactTraining/react-router) and
[react-router-redux](https://github.com/reactjs/react-router-redux) libraries.
These libraries use `pushState` method of the `history` object, a living
standard supported by all of the modern browsers.  The mapping of the URLs to
the components is defined in the `src/pages/routes.js` file.  To create links
between pages, use either the `Link` component from the `react-router` library
or dispatch an action created using the `push` action creator from the
`react-router-redux` library. All the navigations are mapped to redux actions
and can be handled by any reducer.

Having up-to-date URLs gives the users the possibility to reload the page if
some error occurs on the page and land at the same page as he or she would
expect. Users can also send links to the very page they want to.

### Creating HTTP Requests

All of the HTTP requests are made by dispatching a specific action which will be
processed by our custom *API middleware*. The action must have the *type*
property set to `recodex-api/CALL`. The middleware catches the action and it
sends a real HTTP request created according to the information in the `request`
property of the action:

- **type** -- Type prefix of the actions which will be dispatched automatically
  during the lifecycle of the request (pending, fulfilled, failed).
- **endpoint** -- The URI to which the request should be sent. All endpoints
  will be prefixed with the base URL of the API server.
- **method** (*optional*) -- A string containing the name of the HTTP method
  which should be used.  The default method is `GET`.
- **query** (*optional*) -- An object containing key-value pairs which will be
  put in the URL of the request in the query part of the URL.
- **headers** (*optional*) -- An object containing key-value pairs which will be
  appended to the headers of the HTTP request.
- **accessToken** (*optional*) -- Explicitly set the access token for the
  request. The token will be put in the *Authorization* header.
- **body** (*optional*) -- An object or an array which will be recursively
  flattened into the `FormData` structure with correct usage of square brackets
  for nested (associative) arrays. It is worth mentioning that the keys must not
  contain a colon in the string.
- **doNotProcess** (*optional*) -- A boolean value which can disable the default
  processing of the response to the request which includes showing a
  notification to the user in case of a failure of the request. All requests
  are processed in the way described above by default.

The HTTP requests are sent using the `fetch` API which returns a *Promise* of
the request.  This promise is put into a new action and creates a new action
containing the promise and the type specified in the `request` description. This
action is then caught by the promise middleware and the promise middleware
dispatches actions whenever the state of the promise changes during its the
lifecycle. The new actions have specific types:

- {$TYPE}_PENDING -- Dispatched immediately after the action is processed by
  the promise middleware. The `payload` property of the action contains the body
  of the request.
- {$TYPE}_FAILED -- Dispatched if the promise of the request is rejected.
- {$TYPE}_FULFILLED -- Dispatched when the response to the request is received
  and the promise is resolved. The `payload` property of the action contains the
  body of the HTTP response parsed as JSON.

### Routine CRUD Operations

For routine CRUD (Create, Read, Update, Delete) operations which are common to
most of the resources used in the ReCodEx (e.g., groups, users, assignments,
solutions, solution evaluations, source code files) a set of functions called
*Resource manager* was implemented. It contains a factory which creates basic
 actions (e.g., `fetchResource`, `addResource`, `updateResource`,
 `removeResource`, `fetchMany`) and handlers for all of the lifecycle actions
 created by both the API middleware and the promise middleware which can be used
 to create a basic reducer.

The *resource manager* is spread over several files in the
`src/redux/helpers/resourceManager` directory and is covered with unit tests in
scripts located at `test/redux/helpers/resourceManager`.

### Server-side Rendering

To speed-up the initial time of rendering of the web application a technique
called server-side rendering (SSR) is used. The same code which is executed in
the web browser of the client can run on the server using
[Node.js](https://nodejs.org). React can serialize its HTML output into a string
which can be sent to the client and can be displayed before the (potentially
large) JavaScript source code starts being executed by the browser. The redux
store is in fact just a large JSON tree which can be easily serialized as well.

If the user is logged in then the access token should be in the cookies of the
web browser and it should be attached to the HTTP request when the user
navigates to the ReCodEx web page. This token is then put into the redux store
and so the user is logged in on the server.

The whole logic of the SSR is in a single file called `src/server.js`. It
contains only a definition of a simple HTTP server (using the
[express](http://expressjs.com/) framework) and some necessary boilerplate of
the routing library.

All the components which are associated to the matched route can have a class
property `loadAsync` which should contain a function returning a *Promise*. The
SRR calls all these functions and delays the response of the HTTP server until
all of the promises are resolved (or some of them fails).

### Localization and globalization

The whole application is prepared for localization and globalization. All of the
translatable texts can be extracted from the user interface and translated
into several languages. The numbers, dates, and time values are also formatted
with respect to the selected language. The
[react-intl](https://github.com/yahoo/react-intl) and
[Moment.js](http://momentjs.com/) libraries are used to achieve this.
All the strings can be extracted from the application using a command:

```
$ npm run exportStrings
```

This will create JSON files with the exported strings for the 'en' and 'cs'
locale. If you want to export strings for more languages, you must edit the
`/manageTranslations.js` script. The exported strings are placed in the
`/src/locales` directory.

## Communication Protocol

Detailed communication inside the ReCodEx system is captured in the following
image and described in sections below. Red connections are through ZeroMQ
sockets, blue are through WebSockets and green are through HTTP(S). All ZeroMQ
messages are sent as multipart with one string (command, option) per part, with
no empty frames (unless explicitly specified otherwise).

![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)


### Broker - Worker Communication

Broker acts as server when communicating with worker. Listening IP address and
port are configurable, protocol family is TCP. Worker socket is of DEALER type,
broker one is ROUTER type. Because of that, very first part of every (multipart)
message from broker to worker must be target the socket identity of the worker
(which is saved on its **init** command).

#### Commands from Broker to Worker:

- **eval** -- evaluate a job. Requires 3 message frames:
    - `job_id` -- identifier of the job (in ASCII representation -- we avoid
      endianness issues and also support alphabetic ids)
    - `job_url` -- URL of the archive with job configuration and submitted
      source code
    - `result_url` -- URL where the results should be stored after evaluation
- **intro** -- introduce yourself to the broker (with **init** command) -- this
  is required when the broker loses track of the worker who sent the command.
  Possible reasons for such event are e.g. that one of the communicating sides
  shut down and restarted without the other side noticing.
- **pong** -- reply to **ping** command, no arguments

#### Commands from Worker to Broker:

- **init** -- introduce self to the broker. Useful on startup or after
  reestablishing lost connection. Requires at least 2 arguments:
    - `hwgroup` -- hardware group of this worker
    - `header` -- additional header describing worker capabilities. Format must
      be `header_name=value`, every header shall be in a separate message frame.
      There is no limit on number of headers. There is also an optional third
      argument -- additional information. If present, it should be separated
      from the headers with an empty frame. The format is the same as headers.
      Supported keys for additional information are:
    - `description` -- a human readable description of the worker for
      administrators (it will show up in broker logs)
    - `current_job` -- an identifier of a job the worker is now processing. This
      is useful when we are reassembling a connection to the broker and need it
      to know the worker will not accept a new job.
- **done** -- notifying of finished job. Contains following message frames:
    - `job_id` -- identifier of finished job
    - `result` -- response result, possible values are:
	- OK -- evaluation finished successfully
	- FAILED -- job failed and cannot be reassigned to another worker (e.g.
	  due to error in configuration)
	- INTERNAL_ERROR -- job failed due to internal worker error, but another
	  worker might be able to process it (e.g. downloading a file failed)
    - `message` -- a human readable error message
- **progress** -- notice about current evaluation progress. Contains following
  message frames:
    - `job_id` -- identifier of current job
    - `command` -- what is happening now.
	- DOWNLOADED -- submission successfully fetched from fileserver
	- FAILED -- something bad happened and job was not executed at all
	- UPLOADED -- results are uploaded to fileserver
	- STARTED -- evaluation of tasks started
	- ENDED -- evaluation of tasks is finished
	- ABORTED -- evaluation of job encountered internal error, job will be
	  rescheduled to another worker
	- FINISHED -- whole execution is finished and worker ready for another
	  job execution
	- TASK -- task state changed -- see below
    - `task_id` -- only present for "TASK" state -- identifier of task in
      current job
    - `task_state` -- only present for "TASK" state -- result of task
      evaluation. One of:
	- COMPLETED -- task was successfully executed without any error,
	  subsequent task will be executed
	- FAILED -- task ended up with some error, subsequent task will be
	  skipped
	- SKIPPED -- some of the previous dependencies failed to execute, so
	  this task will not be executed at all
- **ping** -- tell broker I am alive, no arguments


#### Heartbeating

It is important for the broker and workers to know if the other side is still
working (and connected). This is achieved with a simple heartbeating protocol.

The protocol requires the workers to send a **ping** command regularly (the
interval is configurable on both sides -- future releases might let the worker
send its ping interval with the **init** command). Upon receiving a **ping**
command, the broker responds with **pong**.

Whenever a heartbeating message doesn't arrive, a counter called _liveness_ is
decreased. When this counter drops to zero, the other side is considered
disconnected. When a message arrives, the liveness counter is set back to its
maximum value, which is configurable for both sides.

When the broker decides a worker disconnected, it tries to reschedule its jobs
to other workers.

If a worker thinks the broker crashed, it tries to reconnect periodically, with
a bounded, exponentially increasing delay.

This protocol proved great robustness in real world testing. Thus whole backend
is reliable and can outlive short term issues with connection without problems.
Also, increasing delay of ping messages does not flood the network when there
are problems. We experienced no issues since we are using this protocol.

### Worker - Fileserver Communication

Worker is communicating with file server only from _execution thread_. Supported
protocol is HTTP optionally with SSL encryption (**recommended**). If supported
by server and used version of libcurl, HTTP/2 standard is also available. File
server should be set up to require basic HTTP authentication and worker is
capable to send corresponding credentials with each request.

#### Worker Side

Workers communicate with the file server in both directions -- they download the
submissions of the student and then upload evaluation results. Internally,
worker is using libcurl C library with very similar setup. In both cases it can
verify HTTPS certificate (on Linux against system cert list, on Windows against
downloaded one from CURL website during installation), support basic HTTP
authentication, offer HTTP/2 with fallback to HTTP/1.1 and fail on error
(returned HTTP status code is >=400). Worker have list of credentials to all
available file servers in its config file.

- download file -- standard HTTP GET request to given URL expecting file content
  as response
- upload file -- standard HTTP PUT request to given URL with file data as body
  -- same as command line tool `curl` with option `--upload-file`

#### File server side

File server has its own internal directory structure, where all the files are
stored. It provides simple REST API to get them or create new ones. File server
does not provide authentication or secured connection by itself, but it is
supposed to run file server as WSGI script inside a web server (like Apache)
with proper configuration. Relevant commands for communication with workers:

- **GET /submission_archives/\<id\>.\<ext\>** -- gets an archive with submitted
  source code and corresponding configuration of this job evaluation
- **GET /exercises/\<hash\>** -- gets a file, common usage is for input files or
  reference result files
- **PUT /results/\<id\>.\<ext\>** -- upload archive with evaluation results
  under specified name (should be same _id_ as name of submission archive). On
  successful upload returns JSON `{ "result": "OK" }` as body of returned page.

If not specified otherwise, `zip` format of archives is used. Symbol `/` in API
description is root of the domain of the file server. If the domain is for
example `fs.recodex.org` with SSL support, getting input file for one task could
look as GET request to
`https://fs.recodex.org/tasks/8b31e12787bdae1b5766ebb8534b0adc10a1c34c`.


### Broker - Monitor Communication

Broker communicates with monitor also through ZeroMQ over TCP protocol. Type of
socket is same on both sides, ROUTER. Monitor is set to act as server in this
communication, its IP address and port are configurable in the config of the
monitor file. ZeroMQ socket ID (set on the side of the monitor) is
"recodex-monitor" and must be sent as first frame of every multipart message --
see ZeroMQ ROUTER socket documentation for more info.

Note that the monitor is designed so that it can receive data both from the
broker and workers. The current architecture prefers the broker to do all the
communication so that the workers do not have to know too many network services.

Monitor is treated as a somewhat optional part of whole solution, so no special
effort on communication reliability was made.

#### Commands from Monitor to Broker:

Because there is no need for the monitor to communicate with the broker, there
are no commands so far. Any message from monitor to broker is logged and
discarded.

#### Commands from Broker to Monitor:

- **progress** -- notification about progress with job evaluation. This
  communication is usually redirected as is from worker (the exception being
  cases when the worker might have crashed without sending a failure notice),
  more info can be found in "Broker - Worker Communication" chapter above.

### Broker - REST API Communication

Broker communicates with main REST API through ZeroMQ connection over TCP.
Socket type on broker side is ROUTER, on frontend part it is DEALER. Broker acts
as a server, its IP address and port is configurable in the API.

#### Commands from API to Broker:

- **eval** -- evaluate a job. Requires at least 4 frames:
    - `job_id` -- identifier of this job (in ASCII representation -- we avoid
      endianness issues and also support alphabetic ids)
    - `header` -- additional header describing worker capabilities. Format must
      be `header_name=value`, every header shall be in a separate message frame.
      There is no maximum limit on number of headers. There may be also no
      headers at all. A worker is considered suitable for the job if and only if
      it satisfies all of its headers.
    - empty frame -- frame which contains only empty string and serves only as
      breakpoint after headers
    - `job_url` -- URI location of archive with job configuration and submitted
      source code
    - `result_url` -- remote URI where results will be pushed to

#### Commands from Broker to API:

All are responses to **eval** command.

- **ack** -- this is first message which is sent back to frontend right after
  eval command arrives, basically it means "Hi, I am all right and am capable of
  receiving job requests", after sending this broker will try to find acceptable
  worker for arrived request
- **accept** -- broker is capable of routing request to a worker
- **reject** -- broker cannot handle this job (for example when the requirements
  specified by the headers cannot be met). There are (rare) cases when the
  broker finds that it cannot handle the job after it was confirmed. In such
  cases it uses the frontend REST API to mark the job as failed.

#### Asynchronous Communication Between Broker And API

Only a fraction of the errors that can happen during evaluation can be detected
while there is a ZeroMQ connection between the API and broker. To notify the
frontend of the rest, the API exposes an endpoint for the broker for this
purpose. Broker uses this endpoint whenever the status of a job changes
(it is finished, it failed permanently, the only worker capable of processing
it disconnected...).

When a request for sending a report arrives from the backend then the type of
the report is inferred and if it is an error which deserves attention of the
administrator then an email is sent to him/her. There can also be errors which
are not that important (e.g., it was somehow solved by the backend itself or it
is only informative), then these do not have to be reported through an email but
they are stored in the persistent database for further consideration.

For the details of this interface please refer to the attached API documentation
and the `broker-reports/` endpoint group.

### Fileserver - REST API Communication

File server has a REST API for interaction with other parts of ReCodEx.
Description of communication with workers is in "Worker - Fileserver
Communication" chapter above. On top of that, there are other commands for
interaction with the API:

- **GET /results/\<id\>.\<ext\>** -- download archive with evaluated results of
  job _id_
- **POST /submissions/\<id\>** -- upload new submission with identifier _id_.
  Expects that the body of the POST request uses file paths as keys and the
  content of the files as values. On successful upload returns JSON `{
  "archive_path": <archive_url>, "result_path": <result_url> }` in response
  body. From _archive_path_ the submission can be downloaded (by worker) and
  corresponding evaluation results should be uploaded to _result_path_.
- **POST /tasks** -- upload new files, which will be available by names equal to
  `sha1sum` of their content. There can be uploaded more files at once. On
  successful upload returns JSON `{ "result": "OK", "files": <file_list> }` in
  response body, where _file_list_ is dictionary of original file name as key
  and new URL with already hashed name as value.

There are no plans yet to support deleting files from this API. This may change
in time.

REST API calls these fileserver endpoints with standard HTTP requests. There are
no special commands involved. There is no communication in opposite direction.

### Monitor - Web App Communication

Monitor interacts with web application through WebSocket connection. Monitor
acts as server and browsers are connecting to it. IP address and port are
configurable. When client connects to the monitor, it sends a message with
string representation of channel id (which messages are interested in, usually
id of evaluating job). There can be multiple listeners per channel, even
(shortly) delayed connections will receive all messages from the very beginning.

When monitor receives **progress** message from broker there are two options:

- there is no WebSocket connection for listed channel (job id) -- message is
  dropped
- there is active WebSocket connection for listed channel -- message is parsed
  into JSON format (see below) and send as string to that established channel.
  Messages for active connections are queued, so no messages are discarded even
  on heavy workload.

Message from monitor to web application is in JSON format and it has form of
dictionary (associative array). Information contained in this message should
correspond with the ones given by worker to broker. For further description
please read more in "Broker - Worker communication" chapter under "progress"
command.

Message format:

- **command** -- type of progress, one of: DOWNLOADED, FAILED, UPLOADED,
  STARTED, ENDED, ABORTED, FINISHED, TASK
- **task_id** -- id of currently evaluated task. Present only if **command** is
  "TASK".
- **task_state** -- state of task with id **task_id**. Present only if
  **command** is "TASK". Value is one of "COMPLETED", "FAILED" and "SKIPPED".

### Web App - REST API Communication

The provided web application runs as a JavaScript process inside the browser of
the user. It communicates with the REST API on the server through the standard
HTTP requests. Documentation of the main REST API is in a separate
[document](https://recodex.github.io/api/) due to its extensiveness. The results
are returned encoded in JSON which is simply processed by the web application
and presented to the user in an appropriate way.


<!---
// vim: set formatoptions=tqn flp+=\\\|^\\*\\s* textwidth=80 colorcolumn=+1:
-->