|
|
# Implementation
|
|
|
|
|
|
## Broker
|
|
|
|
|
|
The broker is a central part of the ReCodEx backend that directs most of the
communication. It was designed to sustain a heavy load of messages by
performing only small actions in the main communication thread and executing
other actions asynchronously.
|
|
|
|
|
|
The responsibilities of the broker are:
|
|
|
|
|
|
- allowing workers to register themselves and keeping track of their
  capabilities
- tracking the status of each worker and handling cases when they crash
- accepting assignment evaluation requests from the frontend and forwarding
  them to workers
- receiving job status information from workers and forwarding it to the
  frontend, either via the monitor or the REST API
- notifying the frontend of backend errors
|
|
|
|
|
|
### Internal Structure
|
|
|
|
|
|
The main work of the broker is to handle incoming messages. For this purpose,
a _reactor_ subcomponent binds events on sockets to handler classes. There are
currently two handlers -- one that implements the main functionality and
another that sends status reports to the REST API asynchronously. This
prevents the broker from freezing while synchronously waiting for responses to
HTTP requests, especially when some kind of error happens on the server.
|
|
|
|
|
|
The main handler takes care of requests from workers and API servers:
|
|
|
|
|
|
- *init* -- initial connection from a worker to the broker
- *done* -- the job currently processed by a worker has finished
- *ping* -- a worker proving that it is still alive
- *progress* -- job progress state from a worker, which is immediately
  forwarded to the monitor
- *eval* -- a request from the API server to execute a given job
|
|
|
|
|
|
The second handler is an asynchronous status notifier that is able to execute
HTTP requests. This notifier is used to report errors from the backend to the
frontend API.
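The following minimal Python sketch illustrates the reactor idea described
above (the broker itself is written in C++; the class and method names here
are only illustrative): socket events are registered with a poller and every
incoming message is dispatched to the handler bound to its socket.

```
import zmq

class Reactor:
    """Bind sockets to handler objects and dispatch incoming messages."""

    def __init__(self):
        self.poller = zmq.Poller()
        self.handlers = {}  # socket -> handler object

    def add_socket(self, socket, handler):
        self.poller.register(socket, zmq.POLLIN)
        self.handlers[socket] = handler

    def run(self):
        while True:
            for socket, _event in self.poller.poll():
                message = socket.recv_multipart()
                # Each handler decides what to do with the frames;
                # long-running work (e.g. HTTP notifications) belongs
                # in the asynchronous handler, not in the main one.
                self.handlers[socket].handle(message)
```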
|
|
|
|
|
|
#### Worker Registry
|
|
|
|
|
|
The `worker_registry` class is used to store information about workers, their
status and the jobs in their queue. It can look up a worker using the headers
received with a request (a worker is considered suitable if and only if it
satisfies all the job headers). The headers are arbitrary key-value pairs,
which are checked for equality by the broker. However, some headers require
special handling, namely `threads`, for which we check if the value in the
request is less than or equal to the value advertised by the worker, and
`hwgroup`, for which we support requesting one of multiple hardware groups by
listing multiple names separated with a `|` symbol (e.g.
`group_1|group_2|group_3`).
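To make the matching rules concrete, here is a sketch of the suitability check
in Python (the broker implements this in C++; the function name is
illustrative):

```
def worker_satisfies(request_headers, worker_headers):
    """Return True if a worker's advertised headers satisfy all job headers.

    Plain headers are compared for equality, `threads` is an upper bound,
    and `hwgroup` may list several acceptable groups separated by `|`.
    """
    for key, requested in request_headers.items():
        advertised = worker_headers.get(key)
        if advertised is None:
            return False
        if key == "threads":
            # the job may use at most as many threads as the worker offers
            if int(requested) > int(advertised):
                return False
        elif key == "hwgroup":
            # the request may name several groups, e.g. "group_1|group_2"
            if advertised not in requested.split("|"):
                return False
        elif requested != advertised:
            return False
    return True
```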
|
|
|
|
|
|
The registry also implements a basic load balancing algorithm -- the workers are
|
|
|
contained in a queue and whenever one of them receives a job, it is moved to its
|
|
|
end, which makes it less likely to receive another job soon.
|
|
|
|
|
|
When a worker is assigned a job, it will not be assigned another one until a
|
|
|
`done` message is received.
|
|
|
|
|
|
#### Error Reporting
|
|
|
|
|
|
The broker is the only backend component able to report errors directly to the
REST API. Other components have to notify the broker first, and it forwards
the messages to the API. The *libcurl* library is used for HTTP communication.
To address security concerns, *HTTP Basic Auth* is configured on the relevant
API endpoints, so correct credentials have to be supplied.
|
|
|
|
|
|
The following types of failures are distinguished:
|
|
|
|
|
|
**Job failure** -- there are two ways a job can fail: internally and
externally. An internal failure is the fault of the worker, for example when
it cannot download a file needed for the evaluation. An external error occurs,
for example, when the job configuration is malformed. Note that a wrong
student solution is not considered a job failure.
|
|
|
|
|
|
Jobs that failed internally are reassigned until a limit on the number of
reassignments (configurable with the `max_request_failures` option) is
reached. External failures are reported to the frontend immediately.
|
|
|
|
|
|
**Worker failure** -- when a worker crash is detected, an attempt is made to
reassign its current job and also all the jobs from its queue. Because the
current job might be the cause of the crash, its reassignment is also counted
towards the `max_request_failures` limit (the counter is shared). If there is
no available worker that could process a job (i.e. it cannot be reassigned),
the job is reported to the frontend as failed via the REST API.
|
|
|
|
|
|
**Broker failure** -- when the broker itself crashes and is restarted, workers
will reconnect automatically. However, all jobs in their queues are lost. If a
worker manages to finish a job and notifies the "new" broker, the report is
forwarded to the frontend. The same goes for external failures. Jobs that fail
internally cannot be reassigned, because the "new" broker does not know their
headers -- they are reported as failed immediately.
|
|
|
|
|
|
### Additional Libraries
|
|
|
|
|
|
The broker implementation depends on several open-source C and C++ libraries.
|
|
|
|
|
|
- **libcurl** -- Libcurl is used for notifying the REST API about job finish
  events over the HTTP protocol. Due to the lack of documentation of the C++
  bindings, the plain C API is used.
- **cppzmq** -- Cppzmq is a simple C++ wrapper of the core ZeroMQ C API. It
  basically contains only one header file, but its API fits into the object
  architecture of the broker.
- **spdlog** -- Spdlog is a small, fast and modern logging library used for
  system logging. It is highly customizable and can be set up from the
  configuration of the broker.
- **yaml-cpp** -- Yaml-cpp is used for parsing the broker configuration file,
  which is in YAML format.
- **boost-filesystem** -- Boost filesystem is used for managing the logging
  directory (creating it if necessary) and for parsing filesystem paths from
  strings in the configuration of the broker. Filesystem operations will be
  included in future releases of the C++ standard, so this dependency may be
  removed in the future.
- **boost-program_options** -- Boost program options is used for parsing
  command line arguments. The POSIX `getopt` C function could be used instead,
  but we decided to use Boost, which provides a nicer API and is already used
  by the worker component.
|
|
|
|
|
|
|
|
|
## Fileserver
|
|
|
|
|
|
The fileserver component provides shared file storage between the frontend and
the backend. It is written in Python 3 using the Flask web framework. The
fileserver stores files in a configurable filesystem directory and provides
file deduplication and HTTP access. To keep the stored data safe, the
fileserver should not be visible from the public internet. Instead, it should
be accessed indirectly through the REST API.
|
|
|
|
|
|
### File Deduplication
|
|
|
|
|
|
From our analysis of the requirements, it is certain we need to implement a
|
|
|
means of dealing with duplicate files.
|
|
|
|
|
|
File deduplication is implemented by storing files under the hashes of their
content. This procedure is done entirely inside the fileserver. Plain files
are uploaded to the fileserver, hashed and saved, and the new filename is
returned to the uploader.
|
|
|
|
|
|
SHA1 is used as the hashing function, because it is fast to compute and
provides reasonable collision safety for non-cryptographic purposes. Files
with the same hash are treated as identical; no additional checks for
collisions are performed, although finding one is extremely unlikely. If SHA1
proves insufficient, it is possible to change the hash function to something
else, because the naming strategy is fully contained in the fileserver
(special care must be taken to maintain backward compatibility).
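The scheme can be sketched in a few lines of Python (the function name is
illustrative; the path layout follows the storage structure described below):

```
import hashlib
import os

def store_file(storage_root, content):
    """Store `content` under its SHA1 hash and return the new file name.

    Files with identical content map to the same path, which gives us
    deduplication for free.
    """
    key = hashlib.sha1(content).hexdigest()
    directory = os.path.join(storage_root, "exercises", key[0])
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, key)
    if not os.path.exists(path):  # a duplicate upload is a no-op
        with open(path, "wb") as f:
            f.write(content)
    return key
```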
|
|
|
|
|
|
### Storage Structure
|
|
|
|
|
|
The fileserver stores its data in the following structure:
|
|
|
|
|
|
- `./submissions/<id>/` -- folder that contains files submitted by users
  (student solutions to assignments). `<id>` is an identifier received from
  the REST API.
- `./submission_archives/<id>.zip` -- ZIP archives of all submissions. These
  are created automatically when a submission is uploaded. `<id>` is the
  identifier of the corresponding submission.
- `./exercises/<subkey>/<key>` -- supplementary exercise files (e.g. test
  inputs and outputs). `<key>` is a hash of the file content (`sha1` is used)
  and `<subkey>` is its first character (this is an attempt to avoid creating
  a flat directory structure).
- `./results/<id>.zip` -- ZIP archives of the results for the submission with
  identifier `<id>`.
|
|
|
|
|
|
|
|
|
## Worker
|
|
|
|
|
|
The job of the worker is to securely execute a job according to its
configuration and upload the results for later processing. After receiving an
evaluation request, the worker has to do the following:
|
|
|
|
|
|
- download the archive containing the submitted source files and the
  configuration file
- download any supplementary files based on the configuration file, such as
  test inputs or helper programs (this is done on demand, using a `fetch`
  command in the assignment configuration)
- evaluate the submission according to the job configuration
- send progress messages back to the broker during the evaluation
- upload the results of the evaluation to the fileserver
- notify the broker that the evaluation has finished
|
|
|
|
|
|
### Internal Structure
|
|
|
|
|
|
Worker is logically divided into two parts:
|
|
|
|
|
|
- **Listener** -- communicates with broker through ZeroMQ. On startup, it
|
|
|
introduces itself to the broker. Then it receives new jobs, passes them to
|
|
|
the evaluator part and sends back results and progress reports.
|
|
|
- **Evaluator** -- gets jobs from the listener part, evaluates them (possibly
  in a sandbox) and notifies the other part when the evaluation ends. The
  evaluator also communicates with the fileserver, downloading supplementary
  files and uploading detailed results.
|
|
|
|
|
|
These parts run in separate threads of the same process and communicate
through ZeroMQ in-process sockets. An alternative approach would be a shared
memory region with exclusive access, but messaging is generally considered
safer. Shared memory has to be used very carefully because of race conditions
when reading and writing concurrently. Also, the messages inside the worker
are small, so copying data between threads adds no significant overhead. This
multi-threaded design allows the worker to keep sending `ping` messages even
when it is processing a job.
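A minimal Python sketch of this arrangement, using ZeroMQ PAIR sockets over
the `inproc` transport (the worker itself is written in C++; the names, URLs
and frame contents below are illustrative):

```
import threading
import zmq

context = zmq.Context.instance()

# The listener side binds first, so the in-process endpoint exists
# before the evaluator thread connects to it.
listener_end = context.socket(zmq.PAIR)
listener_end.bind("inproc://jobs")

def evaluator():
    jobs = context.socket(zmq.PAIR)
    jobs.connect("inproc://jobs")
    while True:
        frames = jobs.recv_multipart()  # [b"eval", job_id, job_url, result_url]
        job_id = frames[1]
        # ... download files, run the job in the sandbox, upload results ...
        jobs.send_multipart([b"done", job_id])

threading.Thread(target=evaluator, daemon=True).start()

# The listener stays responsive (ping, progress) while the job runs
# in the evaluator thread.
listener_end.send_multipart([b"eval", b"42", b"http://...", b"http://..."])
print(listener_end.recv_multipart())  # -> [b"done", b"42"]
```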
|
|
|
|
|
|
### Capability Identification
|
|
|
|
|
|
There are possibly multiple worker instances in a ReCodEx installation and each
|
|
|
one can run on different hardware, operating system, or have different tools
|
|
|
installed. To identify the hardware capabilities of a worker, we use the concept
|
|
|
of **hardware groups**. Each worker belongs to exactly one group that specifies
|
|
|
the hardware and operating system on which the submitted programs will be run. A
|
|
|
worker also has a set of additional properties called **headers**. Together they
|
|
|
help the broker to decide which worker is suitable for processing a job
|
|
|
evaluation request. This information is sent to the broker on worker startup.
|
|
|
|
|
|
The hardware group is a string identifier of the hardware configuration, for
|
|
|
example "i7-4560-quad-ssd-linux" configured by the administrator for each worker
|
|
|
instance. If this is done correctly, performance measurements of a submission
|
|
|
should yield the same results on all computers from the same hardware group.
|
|
|
Thanks to this fact, we can use the same resource limits on every worker in a
|
|
|
hardware group.
|
|
|
|
|
|
The headers are a set of key-value pairs that describe the worker capabilities.
|
|
|
For example, they can show which runtime environments are installed or whether
|
|
|
this worker measures time precisely. Headers are also configured manually by an
|
|
|
administrator.
|
|
|
|
|
|
### Running Student Submissions
|
|
|
|
|
|
Student submissions are executed in a sandbox environment to prevent them from
damaging the host system and also to restrict the amount of used resources.
Currently, only the Isolate sandbox is supported, but it is possible to add
support for other sandboxes.
|
|
|
|
|
|
Every sandbox, regardless of the concrete implementation, has to be a command
line application that takes parameters with arguments and reads its input from
standard input or a file. Outputs should be written to a file or to standard
output. There are no other requirements; the design of the worker is very
versatile and can be adapted to different needs.
|
|
|
|
|
|
The sandbox part of the worker is the only one which is not portable, so
conditional compilation is used to include only the supported parts of the
project. Isolate does not work on Windows, and its invocation is done through
native Linux calls (`fork`, `exec`). To disable compilation of this part on
Windows, the `#ifndef _WIN32` guard is used around the affected files.
|
|
|
|
|
|
Isolate in particular is executed in a separate Linux process created by the
`fork` and `exec` system calls. Communication between the processes is
performed through an unnamed pipe, with redirection of the standard input and
output descriptors. To guard against Isolate failures there is another safety
measure -- the whole sandbox is killed when it does not finish within
`(time + 300) * 1.2` seconds, where `time` is the original maximum time
allowed for the task. This formula works well both for short and long tasks,
but the timeout should never be reached if Isolate works properly -- it should
always terminate itself in time.
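Expressed as code, the safety guard computes the hard deadline as a direct
transcription of the formula above:

```
def watchdog_timeout(time_limit):
    """Hard kill deadline for the whole sandbox, in seconds.

    `time_limit` is the maximum time allowed for the task; the added
    constant covers fixed startup overhead and the factor adds slack
    proportional to the limit itself.
    """
    return (time_limit + 300) * 1.2
```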
|
|
|
|
|
|
### Directories and Files
|
|
|
|
|
|
During job execution the worker has to handle several files -- the input
archive with the submitted sources and the job configuration, temporary files
generated during execution, and fetched test inputs and outputs. A separate
directory structure is created for each job and removed after the job
finishes.
|
|
|
|
|
|
The files are stored in the local filesystem of the worker machine, in a
configurable location. A job is not restricted to the specified directories
(tasks can do anything that is allowed by the system), but it is advised not
to write outside them. In addition, sandboxed tasks are usually restricted to
a specific (evaluation) directory.
|
|
|
|
|
|
The following directory structure is used for execution. The working directory
|
|
|
of the worker (root of the following paths) is shared for multiple instances on
|
|
|
the same computer.
|
|
|
|
|
|
- `downloads/${WORKER_ID}/${JOB_ID}` -- place to store the downloaded archive
|
|
|
with submitted sources and job configuration
|
|
|
- `submission/${WORKER_ID}/${JOB_ID}` -- place to store a decompressed
|
|
|
submission archive
|
|
|
- `eval/${WORKER_ID}/${JOB_ID}` -- place where all the execution should happen
|
|
|
- `temp/${WORKER_ID}/${JOB_ID}` -- place for temporary files
|
|
|
- `results/${WORKER_ID}/${JOB_ID}` -- place to store all files which will be
  uploaded to the fileserver, usually only the YAML result file and optionally
  a log file; other files have to be explicitly copied here if requested
|
|
|
|
|
|
Some of the directories are accessible from within the sandbox during job
execution through predefined variables. These are listed in the job
configuration appendix.
|
|
|
|
|
|
### Judges
|
|
|
|
|
|
ReCodEx provides a few initial judge programs. They are mostly adopted from
CodEx and installed automatically with the worker component. Judging programs
have to meet some requirements. The basic ones are inspired by the standard
`diff` application -- two mandatory positional parameters, which are the files
to compare, and an exit code reflecting whether the result is correct (0) or
wrong (1).
|
|
|
|
|
|
This interface lacks support for returning additional data by the judges, for
|
|
|
example similarity of the two files calculated as the Levenshtein edit distance.
|
|
|
To allow passing these additional values an extended judge interface can be
|
|
|
implemented:
|
|
|
|
|
|
- Parameters: there are two mandatory positional parameters, which are the
  files to compare
- Results:
    - _comparison OK_
        - exitcode: 0
        - stdout: a single line with a double value which should be the
          quality percentage of the judged file
    - _comparison BAD_
        - exitcode: 1
        - stdout: can be empty
    - _error during execution_
        - exitcode: 2
        - stderr: should contain a description of the error
|
|
|
|
|
|
The additional double value is saved to the results file and can be used for
|
|
|
score calculation in the frontend. If just the basic judge is used, the values
|
|
|
are 1.0 for exit code 0 and 0.0 for exit code 1.
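A minimal judge honoring the extended interface could look like the following
Python sketch (real judges are installed with the worker; this one only checks
for exact equality of the two files):

```
#!/usr/bin/env python3
"""Minimal judge following the extended interface described above."""
import sys

def main():
    try:
        with open(sys.argv[1]) as a, open(sys.argv[2]) as b:
            expected, actual = a.read(), b.read()
    except (IndexError, OSError) as error:
        print(error, file=sys.stderr)  # error during execution
        return 2
    if expected == actual:
        print(1.0)                     # quality percentage on stdout
        return 0                       # comparison OK
    return 1                           # comparison BAD, stdout may stay empty

if __name__ == "__main__":
    sys.exit(main())
```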
|
|
|
|
|
|
If more values are needed for score computation, multiple judges can be used
in sequence and the values used together. However, the extended judge
interface should cover most of the possible use cases.
|
|
|
|
|
|
### Additional Libraries
|
|
|
|
|
|
Worker implementation depends on several open-source C and C++ libraries. All of
|
|
|
them are multi-platform, so both Linux and Windows builds are possible.
|
|
|
|
|
|
- **libcurl** -- Libcurl is used for all HTTP communication, that is,
  downloading and uploading files. Due to the lack of documentation of the C++
  bindings, the plain C API is used.
- **libarchive** -- Libarchive is used for compressing and extracting
  archives. The actually supported formats depend on the packages installed on
  the target system, but at least ZIP and TAR.GZ should be available.
- **cppzmq** -- Cppzmq is a simple C++ wrapper of the core ZeroMQ C API. It
  basically contains only one header file, but its API fits into the object
  architecture of the worker.
- **spdlog** -- Spdlog is a small, fast and modern logging library. It is used
  for all of the logging, both system and job logs. It is highly customizable
  and can be set up from the configuration of the worker.
- **yaml-cpp** -- Yaml-cpp is used for parsing and creating text files in YAML
  format. That includes the configuration of the worker and the configuration
  and results of a job.
- **boost-filesystem** -- Boost filesystem is used for multi-platform
  manipulation with files and directories. However, these operations will be
  included in future releases of the C++ standard, so this dependency may be
  removed in the future.
- **boost-program_options** -- Boost program options is used for
  multi-platform parsing of command line arguments. It is not strictly
  necessary -- similar functionality could be implemented by ourselves -- but
  this well known library is effortless to use.
|
|
|
|
|
|
## Monitor
|
|
|
|
|
|
Monitor is an optional part of the ReCodEx solution that reports the progress
of job evaluation back to users in real time. It is written in Python; the
tested versions are 3.4 and 3.5. The following dependencies are used:
|
|
|
|
|
|
- **zmq** -- binding to ZeroMQ message framework
|
|
|
- **websockets** -- framework for communication over WebSockets
|
|
|
- **asyncio** -- library for fast asynchronous operations
|
|
|
- **pyyaml** -- parsing YAML configuration files
|
|
|
|
|
|
Just one monitor instance is required per broker. The monitor has to be
publicly visible (it must have a public IP address or sit behind a public
proxy server) and it also needs a connection to the broker. If the web
application is served over HTTPS, a proxy providing encryption over WebSockets
is required for the monitor. If this is not done, the browsers of the users
will block the unencrypted connection and will not show the progress.
|
|
|
|
|
|
### Message Flow
|
|
|
|
|
|
![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
|
|
|
|
|
|
The monitor runs in two threads. _Thread 1_ is the main thread; it initializes
all components (the logger, for example), starts the other thread and runs the
ZeroMQ part of the application. This thread receives and parses incoming
messages from the broker and forwards them to the sending logic of _thread 2_.
|
|
|
|
|
|
_Thread 2_ is responsible for managing all WebSocket connections
asynchronously. The whole thread is one big _asyncio_ event loop through which
all actions are processed. None of the custom data types in Python are
thread-safe, so all events from other threads (actually only `send_message`
method invocations) must be called within the event loop (via the
`asyncio.loop.call_soon_threadsafe` function). Please note that most Python
interpreters use the [Global
Interpreter Lock](https://wiki.python.org/moin/GlobalInterpreterLock), so
there is no real parallelism from the performance point of view, but proper
synchronization is still required.
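The handoff between the two threads can be sketched as follows (the frame
layout and function names are illustrative, not the actual monitor code):

```
import asyncio

loop = asyncio.new_event_loop()  # owned and run by thread 2

def send_message(channel_id, json_message):
    """Runs inside the event loop: hand the message over to the
    WebSocket sending logic of the given channel."""

def zeromq_part(socket):
    """Thread 1: receive progress messages from the broker and pass
    them on to the event loop of thread 2."""
    while True:
        # an illustrative frame layout: [channel_id, json_payload]
        channel_id, json_message = socket.recv_multipart()
        # asyncio structures are not thread-safe, so the only safe
        # entry point from another thread is call_soon_threadsafe
        loop.call_soon_threadsafe(
            send_message, channel_id.decode(), json_message.decode())
```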
|
|
|
|
|
|
### Handling of Incoming Messages
|
|
|
|
|
|
An incoming ZeroMQ progress message is received and parsed into JSON format
(the same format used for our WebSocket communication). The JSON string is
then passed to _thread 2_ for asynchronous sending. Each message carries an
identifier of the channel where it should be sent.
|
|
|
|
|
|
There can be multiple receivers on one channel id. Each one has a separate
_asyncio.Queue_ instance to which new messages are added. In addition, there
is one list of all messages per channel. If a client connects a bit later than
the point when the monitor starts to receive messages, it will still receive
all messages from the beginning. Messages are stored for 5 minutes after the
last progress command (normally FINISHED) is received, then they are
permanently deleted. This caching mechanism was implemented because early
testing showed that the first couple of messages are missed quite often.
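A simplified model of this caching mechanism (the actual monitor code may
differ in details such as how the deletion timer is scheduled):

```
import asyncio

channels = {}     # channel_id -> list of every message seen so far
subscribers = {}  # channel_id -> set of per-client asyncio.Queue

def send_message(channel_id, message):
    """Called inside the event loop for every incoming message."""
    channels.setdefault(channel_id, []).append(message)
    for queue in subscribers.get(channel_id, ()):
        queue.put_nowait(message)
    # a timer deleting the channel 5 minutes after the last progress
    # command (normally FINISHED) would be (re)scheduled here

async def subscribe(channel_id):
    """Register a client; replay the history first, so that clients
    connecting late still see the messages from the beginning."""
    queue = asyncio.Queue()
    subscribers.setdefault(channel_id, set()).add(queue)
    for message in channels.get(channel_id, []):
        queue.put_nowait(message)
    return queue
```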
|
|
|
|
|
|
Messages from the queue of a client are sent through the corresponding
WebSocket connection via the main event loop as soon as possible. This
approach with a separate queue per connection is easy to implement and
guarantees reliability and order of message delivery.
|
|
|
|
|
|
## Cleaner
|
|
|
|
|
|
The cleaner component is tightly bound to the worker. It manages the cache
folder of the worker, mainly by deleting outdated files. Every cleaner
instance maintains one cache folder, which can be used by multiple workers.
This means that on one server there can be numerous worker instances sharing
the same cache folder, but there should be only one cleaner instance.
|
|
|
|
|
|
The cleaner is written in Python 3, so it works well on multiple platforms. It
uses only the `pyyaml` library for reading the configuration file and the
`argparse` library for processing command line arguments.
|
|
|
|
|
|
It is a simple script which checks the cache folder, possibly deletes old
files and then exits. This means that the cleaner has to be run repeatedly,
for example using cron, a systemd timer or the Windows task scheduler. For the
cleaner to function properly, a suitable execution interval has to be used. A
24 hour interval is recommended and sufficient for the intended usage. The
value is set in the configuration file of the cleaner.
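The core of such a script can be sketched in a few lines (the exact age
criterion comes from the cleaner configuration; this sketch uses file
modification times and an illustrative function name):

```
import os
import time

def clean_cache(cache_dir, max_age_hours):
    """Delete cached files older than the configured threshold."""
    threshold = time.time() - max_age_hours * 3600
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < threshold:
            os.remove(path)
```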
|
|
|
|
|
|
## REST API
|
|
|
|
|
|
The REST API is a PHP application run in an HTTP server. Its purpose is to
provide controlled access to the evaluation backend and to store the state of
the application.
|
|
|
|
|
|
### Used Technologies
|
|
|
|
|
|
We chose to use PHP in version 7.0, which was the most recent version at the
|
|
|
time of starting the project. The most notable new feature is optional static
|
|
|
typing of function parameters and return values. We use this as much as possible
|
|
|
to enable easy static analysis with tools like PHPStan. Using static analysis
|
|
|
leads to less error-prone code that does not need as many tests as code that
|
|
|
uses duck typing and relies on automatic type conversions. We aim to keep our
|
|
|
codebase compatible with new releases of PHP.
|
|
|
|
|
|
To speed up the development and to make it easier to follow best practices, we
|
|
|
decided to use the Nette framework. The framework itself is focused on creating
|
|
|
applications that render HTML output, but a lot of its features can be used in a
|
|
|
REST application, too.
|
|
|
|
|
|
Doctrine 2 ORM is used to provide a layer of abstraction over storing objects in
|
|
|
a database. This framework also makes it possible to change the database server.
|
|
|
The current implementation uses MariaDB, an open-source fork of MySQL.
|
|
|
|
|
|
To communicate with the evaluation backend, we need to use ZeroMQ. This
|
|
|
functionality is provided by the `php_zmq` plugin that is shipped with most PHP
|
|
|
distributions.
|
|
|
|
|
|
### Data model
|
|
|
|
|
|
We decided to use a code-first approach when designing our data model. This
approach is greatly aided by the Doctrine 2 ORM framework, which works with
entities -- PHP classes for which we specify which attributes should be
persisted in a database. The database schema is generated from the entity
classes. This way, the exact details of how our data is stored are a secondary
concern for us and we can focus on the implementation of the business logic
instead.
|
|
|
|
|
|
The rest of this section is a description of our data model and how it relates
|
|
|
to the real world. All entities are stored in the `App\Model\Entity` namespace.
|
|
|
There are repository classes that are used to work with entities without calling
|
|
|
the Doctrine `EntityManager` directly. These are in the `App\Model\Repository`
|
|
|
namespace.
|
|
|
|
|
|
#### User Account Management
|
|
|
|
|
|
The `User` entity class contains data about users registered in ReCodEx. To
|
|
|
allow extending the system with additional authentication methods, login details
|
|
|
are stored in separate entities. There is the `Login` entity class which
|
|
|
contains a user name and password for our internal authentication system, and
|
|
|
the `ExternalLogin` entity class, which contains an identifier for an external
|
|
|
login service such as LDAP. Currently, each user can only have a single
|
|
|
authentication method (account type). The entity with login information is
|
|
|
created along with the `User` entity when a user signs up. If a user requests a
|
|
|
password reset, a `ForgottenPassword` entity is created for the request.
|
|
|
|
|
|
A user needs a way to adjust settings such as their preferred language or
theme. This is the purpose of the `UserSettings` entity class. Each possible
option has its own attribute (database column). Currently supported options
are `darkTheme`, `defaultLanguage` and `vimMode`.
|
|
|
|
|
|
Every user has a role in the system. The basic ones are student, supervisor and
|
|
|
administrator, but new roles can be created by adding `Role` entities. Roles can
|
|
|
have permissions associated with them. These associations are represented by
|
|
|
`Permission` entities. Each permission consists of a role, resource, action and
|
|
|
an `isAllowed` flag. If the `isAllowed` flag is set to true, the permission is
|
|
|
positive (lets the role access the resource), and if it is false, it denies
|
|
|
access. The `Resource` entity contains just a string identifier of a resource
|
|
|
(e.g., group, user, exercise). Action is another string that describes what the
|
|
|
permission allows or denies for the role and resource (e.g., edit, delete,
|
|
|
view).
|
|
|
|
|
|
The `Role` entity can be associated with a parent entity. If this is the case,
|
|
|
the role inherits all the permissions of its parent.
|
|
|
|
|
|
All actions done by a user are logged using the `UserAction` entity for
|
|
|
debugging purposes.
|
|
|
|
|
|
#### Instances and Groups
|
|
|
|
|
|
Users of ReCodEx are divided into groups that correspond to school lab groups
|
|
|
for a single course. Each group has a textual name and description. It can have
|
|
|
a parent group so that it is possible to create tree hierarchies of groups.
|
|
|
|
|
|
Group membership is realized using the `GroupMembership` entity class. It is a
|
|
|
joining entity for the `Group` and `User` entities, but it also contains
|
|
|
additional information, most importantly `type`, which helps to distinguish
|
|
|
students from group supervisors.
|
|
|
|
|
|
Groups are organized into instances. Every `Instance` entity corresponds to an
organization that uses the ReCodEx installation, for example a university or a
company that organizes programming workshops. Every user and every group
belongs to exactly one instance (users choose an instance when they create
their account).
|
|
|
Every instance can be associated with multiple `Licence` entities. Licences
are used to determine whether an instance can currently be used (access to
instances without a valid licence will be denied). They can correspond to
billing periods if needed.
|
|
|
|
|
|
#### Exercises
|
|
|
|
|
|
The `Exercise` entity class is used to represent exercises -- programming tasks
|
|
|
that can be assigned to student groups. It contains data that does not relate to
|
|
|
this "concrete" assignment, such as the name, version and a private description.
|
|
|
|
|
|
Some exercise descriptions need to be translated into multiple languages.
|
|
|
Because of this, the `Exercise` entity is associated with the
|
|
|
`LocalizedText` entity, one for each translation of the text.
|
|
|
|
|
|
An exercise can support multiple programming runtime environments. These
|
|
|
environments are represented by `RuntimeEnvironment` entities. Apart from a name
|
|
|
and description, they contain details of the language and operating system that
|
|
|
is being used. There is also a list of extensions that is used for detecting
|
|
|
which environment should be used for student submissions.
|
|
|
|
|
|
`RuntimeEnvironment` entities are not linked directly to exercises. Instead,
|
|
|
the `Exercise` entity has an M:N relation with the `RuntimeConfig` entity,
|
|
|
which is associated with `RuntimeEnvironment`. It also contains a path to a job
|
|
|
configuration file template that will be used to create a job configuration file
|
|
|
for the worker that processes solutions of the exercise.
|
|
|
|
|
|
Resource limits are stored outside the database, in the job configuration file
|
|
|
template.
|
|
|
|
|
|
#### Reference Solutions
|
|
|
|
|
|
To make setting resource limits objectively possible for a potentially diverse
|
|
|
set of worker machines, there should be multiple reference solutions for every
|
|
|
exercise in all supported languages that can be used to measure resource usage
|
|
|
of different approaches to the problem on various hardware and platforms.
|
|
|
|
|
|
Reference solutions are contained in `ReferenceSolution` entities. These
|
|
|
entities can have multiple `ReferenceSolutionEvaluation` entities associated
|
|
|
with them that link to evaluation results (`SolutionEvaluation` entity). Details
|
|
|
of this structure will be described in the section about student solutions.
|
|
|
|
|
|
Source codes of the reference solutions can be accessed using the `Solution`
|
|
|
entity associated with `ReferenceSolution`. This entity is also used for student
|
|
|
submissions.
|
|
|
|
|
|
#### Assignments
|
|
|
|
|
|
The `Assignment` entity is created from an `Exercise` entity when an exercise is
|
|
|
assigned to a group. Most details of the exercise can be overwritten (see the
|
|
|
reference documentation for a detailed overview). Additional information such as
|
|
|
deadlines or point values for individual tests is also configured for the
|
|
|
assignment and not for an exercise.
|
|
|
|
|
|
Assignments can also have their own `LocalizedText` entities. If the
|
|
|
assignment texts are not changed, they are shared between the exercise and its
|
|
|
assignment.
|
|
|
|
|
|
Runtime configurations can be also changed for the assignment. This way, a
|
|
|
supervisor can for example alter the resource limits for the tests. They could
|
|
|
also alter the way submissions are evaluated, which is discouraged.
|
|
|
|
|
|
#### Student Solutions
|
|
|
|
|
|
Solutions submitted by students are represented by the `Submission` entity. It
contains data such as when and by whom the solution was submitted. There is
also a timestamp, a note for the supervisor and a URL of the location where
the evaluation results should be stored.
|
|
|
|
|
|
However, the most important part of a submission are the source files. These are
|
|
|
stored using the `SolutionFile` entity and they can be accessed through the
|
|
|
`Solution` entity, which is associated with `Submission`.
|
|
|
|
|
|
When the evaluation is finished, the results are stored using the
|
|
|
`SolutionEvaluation` entity. This entity can have multiple `TestResult` entities
|
|
|
associated with it, which describe the result of a test and also contain
|
|
|
additional information for failing tests (such as which limits were exceeded).
|
|
|
Every `TestResult` can contain multiple `TaskResult` entities that provide
|
|
|
details about the results of individual tasks. This reflects the fact that
|
|
|
"tests" are just logical groups of tasks.
|
|
|
|
|
|
#### Comment Threads
|
|
|
|
|
|
The `Comment` entity contains the author of the comment, a date and the text of
|
|
|
the comment. In addition to this, there is a `CommentThread` entity associated
|
|
|
with it that groups comments on a single entity (such as a student submission).
|
|
|
This enables easily adding support for comments to various entities -- it is
|
|
|
enough to add an association with the `CommentThread` entity. An even simpler
|
|
|
way is to just use the identifier of the commented entity as the identifier of
|
|
|
the comment thread, which is how submission comments are implemented.
|
|
|
|
|
|
#### Uploaded Files
|
|
|
|
|
|
Uploaded files are stored directly on the filesystem instead of in the database.
|
|
|
The `UploadedFile` entity is used to store their metadata. This entity is
|
|
|
extended by `SolutionFile` and `ExerciseFile` using the Single Table Inheritance
|
|
|
pattern provided by Doctrine. Thanks to this, we can access all files uploaded
|
|
|
on the API through the same repository while also having data related to e.g.,
|
|
|
supplementary exercise files present only in related objects.
|
|
|
|
|
|
### Request Handling
|
|
|
|
|
|
A typical scenario for handling an API request is matching the HTTP request
with a corresponding handler routine, which creates a response object that is
then sent back to the client, encoded as JSON. The `Nette\Application` package
can be used to achieve this with Nette, although it is meant to be used mainly
in MVP applications.
|
|
|
|
|
|
Matching HTTP requests with handlers can be done using standard Nette URL
|
|
|
routing -- we will create a Nette route for each API endpoint. Using the routing
|
|
|
mechanism from Nette logically leads to implementing handler routines as Nette
|
|
|
Presenter actions. Each presenter should serve logically related endpoints.
|
|
|
|
|
|
The last step is encoding the response as JSON. In `Nette\Application`, HTTP
|
|
|
responses are returned using the `Presenter::sendResponse()` method. We decided
|
|
|
to write a method that calls `sendResponse` internally and takes care of the
|
|
|
encoding. This method has to be called in every presenter action. An alternative
|
|
|
approach would be using the internal payload object of the presenter, which is
|
|
|
more convenient, but provides us with less control.
|
|
|
|
|
|
### Authentication
|
|
|
|
|
|
Instead of relying on PHP sessions, we decided to use an authentication flow
|
|
|
based on JWT tokens (RFC 7519). On successful login, the user is issued an
|
|
|
access token that they have to send with subsequent requests using the HTTP
|
|
|
Authorization header (Authorization: Bearer <token>). The token has a limited
|
|
|
validity period and has to be renewed periodically using a dedicated API
|
|
|
endpoint.
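The token flow can be illustrated with the following Python sketch using the
PyJWT library (the API itself implements this in PHP with the Firebase JWT
library; the key, validity period and claims below are illustrative):

```
import time
import jwt  # PyJWT, standing in for the PHP Firebase JWT library

SECRET = "server-side-signing-key"  # illustrative; never exposed to clients

def issue_token(user_id, validity_seconds=3600):
    """Create a signed token with a limited validity period."""
    now = int(time.time())
    payload = {"sub": user_id, "iat": now, "exp": now + validity_seconds}
    return jwt.encode(payload, SECRET, algorithm="HS256")

def verify(authorization_header):
    """Check the `Authorization: Bearer <token>` header of a request."""
    scheme, _, token = authorization_header.partition(" ")
    if scheme != "Bearer":
        raise ValueError("unsupported authorization scheme")
    # raises jwt.ExpiredSignatureError once the validity period is over,
    # which is the client's cue to renew the token via the dedicated endpoint
    return jwt.decode(token, SECRET, algorithms=["HS256"])
```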
|
|
|
|
|
|
To implement this behavior in Nette framework, a new IUserStorage implementation
|
|
|
was created (`App\Security\UserStorage`), along with an IIdentity and
|
|
|
authenticators for both our internal login service and CAS. The authenticators
|
|
|
are not registered in the DI container, they are invoked directly instead. On
|
|
|
successful authentication, the returned `App\Security\Identity` object is stored
|
|
|
using the `Nette\Security\User::login()` method. The user storage service works
|
|
|
with the HTTP request to extract the access token if possible.
|
|
|
|
|
|
The logic of issuing tokens is contained in the `App\Security\AccessManager`
|
|
|
class. Internally, it uses the Firebase JWT library.
|
|
|
|
|
|
The authentication flow is contained in the `LoginPresenter` class, which serves
|
|
|
the `/login` endpoint group.
|
|
|
|
|
|
An advantage of this approach is being able to control the authentication
process completely, instead of just receiving session data through a global
variable.
|
|
|
|
|
|
### Accessing Endpoints
|
|
|
|
|
|
The REST API has a [generated documentation](https://recodex.github.io/api/)
|
|
|
describing detailed format of input values as well as response structures
|
|
|
including samples.
|
|
|
|
|
|
Knowing the exact format of the endpoints allows interacting directly with the
API using any available REST client, for example `curl` or the `Postman`
Chrome extension. However, there is also a generated [REST
client](https://recodex.github.io/api/ui.html) built directly for the ReCodEx
API structure using the Swagger UI tool. For each endpoint there is a form
with boxes for all the input parameters, including their descriptions and data
types. The responses are shown as highlighted JSON. Authorization can be set
for the whole session at once using the "Authorize" button at the top of the
page.
|
|
|
|
|
|
### Permissions
|
|
|
|
|
|
A system storing user data has to implement some kind of permission checking.
Each user has a role, which corresponds to his/her privileges. Our research
showed that three roles are sufficient -- student, supervisor and
administrator. The user role has to be checked with every request. Fortunately,
the roles nicely match the granularity of the API endpoints, so the permission
check can be done at the beginning of each request. It is implemented using
PHP annotations, which allow specifying the allowed user roles for each
request with very little code, keeping this logic together in one place.
|
|
|
|
|
|
However, roles cannot cover all cases. For example, if a user is a supervisor,
the role relates only to the groups where he/she is a supervisor, but using
only roles would allow him/her to act as a supervisor in all groups in the
system. Unfortunately, this cannot be easily fixed using annotations, because
there are many different cases where this problem occurs. To fix it, some
additional checks can be performed at the beginning of request processing.
Usually it is only one or two simple conditions.
|
|
|
|
|
|
With these two concepts combined, it is possible to easily cover all cases of
permission checking with quite a small amount of code.
|
|
|
|
|
|
### Uploading Files
|
|
|
|
|
|
There are two cases when users need to upload files using the API -- submitting
|
|
|
solutions to an assignment and creating a new exercise. In both of these cases,
|
|
|
the final destination of the files is the fileserver. However, the fileserver is
|
|
|
not publicly accessible, so the files have to be uploaded through the API.
|
|
|
|
|
|
Each file is uploaded separately and is given a unique ID. The uploaded file
|
|
|
can then be attached to an exercise or a submitted solution of an exercise.
|
|
|
Storing and removing files from the server is done through the
|
|
|
`App\Helpers\UploadedFileStorage` class which maps the files to their records
|
|
|
in the database using the `App\Model\Entity\UploadedFile` entity.
|
|
|
|
|
|
### Forgotten Password
|
|
|
|
|
|
When a user finds out that he/she does not remember the password, he/she
requests a password reset and fills in his/her unique email address. A
temporary access token is generated for the user corresponding to the given
email address and sent to that address, encoded in a URL leading to the client
application. The user then follows the URL and can choose a new password.
|
|
|
|
|
|
The temporary token is generated and emailed by the
|
|
|
`App\Helpers\ForgottenPasswordHelper` class which is registered as a service
|
|
|
and can be injected into any presenter.
|
|
|
|
|
|
This solution is quite safe and the user can handle it on his/her own, so the
administrator does not have to worry about it.
|
|
|
|
|
|
### Job Configuration Parsing and Modifying
|
|
|
|
|
|
The job configuration file can be loaded into the corresponding internal
structures in the API as well. This is necessary because it has to be possible
to modify particular job details, such as the job identification or the
fileserver address, during submission.
|
|
|
|
|
|
The whole codebase concerning the job configuration is present in the
`App\Helpers\JobConfig` namespace. The job configuration is represented by the
`JobConfig` class, which directly contains structures like `SubmissionHeader`
or `Tasks\Task` and indirectly `SandboxConfig`, `JobId` and more. All these
classes have a parameterless constructor which should set all values to their
defaults or construct the appropriate classes.
|
|
|
|
|
|
Modifying values in the configuration classes is possible through *fluent
interfaces* and *setters*. Getting values is also possible and all setters
should have *get* counterparts. The job configuration is serialized through
`__toString()` methods.
|
|
|
|
|
|
For loading the job configuration there is a separate `Storage` class which
can be used for loading, saving or archiving job configurations. For parsing,
the storage uses the `Loader` class, which performs all the checks and loads
the data from given strings into the appropriate structures. In case of a
parser error, an `App\Exceptions\JobConfigLoadingException` is thrown.
|
|
|
|
|
|
Also worth mentioning is the `App\Helpers\UploadedJobConfigStorage` class,
which takes care of where the uploaded job configuration files should be saved
on the API filesystem. It can also be used for copying all job configurations
when an exercise is assigned.
|
|
|
|
|
|
### Solution Loading
|
|
|
|
|
|
When a solution evaluation is finished by the backend, the results are saved to
|
|
|
the fileserver and the API is notified by the broker. The results are parsed and
|
|
|
stored in the database.
|
|
|
|
|
|
For the results of the evaluations of reference solutions and for
asynchronously evaluated solutions of students (e.g., those resubmitted by an
administrator), the result is processed right after the notification from the
backend is received, and the author of the solution is notified by an email
after the results are processed.
|
|
|
|
|
|
When a student submits his/her solution directly through the client
application, we do not parse the results right away but postpone this until
the student (or a supervisor) wants to display the results for the first time.
This may save some resources when the solution results are not important
(e.g., the student finds a bug in the solution before the submission has been
evaluated).
|
|
|
|
|
|
#### Parsing of the Results
|
|
|
|
|
|
The results are stored in a YAML file. We map the contents of the file to the
|
|
|
classes of the `App\Helpers\EvaluationResults` namespace. This process
|
|
|
validates the file and gives us access to all of the information through
|
|
|
an interface of a class and not only using associative arrays. This is very
|
|
|
similar to how the job configuration files are processed.
|
|
|
|
|
|
|
|
|
## Web Application
|
|
|
|
|
|
The whole project is written using the next generation of JavaScript referred
to as *ECMAScript 6* (also known as *ES6*, *ES.next*, or *Harmony*). Since not
all of the features introduced in this standard are implemented in today's
modern web browsers (like classes and the spread operator) and hardly any are
implemented in the older versions of web browsers which are currently still in
use, the source code is transpiled into the older standard *ES5* using the
[Babel.js](https://babeljs.io/) transpiler and bundled into a single script
file using the [webpack](https://webpack.github.io/) module bundler. The need
for a transpiler also arises from the usage of the *JSX* syntax for declaring
React components. To read more about these tools and their usage please refer
to the [installation documentation](#Installation). The whole bundling process
takes place at deployment and is not repeated afterwards when running in
production.
|
|
|
|
|
|
### State Management
|
|
|
|
|
|
The web application is a SPA (Single Page Application). When the user accesses
the page, the source code is downloaded and interpreted by the web browser.
The communication between the browser and the server then runs in the
background without reloading the page.
|
|
|
|
|
|
The application keeps its internal state which can be altered by the actions of
|
|
|
the user (e.g., clicking on links and buttons, filling input fields of forms)
|
|
|
and by the outcomes of HTTP requests to the API server. This internal state is
|
|
|
kept in memory of the web browser and is not persisted in any way -- when the
|
|
|
page is refreshed, the internal state is deleted and a new one is created from
|
|
|
scratch (i.e., all of the data is fetched from the API server again).
|
|
|
|
|
|
The only part of the state which is persisted is the token of the logged in
|
|
|
user. This token is kept in cookies and in the local storage. Keeping the token
|
|
|
in the cookies is necessary for server-side rendering.
|
|
|
|
|
|
#### Redux
|
|
|
|
|
|
The in-memory state is handled by the *redux* library. This library is
strongly inspired by the [Flux](https://facebook.github.io/flux/) architecture
but it has some specifics. The whole state is in a single serializable tree
structure called the *store*. This store can be modified only by dispatching
*actions* -- Plain Old JavaScript Objects (POJOs) which are processed by
*reducers*. A reducer is a pure function which takes the state object and the
action object and creates a new state. This process is very easy to reason
about and is also very easy to test using unit tests. Please read the [redux
documentation](http://redux.js.org/) for detailed information about the
library.
|
|
|
|
|
|
![Redux state handling schema](https://github.com/ReCodEx/wiki/raw/master/images/redux.png)
|
|
|
|
|
|
The main difference between *Flux* and *redux* is the fact that there is only
one store with one reducer in redux. The single reducer might be composed of
several simple reducers, which might be composed of other simple reducers as
well; therefore the single reducer of the store is often referred to as the
root reducer. Each of the simple reducers receives all the dispatched actions
and decides, based on the *type* of the action, which actions it will process
and which it will ignore. The simple reducers can change only a specific
subtree of the whole state tree and these subtrees do not overlap.
|
|
|
|
|
|
##### Redux Middleware
|
|
|
|
|
|
A middleware in redux is a function which can process actions before they are
|
|
|
passed to the reducers to update the state.
|
|
|
|
|
|
The middleware used by the ReCodEx store is defined in the `src/redux/store.js`
|
|
|
script. Several open source libraries are used:
|
|
|
|
|
|
- [redux-promise-middleware](https://github.com/pburtchaell/redux-promise-middleware)
|
|
|
- [redux-thunk](https://github.com/gaearon/redux-thunk)
|
|
|
- [react-router-redux](https://github.com/reactjs/react-router-redux)
|
|
|
|
|
|
We created two other custom middleware functions for our needs:
|
|
|
|
|
|
- **API middleware** -- This middleware filters out all actions with the
  *type* set to `recodex-api/CALL` and sends a real HTTP request according to
  the information in the action.
- **Access Token Middleware** -- This middleware persists the access token
  into the local storage and the cookies each time the user signs into the
  application. The token is removed when the user decides to sign out. The
  middleware also attaches the token to each `recodex-api/CALL` action that
  does not have an access token set explicitly.
|
|
|
|
|
|
##### Accessing The Store Using Selectors
|
|
|
|
|
|
The components of the application are connected to the redux store using a
|
|
|
higher order function `connect` from the *react-redux* binding library. This
|
|
|
connection ensures that the react components will re-render every time some of
|
|
|
the specified subtrees of the main state changes.
|
|
|
|
|
|
The specific subtrees of interest are defined for every connection. These
definitions are called *selectors* and they are simple pure functions which
take the state and return its subtree. To avoid unnecessary re-renders and
selections, a small library called
[reselect](https://github.com/reactjs/reselect) is used. This library allows
us to compose the selectors in a similar way the reducers are composed and
therefore simply reflect the structure of the whole state tree. The selectors
for each reducer are stored in a separate file in the `src/redux/selectors`
directory.
|
|
|
|
|
|
#### Routing
|
|
|
|
|
|
The page should not be reloaded after the initial render but the current
|
|
|
location of the user in the system must be reflected in the URL. This is
|
|
|
achieved through the
|
|
|
[react-router](https://github.com/ReactTraining/react-router) and
|
|
|
[react-router-redux](https://github.com/reactjs/react-router-redux) libraries.
|
|
|
These libraries use `pushState` method of the `history` object, a living
|
|
|
standard supported by all of the modern browsers. The mapping of the URLs to
|
|
|
the components is defined in the `src/pages/routes.js` file. To create links
|
|
|
between pages, use either the `Link` component from the `react-router` library
|
|
|
or dispatch an action created using the `push` action creator from the
|
|
|
`react-router-redux` library. All the navigations are mapped to redux actions
|
|
|
and can be handled by any reducer.
|
|
|
|
|
|
Having up-to-date URLs gives users the possibility to reload the page if some
error occurs and to land at the same page they would expect. Users can also
share links to the very page they want.
|
|
|
|
|
|
### Creating HTTP Requests
|
|
|
|
|
|
All of the HTTP requests are made by dispatching a specific action which will be
|
|
|
processed by our custom *API middleware*. The action must have the *type*
|
|
|
property set to `recodex-api/CALL`. The middleware catches the action and it
|
|
|
sends a real HTTP request created according to the information in the `request`
|
|
|
property of the action:
|
|
|
|
|
|
- **type** -- Type prefix of the actions which will be dispatched automatically
|
|
|
during the lifecycle of the request (pending, fulfilled, failed).
|
|
|
- **endpoint** -- The URI to which the request should be sent. All endpoints
|
|
|
will be prefixed with the base URL of the API server.
|
|
|
- **method** (*optional*) -- A string containing the name of the HTTP method
|
|
|
which should be used. The default method is `GET`.
|
|
|
- **query** (*optional*) -- An object containing key-value pairs which will be
|
|
|
put in the URL of the request in the query part of the URL.
|
|
|
- **headers** (*optional*) -- An object containing key-value pairs which will be
|
|
|
appended to the headers of the HTTP request.
|
|
|
- **accessToken** (*optional*) -- Explicitly set the access token for the
|
|
|
request. The token will be put in the *Authorization* header.
|
|
|
- **body** (*optional*) -- An object or an array which will be recursively
|
|
|
flattened into the `FormData` structure with correct usage of square brackets
|
|
|
for nested (associative) arrays. It is worth mentioning that the keys must not
|
|
|
contain a colon in the string.
|
|
|
- **doNotProcess** (*optional*) -- A boolean value which can disable the default
|
|
|
processing of the response to the request which includes showing a
|
|
|
notification to the user in case of a failure of the request. All requests
|
|
|
are processed in the way described above by default.
|
|
|
|
|
|
The HTTP requests are sent using the `fetch` API, which returns a *Promise* of
the request. This promise is put into a new action along with the type
specified in the `request` description. This action is then caught by the
promise middleware, which dispatches actions whenever the state of the promise
changes during its lifecycle. The new actions have specific types:
|
|
|
|
|
|
- `{$TYPE}_PENDING` -- dispatched immediately after the action is processed by
  the promise middleware. The `payload` property of the action contains the
  body of the request.
- `{$TYPE}_FAILED` -- dispatched if the promise of the request is rejected.
- `{$TYPE}_FULFILLED` -- dispatched when the response to the request is
  received and the promise is resolved. The `payload` property of the action
  contains the body of the HTTP response, parsed as JSON.
|
|
|
|
|
|
### Routine CRUD Operations
|
|
|
|
|
|
For routine CRUD (Create, Read, Update, Delete) operations which are common to
|
|
|
most of the resources used in the ReCodEx (e.g., groups, users, assignments,
|
|
|
solutions, solution evaluations, source code files) a set of functions called
|
|
|
*Resource manager* was implemented. It contains a factory which creates basic
|
|
|
actions (e.g., `fetchResource`, `addResource`, `updateResource`,
|
|
|
`removeResource`, `fetchMany`) and handlers for all of the lifecycle actions
|
|
|
created by both the API middleware and the promise middleware which can be used
|
|
|
to create a basic reducer.
|
|
|
|
|
|
The *resource manager* is spread over several files in the
|
|
|
`src/redux/helpers/resourceManager` directory and is covered with unit tests in
|
|
|
scripts located at `test/redux/helpers/resourceManager`.
|
|
|
|
|
|
### Server-side Rendering
|
|
|
|
|
|
To speed-up the initial time of rendering of the web application a technique
|
|
|
called server-side rendering (SSR) is used. The same code which is executed in
|
|
|
the web browser of the client can run on the server using
|
|
|
[Node.js](https://nodejs.org). React can serialize its HTML output into a string
|
|
|
which can be sent to the client and can be displayed before the (potentially
|
|
|
large) JavaScript source code starts being executed by the browser. The redux
|
|
|
store is in fact just a large JSON tree which can be easily serialized as well.
|
|
|
|
|
|
If the user is logged in then the access token should be in the cookies of the
|
|
|
web browser and it should be attached to the HTTP request when the user
|
|
|
navigates to the ReCodEx web page. This token is then put into the redux store
|
|
|
and so the user is logged in on the server.
|
|
|
|
|
|
The whole logic of the SSR is in a single file called `src/server.js`. It
contains only a definition of a simple HTTP server (using the
[express](http://expressjs.com/) framework) and some necessary boilerplate of
the routing library.

All the components which are associated with the matched route can have a
class property `loadAsync` which should contain a function returning a
*Promise*. The SSR calls all these functions and delays the response of the
HTTP server until all of the promises are resolved (or one of them fails).

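A condensed sketch of this mechanism is shown below; the helper names
(`createStore`, `matchRouteComponents`, `createApp`, `renderPage`) are made
up for the illustration, and the real `src/server.js` contains more
boilerplate:

```
// Condensed, illustrative sketch of the SSR flow in src/server.js.
import express from 'express';
import { renderToString } from 'react-dom/server';

// hypothetical helpers standing in for the real boilerplate:
declare function createStore(req: express.Request): any;
declare function matchRouteComponents(url: string): any[];
declare function createApp(store: any): any;
declare function renderPage(html: string): string;

const app = express();

app.get('*', (req, res) => {
  const store = createStore(req);
  const components = matchRouteComponents(req.url);

  // collect the promises returned by the loadAsync class properties
  const promises = components
    .filter(component => typeof component.loadAsync === 'function')
    .map(component => component.loadAsync(store.dispatch));

  // delay the HTTP response until all data is loaded (or loading fails)
  Promise.all(promises)
    .then(() => res.send(renderPage(renderToString(createApp(store)))))
    .catch(() => res.status(500).send('Failed to load data'));
});

app.listen(8080);
```
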
### Localization and Globalization

The whole application is prepared for localization and globalization. All of
the translatable texts can be extracted from the user interface and
translated into several languages. The numbers, dates, and time values are
also formatted with respect to the selected language. The
[react-intl](https://github.com/yahoo/react-intl) and
[Moment.js](http://momentjs.com/) libraries are used to achieve this.

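For illustration, a translatable component typically uses the react-intl
primitives like this (the message id and the component are made up for the
example):

```
// Illustrative use of react-intl in a component; the message id and
// props are made up for this example.
import React from 'react';
import { FormattedMessage, FormattedDate, FormattedNumber } from 'react-intl';

const SolutionSummary = ({ points, submittedAt }) => (
  <p>
    <FormattedMessage id="app.solution.points" defaultMessage="Points" />
    {': '}
    <FormattedNumber value={points} />
    {' ('}
    <FormattedDate value={submittedAt} />
    {')'}
  </p>
);

export default SolutionSummary;
```
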
All the strings can be extracted from the application using a command:

```
$ npm run exportStrings
```

This will create JSON files with the exported strings for the 'en' and 'cs'
locales. If you want to export strings for more languages, you must edit the
`/manageTranslations.js` script. The exported strings are placed in the
`/src/locales` directory.

## Communication Protocol

Detailed communication inside the ReCodEx system is captured in the following
image and described in sections below. Red connections are through ZeroMQ
sockets, blue are through WebSockets and green are through HTTP(S). All
ZeroMQ messages are sent as multipart with one string (command, option) per
part, with no empty frames (unless explicitly specified otherwise).

![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)

### Broker - Worker Communication

Broker acts as a server when communicating with workers. The listening IP
address and port are configurable; the protocol family is TCP. The worker
socket is of the DEALER type, the broker one is of the ROUTER type. Because
of that, the very first part of every (multipart) message from broker to
worker must be the socket identity of the worker (which is saved on its
**init** command).

#### Commands from Broker to Worker:

- **eval** -- evaluate a job. Requires 3 message frames:
  - `job_id` -- identifier of the job (in ASCII representation -- we avoid
    endianness issues and also support alphabetic ids)
  - `job_url` -- URL of the archive with job configuration and submitted
    source code
  - `result_url` -- URL where the results should be stored after evaluation
- **intro** -- introduce yourself to the broker (with **init** command) --
  this is required when the broker loses track of the worker who sent the
  command. Possible reasons for such an event are e.g. that one of the
  communicating sides shut down and restarted without the other side
  noticing.
- **pong** -- reply to **ping** command, no arguments

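For illustration, the frames of one **eval** message could look as follows (a
sketch only -- the identifiers and URLs are made up, and the actual broker is
written in C++):

```
// Sketch: frames of an "eval" multipart message from broker to worker.
// The first frame is the ZeroMQ identity of the target worker, required
// by the ROUTER socket; all values are illustrative.
const workerIdentity = 'worker-1';
const evalMessage: string[] = [
  workerIdentity,                                           // routing frame
  'eval',                                                   // command
  'job_42',                                                 // job_id
  'https://fs.recodex.org/submission_archives/job_42.zip',  // job_url
  'https://fs.recodex.org/results/job_42.zip',              // result_url
];
```
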
#### Commands from Worker to Broker:

- **init** -- introduce self to the broker. Useful on startup or after
  reestablishing a lost connection. Requires at least 2 arguments:
  - `hwgroup` -- hardware group of this worker
  - `header` -- additional header describing worker capabilities. Format must
    be `header_name=value`; every header shall be in a separate message
    frame. There is no limit on the number of headers. There is also an
    optional third argument -- additional information. If present, it should
    be separated from the headers with an empty frame. The format is the same
    as for headers. Supported keys for additional information are:
    - `description` -- a human readable description of the worker for
      administrators (it will show up in broker logs)
    - `current_job` -- an identifier of a job the worker is now processing.
      This is useful when the connection to the broker is being reestablished
      and the broker needs to know that the worker will not accept a new job.
- **done** -- notification of a finished job. Contains the following message
  frames:
  - `job_id` -- identifier of the finished job
  - `result` -- response result, possible values are:
    - OK -- evaluation finished successfully
    - FAILED -- job failed and cannot be reassigned to another worker (e.g.
      due to an error in configuration)
    - INTERNAL_ERROR -- job failed due to an internal worker error, but
      another worker might be able to process it (e.g. downloading a file
      failed)
  - `message` -- a human readable error message
- **progress** -- notice about current evaluation progress. Contains the
  following message frames:
  - `job_id` -- identifier of the current job
  - `command` -- what is happening now, one of:
    - DOWNLOADED -- submission successfully fetched from fileserver
    - FAILED -- something bad happened and the job was not executed at all
    - UPLOADED -- results are uploaded to fileserver
    - STARTED -- evaluation of tasks started
    - ENDED -- evaluation of tasks is finished
    - ABORTED -- evaluation of the job encountered an internal error, the job
      will be rescheduled to another worker
    - FINISHED -- whole execution is finished and the worker is ready for
      another job execution
    - TASK -- task state changed -- see below
  - `task_id` -- only present for the "TASK" state -- identifier of the task
    in the current job
  - `task_state` -- only present for the "TASK" state -- result of the task
    evaluation. One of:
    - COMPLETED -- task was successfully executed without any error, the
      subsequent task will be executed
    - FAILED -- task ended up with some error, the subsequent task will be
      skipped
    - SKIPPED -- some of the previous dependencies failed to execute, so this
      task will not be executed at all
- **ping** -- tell the broker "I am alive", no arguments

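As an example, an **init** message with two headers and an additional
description could consist of the following frames (all values are
illustrative):

```
// Sketch: frames of an "init" multipart message from worker to broker.
const initMessage: string[] = [
  'init',                            // command
  'group_1',                         // hwgroup
  'env=c-gcc-linux',                 // header (illustrative)
  'threads=4',                       // header (illustrative)
  '',                                // empty frame before additional info
  'description=Main C/C++ worker',   // additional information
];
```
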
#### Heartbeating

It is important for the broker and workers to know if the other side is still
working (and connected). This is achieved with a simple heartbeating
protocol.

The protocol requires the workers to send a **ping** command regularly (the
interval is configurable on both sides -- future releases might let the
worker send its ping interval with the **init** command). Upon receiving a
**ping** command, the broker responds with **pong**.

Whenever a heartbeating message does not arrive, a counter called _liveness_
is decreased. When this counter drops to zero, the other side is considered
disconnected. When a message arrives, the liveness counter is set back to its
maximum value, which is configurable for both sides.

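The liveness logic on both sides can be sketched as follows (an illustrative
sketch only; the real implementation is part of the C++ broker and worker):

```
// Illustrative sketch of the liveness counter logic.
const MAX_LIVENESS = 4; // configurable maximum

let liveness = MAX_LIVENESS;

function onMessageReceived() {
  liveness = MAX_LIVENESS; // any arriving message resets the counter
}

function onHeartbeatMissed() {
  liveness -= 1; // a ping/pong did not arrive within the interval
  if (liveness <= 0) {
    // the other side is considered disconnected: the broker reschedules
    // the jobs of the worker, a worker starts reconnecting with a
    // growing delay
    handleDisconnect();
  }
}

function handleDisconnect() {
  /* reschedule jobs or start reconnecting */
}
```
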
When the broker decides a worker disconnected, it tries to reschedule its
jobs to other workers.

If a worker thinks the broker crashed, it tries to reconnect periodically,
with a bounded, exponentially increasing delay.

This protocol proved very robust in real-world testing, so the whole backend
is reliable and can outlive short-term connection issues without problems.
Also, the exponentially increasing reconnection delay does not flood the
network when there are problems. We have experienced no issues since we
started using this protocol.

### Worker - Fileserver Communication

The worker communicates with the file server only from its _execution
thread_. The supported protocol is HTTP, optionally with SSL encryption
(**recommended**). If supported by the server and the used version of
libcurl, the HTTP/2 standard is also available. The file server should be set
up to require basic HTTP authentication, and the worker is capable of sending
the corresponding credentials with each request.

#### Worker Side

Workers communicate with the file server in both directions -- they download
the submissions of the students and then upload evaluation results.
Internally, the worker uses the libcurl C library with a very similar setup
for both cases. It can verify the HTTPS certificate (on Linux against the
system certificate list, on Windows against a list downloaded from the curl
website during installation), supports basic HTTP authentication, offers
HTTP/2 with fallback to HTTP/1.1, and fails on error (returned HTTP status
code >= 400). The worker has a list of credentials to all available file
servers in its configuration file.

- download file -- standard HTTP GET request to given URL expecting file
  content as response
- upload file -- standard HTTP PUT request to given URL with file data as
  body -- same as the command line tool `curl` with the option
  `--upload-file`

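Expressed as plain HTTP calls, the two operations could look like the sketch
below (using the `fetch` API for illustration; the worker itself uses
libcurl, and the URLs and credentials are made up):

```
// Sketch: the two fileserver operations as plain HTTP requests with
// HTTP Basic Auth; all values are illustrative.
const auth = 'Basic ' + Buffer.from('worker:secret').toString('base64');

async function transferFiles() {
  // download file -- GET request, the response body is the file content
  const response = await fetch(
    'https://fs.recodex.org/submission_archives/job_42.zip',
    { headers: { Authorization: auth } }
  );
  const archive = await response.arrayBuffer();

  // ... evaluate the job, pack the results ...
  const resultsArchive = Buffer.from([]); // zipped results (illustrative)

  // upload file -- PUT request with the file data as the body
  await fetch('https://fs.recodex.org/results/job_42.zip', {
    method: 'PUT',
    headers: { Authorization: auth },
    body: resultsArchive,
  });
}
```
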
#### File Server Side

The file server has its own internal directory structure where all the files
are stored. It provides a simple REST API to fetch them or create new ones.
The file server does not provide authentication or a secured connection by
itself; it is supposed to be run as a WSGI script inside a web server (like
Apache) with proper configuration. Relevant commands for communication with
workers:

- **GET /submission_archives/\<id\>.\<ext\>** -- gets an archive with the
  submitted source code and the corresponding configuration of this job
  evaluation
- **GET /exercises/\<hash\>** -- gets a file, commonly used for input files
  or reference result files
- **PUT /results/\<id\>.\<ext\>** -- uploads an archive with evaluation
  results under the specified name (should be the same _id_ as the name of
  the submission archive). On successful upload returns JSON
  `{ "result": "OK" }` as the body of the returned page.

If not specified otherwise, the `zip` format of archives is used. The symbol
`/` in the API description stands for the root of the file server domain. If
the domain is, for example, `fs.recodex.org` with SSL support, getting an
input file for one task could look like a GET request to
`https://fs.recodex.org/tasks/8b31e12787bdae1b5766ebb8534b0adc10a1c34c`.

### Broker - Monitor Communication

The broker communicates with the monitor through ZeroMQ over TCP as well. The
socket type is the same on both sides, ROUTER. The monitor is set to act as
the server in this communication; its IP address and port are configurable in
the monitor configuration file. The ZeroMQ socket ID (set on the side of the
monitor) is "recodex-monitor" and must be sent as the first frame of every
multipart message -- see the ZeroMQ ROUTER socket documentation for more
info.

Note that the monitor is designed so that it can receive data both from the
broker and from workers. The current architecture prefers the broker to do
all the communication so that the workers do not have to know too many
network services.

The monitor is treated as a somewhat optional part of the whole solution, so
no special effort was made to ensure the reliability of this communication.

#### Commands from Monitor to Broker:

Because there is no need for the monitor to communicate with the broker,
there are no commands so far. Any message from monitor to broker is logged
and discarded.

#### Commands from Broker to Monitor:

- **progress** -- notification about progress of a job evaluation. This
  communication is usually forwarded as-is from the worker; more info can be
  found in the "Broker - Worker Communication" chapter above.

### Broker - REST API Communication

The broker communicates with the main REST API through a ZeroMQ connection
over TCP. The socket type on the broker side is ROUTER, on the frontend side
it is DEALER. The broker acts as a server; its IP address and port are
configurable in the API.

#### Commands from API to Broker:

- **eval** -- evaluate a job. Requires at least 4 frames:
  - `job_id` -- identifier of this job (in ASCII representation -- we avoid
    endianness issues and also support alphabetic ids)
  - `header` -- additional header describing worker capabilities. Format must
    be `header_name=value`; every header shall be in a separate message
    frame. There is no maximum limit on the number of headers, and there may
    also be no headers at all. A worker is considered suitable for the job if
    and only if it satisfies all of its headers.
  - empty frame -- a frame which contains only an empty string and serves
    only as a separator after the headers
  - `job_url` -- URI location of the archive with the job configuration and
    submitted source code
  - `result_url` -- remote URI where the results will be pushed to

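An **eval** request with two headers might then consist of the following
frames (an illustrative sketch; all values are made up):

```
// Sketch: frames of an "eval" multipart message from the REST API to
// the broker; all values are made up for the example.
const evalRequest: string[] = [
  'eval',                    // command
  'job_42',                  // job_id
  'env=c-gcc-linux',         // header (illustrative)
  'hwgroup=group_1',         // header (illustrative)
  '',                        // empty frame -- end of the headers
  'https://fs.recodex.org/submission_archives/job_42.zip',  // job_url
  'https://fs.recodex.org/results/job_42.zip',              // result_url
];
```
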
#### Commands from Broker to API:

All of them are responses to the **eval** command.

- **ack** -- the first message sent back to the frontend right after the
  **eval** command arrives. It basically means "Hi, I am all right and am
  capable of receiving job requests". After sending this, the broker tries to
  find an acceptable worker for the received request.
- **accept** -- the broker is capable of routing the request to a worker
- **reject** -- the broker cannot handle this job (for example when the
  requirements specified by the headers cannot be met). There are (rare)
  cases when the broker finds that it cannot handle the job after it was
  confirmed. In such cases it uses the frontend REST API to mark the job as
  failed.

#### Asynchronous Communication Between Broker And API

Only a fraction of the errors that can happen during evaluation can be
detected while there is a ZeroMQ connection between the API and the broker.
To notify the frontend of the rest, the API exposes an endpoint for the
broker for this purpose. The broker uses this endpoint whenever the status of
a job changes (it is finished, it failed permanently, the only worker capable
of processing it disconnected...).

When a request for sending a report arrives from the backend, the type of the
report is inferred, and if it is an error which deserves the attention of the
administrator, an email is sent to them. There can also be errors which are
not that important (e.g., the backend somehow solved the problem itself, or
the report is only informative); these do not have to be reported through an
email, but they are stored in the persistent database for further
consideration.

For the details of this interface please refer to the attached API
documentation and the `broker-reports/` endpoint group.

### Fileserver - REST API Communication

The file server has a REST API for interaction with other parts of ReCodEx.
A description of the communication with workers is in the "Worker -
Fileserver Communication" chapter above. On top of that, there are other
commands for interaction with the API:

- **GET /results/\<id\>.\<ext\>** -- download an archive with the evaluated
  results of job _id_
- **POST /submissions/\<id\>** -- upload a new submission with identifier
  _id_. Expects that the body of the POST request uses file paths as keys and
  the content of the files as values. On successful upload returns JSON
  `{ "archive_path": <archive_url>, "result_path": <result_url> }` in the
  response body. From _archive_path_ the submission can be downloaded (by a
  worker) and the corresponding evaluation results should be uploaded to
  _result_path_.
- **POST /tasks** -- upload new files, which will be available under names
  equal to the `sha1sum` of their content. Multiple files can be uploaded at
  once. On successful upload returns JSON
  `{ "result": "OK", "files": <file_list> }` in the response body, where
  _file_list_ is a dictionary with the original file names as keys and the
  new URLs with the hashed names as values.

There are no plans yet to support deleting files from this API. This may
change in time.

The REST API calls these fileserver endpoints with standard HTTP requests.
There are no special commands involved. There is no communication in the
opposite direction.

### Monitor - Web App Communication

The monitor interacts with the web application through a WebSocket
connection. The monitor acts as the server and browsers connect to it. The IP
address and port are configurable. When a client connects to the monitor, it
sends a message with the string representation of a channel id (the channel
whose messages it is interested in, usually the id of the job being
evaluated). There can be multiple listeners per channel, and even (shortly)
delayed connections will receive all messages from the very beginning.

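From the browser side, subscribing to a channel takes only a few lines (a
sketch; the monitor address and the job id are made up):

```
// Sketch: subscribing to the progress messages of one job in a browser.
const socket = new WebSocket('wss://monitor.recodex.org:4567');

// after connecting, send the channel id (usually the id of the job)
socket.onopen = () => socket.send('job_42');

// every subsequent message is a JSON-encoded progress notification
socket.onmessage = event => {
  const progress = JSON.parse(event.data);
  console.log(progress.command); // e.g. "STARTED", "TASK", "FINISHED"
};
```
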
When the monitor receives a **progress** message from the broker, there are
two options:

- there is no WebSocket connection for the listed channel (job id) -- the
  message is dropped
- there is an active WebSocket connection for the listed channel -- the
  message is parsed into JSON format (see below) and sent as a string to that
  established channel. Messages for active connections are queued, so no
  messages are discarded even on heavy workload.

A message from the monitor to the web application is in JSON format and has
the form of a dictionary (associative array). The information contained in
this message corresponds to what the worker sends to the broker. For further
description please read the "Broker - Worker Communication" chapter above,
under the "progress" command.

Message format:

- **command** -- type of progress, one of: DOWNLOADED, FAILED, UPLOADED,
  STARTED, ENDED, ABORTED, FINISHED, TASK
- **task_id** -- id of currently evaluated task. Present only if **command**
  is "TASK".
- **task_state** -- state of task with id **task_id**. Present only if
  **command** is "TASK". Value is one of "COMPLETED", "FAILED" and "SKIPPED".

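For example, a message about a successfully completed task could look like
this (the shape follows the list above, the values are illustrative):

```
// Sketch: the shape of a progress message and one illustrative value.
interface ProgressMessage {
  command: string;     // DOWNLOADED, FAILED, UPLOADED, STARTED, ENDED, ...
  task_id?: string;    // present only when command is "TASK"
  task_state?: string; // present only when command is "TASK"
}

const example: ProgressMessage = {
  command: 'TASK',
  task_id: 'evaluation_task_1',
  task_state: 'COMPLETED',
};
```
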
### Web App - REST API Communication

The provided web application runs as a JavaScript process inside the browser
of the user. It communicates with the REST API on the server through standard
HTTP requests. Documentation of the main REST API is in a separate
[document](https://recodex.github.io/api/) due to its extensiveness. The
results are returned encoded in JSON, which is processed by the web
application and presented to the user in an appropriate way.

<!---
// vim: set formatoptions=tqn flp+=\\\|^\\*\\s* textwidth=80 colorcolumn=+1:
-->