master
Simon Rozsival 8 years ago
commit fcf0c69a64

@ -70,121 +70,3 @@ forwarded to the frontend. The same goes for external failures. Jobs that fail
internally cannot be reassigned, because the "new" broker does not know their
headers -- they are reported as failed immediately.
## Installation
### Dependencies
The broker has similar basic dependencies to the worker; to recapitulate:
- ZeroMQ in version at least 4.0, packages `zeromq` and `zeromq-devel` (`libzmq3-dev` on Debian)
- YAML-CPP library, `yaml-cpp` and `yaml-cpp-devel` (`libyaml-cpp0.5v5` and `libyaml-cpp-dev` on Debian)
- libcurl library `libcurl-devel` (`libcurl4-gnutls-dev` on Debian)
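For illustration, on CentOS the packages listed above might be installed with a single command like the following (package names taken from the list; adjust for your distribution):
```
# yum install zeromq zeromq-devel yaml-cpp yaml-cpp-devel libcurl-devel
```
On Debian, the equivalent would use `apt-get` with the Debian package names given in parentheses above.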
### Clone broker source code repository
```
$ git clone https://github.com/ReCodEx/broker.git
$ git submodule update --init
```
### Install broker
It is supposed that your current working directory is the one with the cloned broker source code.
- Prepare the environment by running `mkdir build && cd build`
- Build the sources by `cmake ..` followed by `make`
- Build the binary package by `make package` (may require root permissions).
Note that `rpm` and `deb` packages are built at the same time. You may need the `rpmbuild` command (usually in the `rpmbuild` or `rpm` package) or to edit the CPACK_GENERATOR variable in the _CMakeLists.txt_ file in the root of the source tree.
- Install the generated package through your package manager (`yum`, `dnf`, `dpkg`).
_Note:_ If you do not want to generate binary packages, you can just install the project with `make install` (as root). But installation through your distribution's package manager is the preferred way to keep your system clean and manageable in the long term.
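Collected into one session, the clone and build steps above might look like this (the generated package file name is an assumption and will vary with version and architecture):
```
$ git clone https://github.com/ReCodEx/broker.git
$ cd broker
$ git submodule update --init
$ mkdir build && cd build
$ cmake ..
$ make
$ make package
# yum install ./recodex-broker-*.rpm   # hypothetical file name
```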
## Configuration and usage
The following text describes how to set up and run the broker. It is assumed that the required binaries are already installed. Using systemd is recommended for the best user experience, but it is not required; almost all modern Linux distributions use systemd now.
Installation of the broker performs the following steps on your computer:
- create config file `/etc/recodex/broker/config.yml`
- create _systemd_ unit file `/etc/systemd/system/recodex-broker.service`
- put main binary to `/usr/bin/recodex-broker`
- create system user and group `recodex` with nologin shell (if not existing)
- create log directory `/var/log/recodex`
- set ownership of config (`/etc/recodex`) and log (`/var/log/recodex`) directories to `recodex` user and group
### Default broker configuration
#### Configuration items
Description of the configurable items in the broker's config. Mandatory items are bold, optional ones italic.
- _clients_ -- specifies the address and port to bind for clients (frontend instance)
    - _address_ -- hostname or IP address as string (`*` for any)
    - _port_ -- desired port
- _workers_ -- specifies the address and port to bind for workers
    - _address_ -- hostname or IP address as string (`*` for any)
    - _port_ -- desired port
    - _max_liveness_ -- maximum number of pings a worker can fail to send before it is considered disconnected
    - _max_request_failures_ -- maximum number of times a job can fail (due to e.g. worker disconnect or a network error when downloading something from the fileserver) and be assigned again
- _monitor_ -- settings of the monitor service connection
    - _address_ -- IP address of the running monitor service
    - _port_ -- desired port
- _notifier_ -- details of the connection used to report errors and other noteworthy states to the frontend
    - _address_ -- address where the frontend API runs
    - _port_ -- desired port
    - _username_ -- username which can be used for HTTP authentication
    - _password_ -- password which can be used for HTTP authentication
- _logger_ -- settings of logging capabilities
    - _file_ -- path and base name of the log file, without suffix. The value `/var/log/recodex/broker` will produce `broker.log`, `broker.1.log`, ...
    - _level_ -- level of logging, one of `off`, `emerg`, `alert`, `critical`, `err`, `warn`, `notice`, `info` and `debug`
    - _max-size_ -- maximum size of a log file before rotation
    - _rotations_ -- number of rotations kept
#### Example config file
```{.yml}
# Address and port for clients (frontend)
clients:
    address: "*"
    port: 9658
# Address and port for workers
workers:
    address: "*"
    port: 9657
    max_liveness: 10
    max_request_failures: 3
monitor:
    address: "127.0.0.1"
    port: 7894
notifier:
    address: "127.0.0.1"
    port: 8080
    username: ""
    password: ""
logger:
    file: "/var/log/recodex/broker"  # w/o suffix - actual names will be
                                     # broker.log, broker.1.log, ...
    level: "debug"  # level of logging
    max-size: 1048576  # 1 MB; max size of file before log rotation
    rotations: 3  # number of rotations kept
```
### Running broker
Running the broker is very similar to the worker setup. A systemd unit file is provided for convenient usage. There is only one broker per ReCodEx installation, so there is no need for systemd templates.
- The broker can be started with the following command:
```
# systemctl start recodex-broker.service
```
Check with
```
# systemctl status recodex-broker.service
```
if the broker is running. You should see "active (running)" message.
- The broker can be stopped or restarted accordingly using the `systemctl stop` and `systemctl restart` commands.
- If you want the broker to start after system boot, run:
```
# systemctl enable recodex-broker.service
```
For further information about using systemd please refer to systemd documentation.

@ -1,122 +0,0 @@
# Coding style
Every project should have a consistent coding style in which all contributors write. Below you can find our conventions, which we agreed on and which we try to keep.
## C++
**NOTE that the C++ projects have a code formatter (`cmake-format`) set up with a custom format. To reformat the code, run `make format` inside the `build` directory of the project (probably not working on Windows).** For a quick introduction to our format, see the following paragraphs.
The worker and the broker are written in C++. Generally, an underscore style with all lowercase letters is used, inspired by the [Google C++ style guide](https://google.github.io/styleguide/cppguide.html). If something is not defined here, naming/formatting can be arbitrary, but should be similar to the behaviour defined below.
### Naming convention
* For source codes use all lower case with underscores not dashes. Header files should end with `.h` and C++ files with `.cpp`.
* Typenames are all in lower case with underscores between words. This is applicable to classes, structs, typedefs, enums and type template parameters.
* Variable names can be divided into local variables and class members. Local variables are all lower case with underscores between words. Class members additionally have a trailing underscore at the end (struct data members do not have the trailing underscore).
* Constants are just like any other variables and do not have any specifics.
* All function names are again all lower case with underscores between words.
* Namespaces, if any are used, should be lower case with underscores.
* Macros are classical and should have all capitals and underscores.
* Comments are of two types: documentation comments and ordinary in-code comments. Documentation comments should start with `/**` and end with `*/`; the convention inside them is the Javadoc documentation format. Ordinary comments in the code are one-liners which start with `//` and end at the end of the line.
### Formatting convention
* Line length is not explicitly defined, but should be reasonable.
* All files should use UTF-8 character set.
* For code indentation tabs (`\t`) are used.
* Function declaration/definition: the return type should be on the same line as the rest of the declaration; if the line is too long, the parameters are placed on a new line. The opening brace of a function should be placed on a new line below the declaration. It is possible to write a small function on a single line. There should be one space after each comma separating parameters.
```
int run(int id, string msg);

void print_hello_world()
{
	std::cout << "Hello world" << std::endl;
	return;
}

int get_five() { return 5; }
```
* Lambda expressions: same formatting as classical functions
```
auto hello = [](int x) { std::cout << "hello_" << x << std::endl; };
```
* Function calls: basically same as function header definition.
* Condition: after `if` or `else` there always has to be one space before the opening parenthesis, and again one space after the closing parenthesis of the condition (before the opening brace). `if` and `else` should always be on separate lines. Inside the condition there should not be any pointless spaces.
```
if (x == 5) {
	std::cout << "Exactly five!" << std::endl;
} else if (x < 5 && y > 5) {
	std::cout << "Whoa, that is weird format!" << std::endl;
} else {
	std::cout << "I don't know what this is!" << std::endl;
}
```
* For and while cycles: basically same rules as for if condition.
* Try-catch blocks: again the same rules as for if conditions. The closing brace of the try block should be on the same line as the catch block.
```
try {
	int a = 5 / 0;
} catch (...) {
	std::cout << "Division by zero" << std::endl;
}
```
* Switch: again the basics are the same as for an if condition. Case statements should not be indented and the case body should be indented with 1 tab.
```
switch (switched) {
case 0: // no tab indent
	... // 1 tab indent
	break;
case 1:
	...
	break;
default:
	exit(1);
}
```
* Pointers and references: no spaces around the period or arrow when accessing a type member. No spaces after the asterisk or ampersand. In declarations of pointers or references, the asterisk or ampersand should be adjacent to the name of the variable, not the type.
```
number = *ptr;
ptr = &val;
number = ptr->number;
number = val_ref.number;
int *i;
int &j;
// bad format below
int* i;
int * i;
```
* Boolean expressions: a long boolean expression should be divided into multiple lines. The division point should always be after a logical operator.
```
if (i > 10 &&
	j < 10 &&
	k > 20) {
	std::cout << "We're here!" << std::endl;
}
```
* Return values should generally not be wrapped in parentheses, only when needed.
* Preprocessor directives start with `#` and always should start at the beginning of the line.
* Classes: the sections (public, protected, private) should have the same indentation as the class declaration itself. The opening brace of a class should be on the same line as the class name.
```
class my_class {
public:
	void class_function();

private:
	int class_member_;
};
```
* Operators: there should always be spaces around all binary operators.
```
int x = 5;
x = x * 5 / 5;
x = x + 5 * (10 - 5);
```
## Python
Python code should correspond to [PEP 8](https://www.python.org/dev/peps/pep-0008/) style.
## PHP
TODO:
## JavaScript
TODO:

@ -32,87 +32,4 @@ the following subfolders:
structure).
## Installation
To install and use the fileserver, it is necessary to have Python 3 with the `pip` package manager installed, which is needed to install the dependencies. From the cloned repository run the following command:
```
$ pip install -r requirements.txt
```
That is it. The fileserver does not need any special installation. It is possible to build and install an _rpm_ package or install it without packaging the same way as the monitor, but this is only optional. The installation would provide you with the script `recodex-fileserver` in your `PATH`. No systemd unit files are provided, because the configuration and usage of the fileserver component differ greatly from our other Python parts.
## Configuration and usage
There are several ways of running the ReCodEx fileserver. We will cover three
typical use cases.
### Running in development mode
For simple development usage, it is possible to run the fileserver in the command
line. Allowed options are described below.
```
usage: fileserver.py [--directory WORKING_DIRECTORY]
                     {runserver,shell} ...
```
- **runserver** argument starts the Flask development server (i.e. `app.run()`). A port number can be given as an additional argument.
- **shell** argument instructs Flask to run a Python shell inside application context.
Simple development server on port 9999 can be run as
```
$ python3 fileserver.py runserver 9999
```
When run like this, the fileserver creates a temporary directory where it stores all the files; the directory is deleted when the fileserver exits.
### Running as WSGI script in a web server
If you need features such as HTTP authentication (recommended) or efficient serving of static
files, it is recommended to run the app in a full-fledged web server (such as
Apache or Nginx) using WSGI. Apache configuration can be generated by `mkconfig.py` script from the repository.
```
usage: mkconfig.py apache [-h] [--port PORT] --working-directory
                          WORKING_DIRECTORY [--htpasswd HTPASSWD]
                          [--user USER]
```
- **port** -- port where the fileserver should listen
- **working_directory** -- directory where the files should be stored
- **htpasswd** -- path to user file for HTTP Basic Authentication
- **user** -- user under which the server should be run
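For example, a hypothetical invocation (the port, paths and user are made up, and it is assumed the script prints the generated configuration to standard output):
```
$ python3 mkconfig.py apache --port 9999 \
    --working-directory /var/recodex-fileserver \
    --htpasswd /etc/recodex/htpasswd --user recodex > recodex-fileserver.conf
```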
### Running using uWSGI
Another option is to run the fileserver as a standalone app via the uWSGI service. The setup is also quite simple; the configuration file can again be generated by the `mkconfig.py` script.
1. (Optional) Create a user for running the fileserver
2. Make sure that your user can access your clone of the repository
3. Run `mkconfig.py` script.
```
usage: mkconfig.py uwsgi [-h] [--user USER] [--port PORT]
[--socket SOCKET]
--working-directory WORKING_DIRECTORY
```
- **user** -- user under which the server should be run
- **port** -- port where the fileserver should listen
- **socket** -- path to UNIX socket where the fileserver should listen
- **working_directory** -- directory where the files should be stored
4. Save the configuration file generated by the script and run it with uWSGI,
either directly or using systemd. This depends heavily on your distribution.
5. To integrate this with another web server, see the [uWSGI
documentation](http://uwsgi-docs.readthedocs.io/en/latest/WebServers.html)
Note that the ways distributions package uWSGI can vary wildly. In Debian 8 it is
necessary to convert the configuration file to XML and make sure that the
python3 plugin is loaded instead of python. This plugin also uses Python 3.4,
even though the rest of the system uses Python 3.5 - make sure to install
dependencies for the correct version.

@ -1,24 +1,59 @@
# Installation
Installation of the whole ReCodEx solution is a very complex process. It is recommended to have good Unix skills and basic knowledge of the project architecture.
There are a lot of different GNU/Linux distributions with different package management, naming conventions and version release policies, so it is impossible to cover all of the possible variants. We picked one distribution which is fully supported by an automatic installation script, but there are also steps for manual installation of all components, which should work on most Linux distributions.
The distribution of our choice is CentOS, currently in version 7. It is a well-known server distribution, derived from the enterprise distribution from Red Hat, so it is a very stable and widely used system with long-term support. The additional [EPEL](https://fedoraproject.org/wiki/EPEL) repositories from the Fedora project add newer versions of some packages to CentOS, which allows us to use a current environment. Also, _rpm_ packages are much easier to build than _deb_ packages (for example from Python sources).
The big rival of CentOS in the server distribution field is Debian. We are running one instance of ReCodEx on Debian too. You need to use the _testing_ repositories to get some decent package versions. It is easy to mess up your system this way, so create the file `/etc/apt/apt.conf` with the content `APT::Default-Release "stable";`. After you add the testing repositories to `/etc/apt/sources.list`, you can install packages from them like `$ sudo apt-get -t testing install gcc`.
Some components are also capable of running in a Windows environment. However, setting up a Windows OS is a little bit of a pain and ReCodEx is not supposed to be run this way. Only the worker component may be needed on Windows, so we provide a clickable installer including dependencies. Just for info, all components should be able to run on Windows; only the broker was not tested and may require small tweaks to work properly.
## Ansible installer
A set of Ansible scripts is used for automatic installation. Ansible is one of the best known and most used tools for automatic server management. It only requires SSH access to the server and Ansible installed on the client machine. Basic Ansible knowledge is assumed for further reading; for more info check the [documentation](http://docs.ansible.com/ansible/intro.html).
All Ansible scripts are located in the _utils_ repository, in the _installation_ [directory](https://github.com/ReCodEx/utils/tree/master/installation). The Ansible files are pretty self-describing and can also be used as a template for installation on different systems. Before the installation itself, two files need to be edited -- to set the addresses of hosts and the values of some variables.
### Hosts configuration
First, the IP addresses of your computers need to be set. A common practice is to have multiple files with definitions, for example one for development and another for production. An example configuration is in the _development_ file. Each component of the ReCodEx project can be installed on a different server. Hosts can be specified as hostnames or IP addresses, optionally with the SSH port after a colon.
A shortened example of the hosts config:
@ -36,69 +71,748 @@ broker
### Variables
Configurable variables are saved in the _group_vars/all.yml_ file. The syntax is a basic key-value pair per line, separated by a colon. The values, with brief descriptions:
- _source_dir_ -- Directory where all sources from GitHub are stored. Defaults to `/opt/recodex`.
- _mysql_root_password_ -- Password of the root user of the MySQL database. Will be set after installation and saved to the `/root/.my.cnf` file.
- _mysql_recodex_username_ -- MySQL username for ReCodEx API access.
- _mysql_recodex_password_ -- Password for the user above.
- _admin_email_ -- Email of the administrator. Used when configuring the Apache webserver.
- _recodex_hostname_ -- Hostname where the API and web app will be accessible. For example "recodex.projekty.ms.mff.cuni.cz".
- _webapp_node_addr_ -- IP address of the NodeJS server running the web app. Defaults to "127.0.0.1" and should not be changed.
- _webapp_node_port_ -- Port to the above.
- _webapp_public_addr_ -- Public address where the web server for the web app will listen. Defaults to "*".
- _webapp_public_port_ -- Port to the above.
- _webapp_firewall_ -- Open the port for the web app in the firewall, values "yes" or "no".
- _webapi_public_endpoint_ -- Public URL where the API will be running, for example "https://recodex.projekty.ms.mff.cuni.cz:4000/v1".
- _webapi_public_addr_ -- Public address where the web server for the API will listen. Defaults to "*".
- _webapi_public_port_ -- Port to the above.
- _webapi_firewall_ -- Open the port for the API in the firewall, values "yes" or "no".
- _database_firewall_ -- Open the port for the database in the firewall, values "yes" or "no".
- _broker_to_webapi_addr_ -- Address where the API can reach the broker. A private one is recommended.
- _broker_to_webapi_port_ -- Port to the above.
- _broker_firewall_api_ -- Open the above port in the firewall, "yes" or "no".
- _broker_to_workers_addr_ -- Address where the workers can reach the broker. A private one is recommended.
- _broker_to_workers_port_ -- Port to the above.
- _broker_firewall_workers_ -- Open the above port in the firewall, "yes" or "no".
- _broker_notifier_address_ -- URL (on the API) where the broker will send notifications, for example "https://recodex.projekty.ms.mff.cuni.cz/v1/broker-reports".
- _broker_notifier_port_ -- Port to the above, should be the same as for the API itself (_webapi_public_port_).
- _broker_notifier_username_ -- Username for HTTP authentication for reports.
- _broker_notifier_password_ -- Password for HTTP authentication for reports.
- _monitor_websocket_addr_ -- Address where the WebSocket connection from the monitor will be available.
- _monitor_websocket_port_ -- Port to the above.
- _monitor_firewall_websocket_ -- Open the above port in the firewall, "yes" or "no".
- _monitor_zeromq_addr_ -- Address where the monitor will be available on a ZeroMQ socket for the broker to send reports to.
- _monitor_zeromq_port_ -- Port to the above.
- _monitor_firewall_zeromq_ -- Open the above port in the firewall, "yes" or "no".
- _fileserver_addr_ -- Address where the fileserver will serve files.
- _fileserver_port_ -- Port to the above.
- _fileserver_firewall_ -- Open the above port in the firewall, "yes" or "no".
- _fileserver_username_ -- Username for HTTP authentication for accessing the fileserver.
- _fileserver_password_ -- Password for HTTP authentication for accessing the fileserver.
- _worker_cache_dir_ -- File cache storage for workers. Defaults to "/tmp/recodex/cache".
- _worker_cache_age_ -- How long to keep fetched files in the worker cache, in seconds.
- _isolate_version_ -- Git tag of the Isolate version the worker depends on.
### Installation itself
With your computers installed with CentOS and the configuration modified, it is time to run the installation.
```
$ ansible-playbook -i development recodex.yml
```
This command installs all components of ReCodEx onto the machines listed in the _development_ file. It is possible to install only specified parts of the project; just use the component's YAML file instead of _recodex.yml_.
Ansible expects password-less access to the remote machines. If you do not have such a setup, use the options `--ask-pass` and `--ask-become-pass`.
## Manual installation
### Worker
#### Dependencies
Worker-specific requirements are described in this section. It covers only the basic requirements; additional runtimes or tools may be needed depending on the type of use. The package names are for CentOS, if not specified otherwise.
- ZeroMQ in version at least 4.0, packages `zeromq` and `zeromq-devel`
(`libzmq3-dev` on Debian)
- YAML-CPP library, `yaml-cpp` and `yaml-cpp-devel` (`libyaml-cpp0.5v5` and
`libyaml-cpp-dev` on Debian)
- libcurl library `libcurl-devel` (`libcurl4-gnutls-dev` on Debian)
- libarchive library as optional dependency. Installing will speed up build
process, otherwise libarchive is built from source during installation.
Package name is `libarchive` and `libarchive-devel` (`libarchive-dev` on
Debian)
**Isolate** (only for Linux installations)
First, we need to compile the Isolate sandbox from source and install it. The current worker is tested against version 1.3, so this version needs to be checked out. Assume that we keep source code in the `/opt/src` directory. For building the man page you need to have the `asciidoc` package installed.
```
$ cd /opt/src
$ git clone https://github.com/ioi/isolate.git
$ cd isolate
$ git checkout v1.3
$ make
# make install && make install-doc
```
To work properly, Isolate depends on several advanced features of the Linux kernel. Make sure that your kernel is compiled with `CONFIG_PID_NS`, `CONFIG_IPC_NS`, `CONFIG_NET_NS`, `CONFIG_CPUSETS`, `CONFIG_CGROUP_CPUACCT` and `CONFIG_MEMCG`. If your machine has swap enabled, also check `CONFIG_MEMCG_SWAP`. The flags your kernel was compiled with can be found in the `/boot` directory, in the file named `config-` followed by your kernel version. Red Hat based distributions should have these enabled by default; on Debian you may want to add the parameters `cgroup_enable=memory swapaccount=1` to the kernel command line, which can be set in the `GRUB_CMDLINE_LINUX_DEFAULT` variable in the `/etc/default/grub` file.
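A quick way to verify the flags (the exact config file name depends on your kernel version):
```
$ grep -E 'CONFIG_(PID_NS|IPC_NS|NET_NS|CPUSETS|CGROUP_CPUACCT|MEMCG)' \
    /boot/config-$(uname -r)
```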
For better reproducibility of results, some kernel parameters can be tweaked:
- Disable address space randomization. Create file
`/etc/sysctl.d/10-recodex.conf` with content `kernel.randomize_va_space=0`.
Changes will take effect after restart or run `sysctl
kernel.randomize_va_space=0` command.
- Disable dynamic CPU frequency scaling. This requires setting the cpufreq
scaling governor to _performance_.
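One common way to set the governor, assuming the cpufreq sysfs interface is available on your machine (the command must be repeated for every CPU core):
```
# echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
```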
#### Clone worker source code repository
```
$ git clone https://github.com/ReCodEx/worker.git
$ git submodule update --init
```
#### Install worker on Linux
It is supposed that your current working directory is the one with the cloned worker source code.
- Prepare environment running `mkdir build && cd build`
- Build sources by `cmake ..` following by `make`
- Build the binary package by `make package` (may require root permissions). Note that `rpm` and `deb` packages are built at the same time. You may need the `rpmbuild` command (usually in the `rpmbuild` or `rpm` package) or to edit the CPACK_GENERATOR variable in the _CMakeLists.txt_ file in the root of the source tree.
- Install generated package through your package manager (`yum`, `dnf`, `dpkg`).
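Collected together, a typical session might look like this (the generated package file name is an assumption and will vary with version and architecture):
```
$ cd worker
$ mkdir build && cd build
$ cmake ..
$ make
$ make package
# yum install ./recodex-worker-*.rpm   # hypothetical file name
```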
The worker installation process is composed of following steps:
- create config file `/etc/recodex/worker/config-1.yml`
- create systemd unit file `/etc/systemd/system/recodex-worker@.service`
- put main binary to `/usr/bin/recodex-worker`
- put judges binaries to `/usr/bin/` directory
- create system user and group `recodex` with `/sbin/nologin` shell (if not
already existing)
- create log directory `/var/log/recodex`
- set ownership of config (`/etc/recodex`) and log (`/var/log/recodex`)
directories to `recodex` user and group
_Note:_ If you do not want to generate binary packages, you can just install the project with `make install` (as root). But installation through your distribution's package manager is the preferred way to keep your system clean and manageable in the long term.
#### Install worker on Windows
There are basically two main dependencies, **Windows 7** or higher and **Visual Studio 2015+**. The provided installation batch script should do all the work on a Windows machine. Officially, only VS2015 and 32-bit compilation are supported, because of hardcoded compile options in the installation script. If a different VS or a different platform is needed, the script should be changed to the appropriate values.
The mentioned script is placed in the *install* directory alongside supporting scripts for UNIX systems and is named *win-build.cmd*. The script will do almost all the work connected with building and dependency resolving (using the **NuGet** package manager and the `msbuild` build system). It should be run under the 32-bit version of the _Developer Command Prompt for VS2015_ and from the *install* directory.
Building and installing the worker is then quite simple; the script has command line parameters which can be used to specify what will be done:
- *-build* -- The default option if none is specified. Builds the worker and its tests; everything is saved in the *build* folder and its subfolders.
- *-clean* -- Cleanup of downloaded NuGet packages and built
application/libraries.
- *-test* -- Build worker and run tests on compiled test cases.
- *-package* -- Generation of clickable installation using cpack and
[NSIS](http://nsis.sourceforge.net/) (has to be installed on machine to get
this to work).
```
install> win-build.cmd # same as: win-build.cmd -build
install> win-build.cmd -clean
install> win-build.cmd -test
install> win-build.cmd -package
```
All built binaries and CMake temporary files can be found in the *build* folder; typically there will be a *Release* subfolder which contains the compiled application with all needed DLLs. Once the clickable installation binary is created, it can be found in the *build* folder under the name *recodex-worker-VERSION-win32.exe*. A sample screenshot is shown in the following picture.
![NSIS Installation](https://github.com/ReCodEx/wiki/blob/master/images/nsis_installation.png)
#### Usage
A systemd unit file is distributed with the worker to simplify its launch. It
integrates worker nicely into your Linux system and allows you to run it
automatically on system startup. It is possible to have more than one worker on
every server, so the provided unit file is templated. Each instance of the
worker unit has a unique string identifier, which is used for managing that
instance through systemd. By default, only one worker instance is ready to use
after installation and its ID is "1".
- Starting worker with id "1" can be done this way:
```
# systemctl start recodex-worker@1.service
```
Check with
```
# systemctl status recodex-worker@1.service
```
if the worker is running. You should see "active (running)" message.
- The worker can be stopped or restarted accordingly using the `systemctl stop` and `systemctl restart` commands.
- If you want to run worker after system startup, run:
```
# systemctl enable recodex-worker@1.service
```
For further information about using systemd please refer to systemd
documentation.
##### Adding new worker
To add a new worker you need to do a few steps:
- Make up a unique string ID.
- Copy default configuration file `/etc/recodex/worker/config-1.yml` to the same
directory and name it `config-<your_unique_ID>.yml`
- Edit that config file to fit your needs. Note that you must at least change
_worker-id_ and _logger file_ values to be unique.
- Run new instance using
```
# systemctl start recodex-worker@<your_unique_ID>.service
```
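For example, adding a worker with the hypothetical ID "2" (the values to change inside the config are the ones named in the list above):
```
# cp /etc/recodex/worker/config-1.yml /etc/recodex/worker/config-2.yml
# vi /etc/recodex/worker/config-2.yml   # change worker-id and the logger file
# systemctl start recodex-worker@2.service
```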
### Broker
#### Dependencies
The broker has similar basic dependencies to the worker; to recapitulate:
- ZeroMQ in version at least 4.0, packages `zeromq` and `zeromq-devel`
(`libzmq3-dev` on Debian)
- YAML-CPP library, `yaml-cpp` and `yaml-cpp-devel` (`libyaml-cpp0.5v5` and
`libyaml-cpp-dev` on Debian)
- libcurl library `libcurl-devel` (`libcurl4-gnutls-dev` on Debian)
#### Clone broker source code repository
```
$ git clone https://github.com/ReCodEx/broker.git
$ git submodule update --init
```
#### Installation itself
Installation of the broker performs the following steps on your computer:
- create config file `/etc/recodex/broker/config.yml`
- create _systemd_ unit file `/etc/systemd/system/recodex-broker.service`
- put main binary to `/usr/bin/recodex-broker`
- create system user and group `recodex` with nologin shell (if not existing)
- create log directory `/var/log/recodex`
- set ownership of config (`/etc/recodex`) and log (`/var/log/recodex`)
directories to `recodex` user and group
It is supposed that your current working directory is the one with the cloned broker source code.
- Prepare environment running `mkdir build && cd build`
- Build sources by `cmake ..` following by `make`
- Build the binary package by `make package` (may require root permissions). Note that `rpm` and `deb` packages are built at the same time. You may need the `rpmbuild` command (usually in the `rpmbuild` or `rpm` package) or to edit the CPACK_GENERATOR variable in the _CMakeLists.txt_ file in the root of the source tree.
- Install generated package through your package manager (`yum`, `dnf`, `dpkg`).
_Note:_ If you do not want to generate binary packages, you can just install the project with `make install` (as root). But installation through your distribution's package manager is the preferred way to keep your system clean and manageable in the long term.
#### Usage
Running the broker is very similar to the worker setup. A systemd unit file is provided for convenient usage. There is only one broker per ReCodEx installation, so there is no need for systemd templates.
- The broker can be started with the following command:
```
# systemctl start recodex-broker.service
```
Check with
```
# systemctl status recodex-broker.service
```
if the broker is running. You should see "active (running)" message.
- The broker can be stopped or restarted accordingly using the `systemctl stop` and `systemctl restart` commands.
- If you want the broker to start after system boot, run:
```
# systemctl enable recodex-broker.service
```
For further information about using systemd please refer to systemd
documentation.
### Fileserver
To install and use the fileserver, it is necessary to have Python 3 with the `pip` package manager installed, which is needed to install the dependencies. From the cloned repository run the following command:
```
$ pip install -r requirements.txt
```
That is it. The fileserver does not need any special installation. It is possible to build and install an _rpm_ package or install it without packaging the same way as the monitor, but this is only optional. The installation would provide you with the script `recodex-fileserver` in your `PATH`. No systemd unit files are provided, because the configuration and usage of the fileserver component differ greatly from our other Python parts.
#### Usage
There are several ways of running the ReCodEx fileserver. We will cover three
typical use cases.
##### Running in development mode
For simple development usage, it is possible to run the fileserver in the
command line. Allowed options are described below.
```
usage: fileserver.py [--directory WORKING_DIRECTORY]
                     {runserver,shell} ...
```
- **runserver** argument starts the Flask development server (i.e. `app.run()`). A port number can be given as an additional argument.
- **shell** argument instructs Flask to run a Python shell inside application
context.
Simple development server on port 9999 can be run as
```
$ python3 fileserver.py runserver 9999
```
When run like this, the fileserver creates a temporary directory where it stores all the files; the directory is deleted when the fileserver exits.
##### Running as WSGI script in a web server
If you need features such as HTTP authentication (recommended) or efficient
serving of static files, it is recommended to run the app in a full-fledged web
server (such as Apache or Nginx) using WSGI. Apache configuration can be
generated by `mkconfig.py` script from the repository.
```
usage: mkconfig.py apache [-h] [--port PORT] --working-directory
                          WORKING_DIRECTORY [--htpasswd HTPASSWD]
                          [--user USER]
```
- **port** -- port where the fileserver should listen
- **working_directory** -- directory where the files should be stored
- **htpasswd** -- path to user file for HTTP Basic Authentication
- **user** -- user under which the server should be run
##### Running using uWSGI
Another option is to run the fileserver as a standalone app via the uWSGI service. The setup is also quite simple; the configuration file can again be generated by the `mkconfig.py` script.
1. (Optional) Create a user for running the fileserver
2. Make sure that your user can access your clone of the repository
3. Run `mkconfig.py` script.
```
usage: mkconfig.py uwsgi [-h] [--user USER] [--port PORT]
                         [--socket SOCKET]
                         --working-directory WORKING_DIRECTORY
```
- **user** -- user under which the server should be run
- **port** -- port where the fileserver should listen
- **socket** -- path to UNIX socket where the fileserver should listen
- **working_directory** -- directory where the files should be stored
4. Save the configuration file generated by the script and run it with uWSGI,
either directly or using systemd. This depends heavily on your distribution.
5. To integrate this with another web server, see the [uWSGI
documentation](http://uwsgi-docs.readthedocs.io/en/latest/WebServers.html)
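For illustration, a hypothetical invocation of the generator from step 3 (user, port and working directory are made up, and it is assumed the script writes the configuration to standard output):
```
$ python3 mkconfig.py uwsgi --user recodex --port 9999 \
    --working-directory /var/recodex-fileserver > recodex-fileserver.ini
```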
Note that the ways distributions package uWSGI can vary wildly. In Debian 8 it
is necessary to convert the configuration file to XML and make sure that the
python3 plugin is loaded instead of python. This plugin also uses Python 3.4,
even though the rest of the system uses Python 3.5 - make sure to install
dependencies for the correct version.
### Monitor
For monitor functionality there are some required packages. All of them are
listed in _requirements.txt_ file in the repository and can be installed by
`pip` package manager as
```
$ pip install -r requirements.txt
```
**Description of dependencies:**
- zmq -- binding to ZeroMQ framework
- websockets -- framework for communication over WebSockets
- asyncio -- library for fast asynchronous operations
- pyyaml -- parsing YAML configuration files
- argparse -- parsing command line arguments
Installation will provide you with the following files:
- `/usr/bin/recodex-monitor` -- simple startup script located in PATH
- `/etc/recodex/monitor/config.yml` -- configuration file
- `/etc/systemd/system/recodex-monitor.service` -- systemd startup script
- code files will be installed in location depending on your system settings,
mostly into `/usr/lib/python3.5/site-packages/monitor/` or similar
The systemd script runs the monitor binary as the specific _recodex_ user, so the `postinst` script creates a user and group of this name. Also, ownership of the configuration file will be granted to that user.
- RPM distributions can make and install a binary package. This can be done like this:
- run command
```
$ python3 setup.py bdist_rpm --post-install ./install/postinst
```
to generate a binary `.rpm` package, or download a precompiled one from the releases tab of the monitor GitHub repository (it is an architecture independent package)
- install package using
```
# yum install ./dist/recodex-monitor-<version>-1.noarch.rpm
```
- Other Linux distributions can install the monitor directly
```
$ python3 setup.py install --install-scripts /usr/bin
# ./install/postinst
```
#### Usage
The preferred way to start the monitor as a service is via systemd, as with the other parts of the ReCodEx solution.
- Running monitor is fairly simple:
```
# systemctl start recodex-monitor.service
```
- Current state can be obtained by
```
# systemctl status recodex-monitor.service
```
You should see green **Active (running)**.
- Setting up monitor to be started on system startup:
```
# systemctl enable recodex-monitor.service
```
Alternatively, the monitor can be started directly from the command line by specifying the path to the configuration file. Note that this command will not start the monitor as a daemon.
```
$ recodex-monitor -c /etc/recodex/monitor/config.yml
```
### Cleaner
To install and use the cleaner, it is necessary to have Python3 with package
manager `pip` installed.
- The dependencies of the cleaner have to be installed:
```
$ pip install -r requirements.txt
```
- RPM distributions can make and install a binary package. This can be done like this:
```
$ python setup.py bdist_rpm --post-install ./cleaner/install/postinst
```
- Installing generated package using YUM:
```
# yum install ./dist/recodex-cleaner-<version>-1.noarch.rpm
```
- Other Linux distributions can install the cleaner directly
```
$ python setup.py install --install-scripts /usr/bin
# ./cleaner/install/postinst
```
- For Windows installation do the following:
- start `cmd` with administrator permissions
- run installation with
```
> python setup.py install --install-scripts \
"C:\Program Files\ReCodEx\cleaner"
```
where the path specified with `--install-scripts` can be changed
- copy the configuration file alongside the installed executable using
```
> copy install\config.yml \
"C:\Program Files\ReCodEx\cleaner\config.yml"
```
#### Usage
As stated before, the cleaner should be run periodically. On Linux systems this can be done by the built-in `cron` service or, if `systemd` is present, the cleaner itself provides a `*.timer` file which can be used for scheduling with `systemd`. On Windows systems the internal scheduler should be used.
- Running cleaner from command line is fairly simple:
```
$ recodex-cleaner -c /etc/recodex/cleaner
```
- Start the cleaner timer using systemd:
```
# systemctl start recodex-cleaner.timer
```
- Add the cleaner to the Linux cron service using the following configuration line:
```
0 0 * * * /usr/bin/recodex-cleaner -c /etc/recodex/cleaner/config.yml
```
- Add the cleaner to the Windows scheduler service with the following command:
```
> schtasks /create /sc daily /tn "ReCodEx Cleaner" /tr \
"\"C:\Program Files\ReCodEx\cleaner\recodex-cleaner.exe\" \
-c \"C:\Program Files\ReCodEx\cleaner\config.yml\""
```
### REST API
The web API requires a PHP runtime, version at least 7. Which one to use depends on the actual configuration; there is a choice between _mod_php_ inside Apache, _php-fpm_ with an Apache or Nginx proxy, or running it as a standalone uWSGI script. Some PHP extensions have to be installed on the system as well, namely the ZeroMQ binding (`php-zmq` package or similar), the MySQL module (`php-mysqlnd` package) and the LDAP extension module for CAS authentication (`php-ldap` package). Make sure that the extensions are loaded in your `php.ini` file (`/etc/php.ini` or files in `/etc/php.d/`).
The API depends on some other projects and libraries, which are managed with [Composer](https://getcomposer.org/). It can be installed from system repositories or downloaded from the website, where detailed instructions can be found as well. Composer reads the `composer.json` file in the project root and installs the dependencies into the `vendor/` subdirectory. To do that, run:
```
$ composer install
```
#### Database preparation
When the API is installed and configured (_doctrine_ section is sufficient here)
the database schema can be generated. There is a prepared command to do that
from command line:
```
$ php www/index.php orm:schema-tool:update --force
```
The API comes with some initial values, for example default user roles with proper permissions. To fill your database with these values there is another command line command:
```
$ php www/index.php db:fill
```
Check the outputs of both commands for errors. If there are any, try cleaning the temporary API cache in the `temp/cache/` directory and repeat the action.
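Cleaning the cache is a plain directory removal, for example:
```
$ rm -rf temp/cache/*
```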
#### Webserver configuration
The simplest way to get started is to start the built-in PHP server in the root
directory of your project:
```
$ php -S localhost:4000 -t www
```
Then visit `http://localhost:4000` in your browser to see the welcome page of
API project.
For Apache or Nginx, set up a virtual host to point to the `www/` directory of the project and you should be ready to go. It is **critical** that the whole `app/`, `log/` and `temp/` directories are not accessible directly via a web browser (see the [security warning](https://nette.org/security-warning)). It is also **highly recommended** to set up an HTTPS certificate for public access to the API.
#### Troubleshooting
In case of any issues first remove the Nette cache directory `temp/cache/` and
try again. This solves most of the errors. If it does not help, examine API logs
from `log/` directory of the API source or logs of your webserver.
### Web application
Web application requires [NodeJS](https://nodejs.org/en/) server as its runtime
environment. This runtime is needed for executing JavaScript code on server and
sending the pre-render parts of pages to clients, so the final rendering in
browsers is a lot quicker and the page is accessible to search engines for
indexing.
But some functionality is handled better by other full-fledged web servers like *Apache* or *Nginx*, so the common practice is to use both in tandem. *NodeJS* takes care of the basic functionality of the app while the other server (Apache) is set up as a reverse proxy providing additional functionality like SSL encryption, load balancing or caching of static files. The recommended setup contains both NodeJS and one of the Apache and Nginx web servers, for the reasons discussed above.
Stable versions of the 4th and 6th series of the NodeJS server are sufficient, but using at least the 6th series is highly recommended. Please check the most recent version of the packages in your distribution's repositories; they are often outdated. However, there are third party repositories for all main Linux distributions.
The app depends on several libraries and components, all of which are listed in the `package.json` file in the source repository. The node package manager (`npm`) is used for managing dependencies; it may come with the NodeJS installation, otherwise it can be installed separately. To fetch and install all dependencies run:
```
$ npm install
```
For easy production usage there is an additional package for managing NodeJS processes, `pm2`. This tool can run your application as a daemon, monitor occupied resources, gather logs and provide a simple console interface for managing the app's state. To install it globally into your system run:
```
# npm install pm2 -g
```
#### Usage
The application can be run in two modes, development and production. Development mode uses only client-side rendering and tracks code changes, rebuilding the application in real time. In production mode the compilation (transpiling to the _ES5_ standard using *Babel* and bundling into a single file using *webpack*) has to be done separately, prior to running. The scripts for compilation are provided as additional `npm` commands.
- Development mode can be used for local testing of the app. This mode uses the webpack dev server, so all code runs on the client; there is no server-side rendering available. Starting it is a simple command; the default address is http://localhost:8080.
```
$ npm run dev
```
- Production mode is mostly used on servers. It provides all features such as server-side rendering. It can be run via:
```
$ npm run build
$ npm start
```
Both modes can be configured to use different ports or to set the base address of the API server in use. This can be configured in the `.env` file in the root of the repository. There is a `.env-sample` file which can simply be copied and altered.
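For example:
```
$ cp .env-sample .env
$ vi .env   # adjust ports and the API base address
```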
The production mode can also be run as a daemon controlled by the `pm2` tool. First the web application has to be built, and then the server JavaScript file can be run as a daemon.
```
$ npm run build
$ pm2 start bin/server.js
```
The `pm2` tool has several options, most notably _status_, _stop_, _restart_ and
_logs_. Further description is available on project
[website](http://pm2.keymetrics.io).
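The named options translate directly to subcommands, for example (the process name `server` is an assumption derived from `bin/server.js`):
```
$ pm2 status
$ pm2 logs server
$ pm2 restart server
```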
## Security
One of the most important aspects of a ReCodEx instance is security. It is crucial to keep the gathered data safe and not to allow unauthorized users to modify restricted pieces of information. Here is a small list of recommendations to keep a running ReCodEx instance safe.
- Secure the MySQL installation. The installation script does not take any security measures, so please run at least the `mysql_secure_installation` script on the database computer.
- Get an HTTPS certificate and set it up in Apache for the web application and the API. The monitor should be proxied through the web server too, with a valid certificate. You can get a free DV certificate from [Let's Encrypt](https://letsencrypt.org/). Do not forget to set up automatic renewal!
- Hide the broker, workers and fileserver behind a firewall, a private subnet or an IPsec tunnel. They do not need to be reachable from the public internet, so it is better to keep them isolated.
- Keep your server updated and well configured. For automatic installation of security updates on CentOS systems refer to the `yum-cron` package. Configure SSH and Apache to use only strong ciphers; some recommendations can be found [here](https://bettercrypto.org/static/applied-crypto-hardening.pdf).
- Do not put actually used credentials on the web; for example, do not commit your passwords (in the Ansible variables file) to GitHub.
- Regularly check logs for anomalies. - Regularly check logs for anomalies.
<!---
// vim: set formatoptions=tqn flp+=\\\|^\\*\\s* textwidth=80 colorcolumn=+1:
-->

@ -11,18 +11,6 @@ Monitor is needed one per broker, that is one per separate ReCodEx instance. Als
Monitor is written in Python, tested versions are 3.4 and 3.5. This language was chosen because it is already in the project requirements (fileserver) and there are great libraries for ZeroMQ, WebSockets and asynchronous operations. These libraries save system resources and allow processing of a great amount of messages. Also, coding in Python was pretty simple and saved us time for improving the other parts of ReCodEx.
For monitor functionality there are some required packages. All of them are listed in _requirements.txt_ file in the repository and can be installed by `pip` package manager as
```
$ pip install -r requirements.txt
```
**Description of dependencies:**
- zmq -- binding to ZeroMQ framework
- websockets -- framework for communication over WebSockets
- asyncio -- library for fast asynchronous operations
- pyyaml -- parsing YAML configuration files
- argparse -- parsing command line arguments
### Message flow
@ -40,94 +28,3 @@ There can be multiple receivers to one channel id. Each one has separate _asynci
Messages from a client's queue are sent through the corresponding WebSocket connection via the main event loop as soon as possible. This approach with a separate queue per connection is easy to implement and guarantees reliability and order of message delivery.
## Installation
Installation will provide you with the following files:
- `/usr/bin/recodex-monitor` -- simple startup script located in PATH
- `/etc/recodex/monitor/config.yml` -- configuration file
- `/etc/systemd/system/recodex-monitor.service` -- systemd startup script
- code files will be installed in a location depending on your system settings, mostly into `/usr/lib/python3.5/site-packages/monitor/` or similar
The systemd script runs the monitor binary as a dedicated _recodex_ user, so the `postinst` script creates a user and group of this name. Also, ownership of the configuration file is granted to that user.
- RPM distributions can build and install a binary package. This can be done like this:
- run command
```
$ python3 setup.py bdist_rpm --post-install ./install/postinst
```
  to generate a binary `.rpm` package, or download a precompiled one from the releases tab of the monitor GitHub repository (it is an architecture independent package)
- install package using
```
# yum install ./dist/recodex-monitor-<version>-1.noarch.rpm
```
- Other Linux distributions can install the monitor straight from the sources:
```
$ python3 setup.py install --install-scripts /usr/bin
# ./install/postinst
```
## Configuration and usage
### Configuration
Configuration file is located in the `monitor` subdirectory of the standard ReCodEx configuration folder `/etc/recodex/`. It is in YAML format, like all of the other configurations, and its format is very similar to the broker and worker configurations.
### Configuration items
Description of configurable items; bold ones are required, italic ones are optional.
- _websocket_uri_ -- URI of the WebSocket connection endpoint. Must be visible to the clients (directly or through a public proxy)
- string representation of IP address or a hostname
- port number
- _zeromq_uri_ -- URI of the ZeroMQ connection endpoint used by the broker. Can be hidden from the public internet.
- string representation of IP address or a hostname
- port number
- _logger_ -- settings of logging
- _file_ -- path with name of log file. Defaults to `/var/log/recodex/monitor.log`
- _level_ -- logging level, one of "debug", "info", "warning", "error" and "critical"
- _max-size_ -- maximum size of log file before rotation in bytes
- _rotations_ -- number of rotations kept
### Example configuration file
```{.yml}
---
websocket_uri:
- "127.0.0.1"
- 4567
zeromq_uri:
- "127.0.0.1"
- 7894
logger:
file: "/var/log/recodex/monitor.log"
level: "debug"
max-size: 1048576 # 1 MB
rotations: 3
...
```
### Usage
The preferred way to start the monitor as a service is via systemd, as with the other parts of the ReCodEx solution.
- Running monitor is fairly simple:
```
# systemctl start recodex-monitor.service
```
- Current state can be obtained by
```
# systemctl status recodex-monitor.service
```
You should see green **Active (running)**.
- Setting up monitor to be started on system startup:
```
# systemctl enable recodex-monitor.service
```
Alternatively, the monitor can be started directly from the command line.
Note that this command will not run it as a daemon.
```
$ recodex-monitor -c /etc/recodex/monitor/config.yml
```

@ -1,251 +0,0 @@
# Overall Architecture
## Description
**ReCodEx** is designed to be very modular and configurable. One such configuration is sketched in the following picture. There are two separate frontend instances with distinct databases sharing a common backend part. This configuration may be suitable for MFF UK -- a basic programming course and the KSP competition. Note that connections between components are not fully accurate.
![Overall architecture](https://github.com/ReCodEx/wiki/blob/master/images/Overall_Architecture.png)
**Web app** is the main part of the whole project from the user's point of view. It provides a nice user interface and it is the only part that interacts with the outside world directly. **Web API** contains almost all logic of the app including _user management and authentication_, _storing and versioning files_ (with help of **File server**), _counting and assigning points_ to users etc. Advanced users may connect to the API directly or may create custom frontends. **Broker** is an essential part of the whole architecture. It maintains a list of available **Workers**, receives submissions from the **Web API**, routes them further and reports progress of evaluations back to the **Web app**. **Worker** securely runs each received job and evaluates its results. **Monitor** resends evaluation progress messages to the **Web app** in order to be presented to users.
## Communication
Detailed communication inside the ReCodEx system is captured in the following
image and described in sections below. Red connections are through ZeroMQ
sockets, blue are through WebSockets and green are through HTTP(S). All ZeroMQ
messages are sent as multipart with one string (command, option) per part, with
no empty frames (unless explicitly specified otherwise).
![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)
### Broker - Worker communication
Broker acts as server when communicating with worker. Listening IP address and port are configurable, protocol family is TCP. Worker socket is of DEALER type, the broker's one is of ROUTER type. Because of that, the very first part of every (multipart) message from broker to worker must be the target worker's socket identity (which is saved on its **init** command).
#### Commands from broker to worker:
- **eval** -- evaluate a job (see the sketch after this command list). Requires 3 message frames:
- `job_id` -- identifier of the job (in ASCII representation -- we avoid
endianness issues and also support alphabetic ids)
- `job_url` -- URL of the archive with job configuration and submitted source
code
- `result_url` -- URL where the results should be stored after evaluation
- **intro** -- introduce yourself to the broker (with **init** command) -- this is
required when the broker loses track of the worker who sent the command.
Possible reasons for such event are e.g. that one of the communicating sides
shut down and restarted without the other side noticing.
- **pong** -- reply to **ping** command, no arguments
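For illustration, dispatching the **eval** command could look like the following pyzmq sketch. The real broker is written in C++; the port number is taken from the example broker configuration and `send_eval` is a made-up helper:

```{.python}
import zmq

context = zmq.Context.instance()
broker = context.socket(zmq.ROUTER)
broker.bind("tcp://*:9657")  # workers port from the example broker config

def send_eval(worker_identity, job_id, job_url, result_url):
    # A ROUTER socket addresses peers by identity, so the identity
    # saved from the worker's "init" command must be the first frame.
    broker.send_multipart([
        worker_identity,
        b"eval",
        job_id.encode("ascii"),  # ASCII ids avoid endianness issues
        job_url.encode(),
        result_url.encode(),
    ])
```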
#### Commands from worker to broker:
- **init** -- introduce itself to the broker (a sketch follows this command list). Useful on startup or after reestablishing a lost connection. Requires at least 2 arguments:
- `hwgroup` -- hardware group of this worker
- `header` -- additional header describing worker capabilities. Format must
be `header_name=value`, every header shall be in a separate message frame.
There is no limit on number of headers.
There is also an optional third argument -- additional information. If
present, it should be separated from the headers with an empty frame. The
format is the same as headers. Supported keys for additional information are:
- `description` -- a human readable description of the worker for
administrators (it will show up in broker logs)
- `current_job` -- an identifier of a job the worker is now processing. This
  is useful when we are reestablishing a connection to the broker and need it
  to know the worker will not accept a new job.
- **done** -- notification of a finished job. Contains the following message frames:
- `job_id` -- identifier of finished job
- `result` -- response result, possible values are:
- OK -- evaluation finished successfully
- FAILED -- job failed and cannot be reassigned to another worker (e.g.
due to error in configuration)
- INTERNAL_ERROR -- job failed due to internal worker error, but another
worker might be able to process it (e.g. downloading a file failed)
- `message` -- a human readable error message
- **progress** -- notice about current evaluation progress. Contains the following message frames:
- `job_id` -- identifier of current job
- `state` -- what is happening now.
- DOWNLOADED -- submission successfully fetched from fileserver
- FAILED -- something bad happened and job was not executed at all
- UPLOADED -- results are uploaded to fileserver
- STARTED -- evaluation of tasks started
- ENDED -- evaluation of tasks is finished
- ABORTED -- evaluation of job encountered internal error, job will be rescheduled to another worker
- FINISHED -- whole execution is finished and worker is ready for another job execution
- TASK -- task state changed -- see below
- `task_id` -- only present for "TASK" state -- identifier of task in current job
- `task_state` -- only present for "TASK" state -- result of task evaluation. One of:
- COMPLETED -- task was successfully executed without any error, subsequent task will be executed
- FAILED -- task ended up with some error, subsequent task will be skipped
- SKIPPED -- some of the previous dependencies failed to execute, so this task will not be executed at all
- **ping** -- tell broker I am alive, no arguments
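Again for illustration only, an **init** message with two capability headers and the optional additional information could be assembled like this (a Python sketch; the real worker is C++ and the values are borrowed from the example worker configuration):

```{.python}
import zmq

context = zmq.Context.instance()
worker = context.socket(zmq.DEALER)
worker.connect("tcp://localhost:9657")

# "init" with hwgroup, headers (one per frame) and optional
# additional information separated from headers by an empty frame.
worker.send_multipart([
    b"init",
    b"group1",
    b"env=c",
    b"threads=2",
    b"",  # empty delimiter frame
    b"description=Main C worker",
])
```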
#### Heartbeating
It is important for the broker and workers to know if the other side is still
working (and connected). This is achieved with a simple heartbeating protocol.
The protocol requires the workers to send a **ping** command regularly (the
interval is configurable on both sides -- future releases might let the worker
send its ping interval with the **init** command). Upon receiving a **ping**
command, the broker responds with **pong**.
Whenever a heartbeating message doesn't arrive, a counter called _liveness_ is
decreased. When this counter drops to zero, the other side is considered
disconnected. When a message arrives, the liveness counter is set back to its
maximum value, which is configurable for both sides.
When the broker decides a worker disconnected, it tries to reschedule its jobs
to other workers.
If a worker thinks the broker crashed, it tries to reconnect periodically, with
a bounded, exponentially increasing delay.
This protocol proved very robust in real world testing. Thus the whole
backend is reliable and can outlive short term connection issues without
problems. Also, the increasing delay between reconnection attempts does not
flood the network when there are problems. We have experienced no issues since
we started using this protocol.
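A rough sketch of the liveness bookkeeping and the reconnection backoff described above (illustrative Python; the constant mirrors the example configurations and the actual C++ implementation may differ):

```{.python}
MAX_LIVENESS = 10  # mirrors max_liveness in the example configs

liveness = MAX_LIVENESS

def on_message_received():
    global liveness
    liveness = MAX_LIVENESS  # any message restores full liveness

def on_heartbeat_missed():
    global liveness
    liveness -= 1
    if liveness == 0:
        ...  # peer considered disconnected; reschedule its jobs

def reconnect_delays(initial=1.0, factor=2.0, maximum=60.0):
    # Bounded, exponentially increasing delay between reconnects.
    delay = initial
    while True:
        yield min(delay, maximum)
        delay *= factor
```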
### Worker - File Server communication
Worker communicates with the file server only from the _execution thread_.
Supported protocol is HTTP, optionally with SSL encryption (**recommended**).
If supported by the server and the used version of libcurl, the HTTP/2
standard is also available. File server should be set up to require basic HTTP
authentication and worker is capable of sending the corresponding credentials
with each request.
#### Worker side
Workers communicate with the file server in both directions -- they download
student's submissions and then upload evaluation results. Internally, worker
is using the libcurl C library with a very similar setup in both cases. It can
verify the HTTPS certificate (on Linux against the system cert list, on
Windows against one downloaded from the CURL website during installation),
support basic HTTP authentication, offer HTTP/2 with fallback to HTTP/1.1 and
fail on error (returned HTTP status code is >= 400). Worker has a list of
credentials to all available file servers in its config file.
- download file -- standard HTTP GET request to given URL expecting file content as response
- upload file -- standard HTTP PUT request to given URL with file data as body -- same as command line tool `curl` with option `--upload-file`
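The worker itself uses libcurl, but both requests above are plain HTTP and can be sketched with the Python `requests` library (the hostname, job id and credentials are made up):

```{.python}
import requests

AUTH = ("user", "pass")  # per-fileserver credentials from worker config

# Download the submission archive (plain HTTP GET).
r = requests.get("https://fs.example.org/submission_archives/42.zip",
                 auth=AUTH)
r.raise_for_status()  # fail on HTTP status >= 400, as the worker does

# Upload the results archive (HTTP PUT with the file as request body,
# the same as `curl --upload-file`).
with open("results.zip", "rb") as f:
    requests.put("https://fs.example.org/results/42.zip",
                 data=f, auth=AUTH).raise_for_status()
```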
#### File server side
File server has its own internal directory structure, where all the files are stored. It provides a simple REST API to get them or create new ones. File server does not provide authentication or secured connection by itself, but it is supposed to run as a WSGI script inside a web server (like Apache) with proper configuration. Relevant commands for communication with workers:
- **GET /submission_archives/\<id\>.\<ext\>** -- gets an archive with submitted source code and corresponding configuration of this job evaluation
- **GET /exercises/\<hash\>** -- gets a file, common usage is for input files or
reference result files
- **PUT /results/\<id\>.\<ext\>** -- upload archive with evaluation results under specified name (should be same _id_ as name of submission archive). On successful upload returns JSON `{ "result": "OK" }` as body of returned page.
If not specified otherwise, the `zip` format of archives is used. The symbol `/` in the API description is the root of the file server's domain. If the domain is for example `fs.recodex.org` with SSL support, getting an input file for one task could look like a GET request to `https://fs.recodex.org/tasks/8b31e12787bdae1b5766ebb8534b0adc10a1c34c`.
### Broker - Monitor communication
Broker communicates with monitor also through ZeroMQ over the TCP protocol.
The socket type is the same on both sides, ROUTER. Monitor is set to act as
the server in this communication, its IP address and port are configurable in
the monitor's config file. The ZeroMQ socket ID (set on the monitor's side) is
"recodex-monitor" and must be sent as the first frame of every multipart
message -- see the ZeroMQ ROUTER socket documentation for more info.
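A hedged pyzmq sketch of the monitor's side of this connection (the port comes from the example monitor configuration; the actual monitor code is structured differently):

```{.python}
import zmq

context = zmq.Context.instance()
monitor = context.socket(zmq.ROUTER)
# Fixed identity the broker prepends to every message it sends here:
monitor.setsockopt(zmq.IDENTITY, b"recodex-monitor")
monitor.bind("tcp://127.0.0.1:7894")  # zeromq_uri from the example config

while True:
    # The first frame is the sender identity added by the ROUTER
    # socket; the rest is the "progress" command with its payload.
    identity, *payload = monitor.recv_multipart()
    print(payload)
```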
Note that the monitor is designed so that it can receive data both from the
broker and workers. The current architecture prefers the broker to do all the
communication so that the workers do not have to know too many network services.
Monitor is treated as a somewhat optional part of the whole solution, so no
special effort on communication reliability was made.
#### Commands from monitor to broker:
Because there is no need for the monitor to communicate with the broker, there
are no commands so far. Any message from monitor to broker is logged and
discarded.
#### Commands from broker to monitor:
- **progress** -- notification about progress with job evaluation. See [Progress callback](#progress-callback) section for more info.
### Broker - Web API communication
Broker communicates with main REST API through ZeroMQ connection over TCP. Socket
type on broker side is ROUTER, on frontend part it is DEALER. Broker acts as a
server, its IP address and port is configurable in the API.
#### Commands from API to broker:
- **eval** -- evaluate a job. Requires at least 4 frames:
- `job_id` -- identifier of this job (in ASCII representation -- we avoid endianness issues and also support alphabetic ids)
- `header` -- additional header describing worker capabilities. Format must be `header_name=value`, every header shall be in a separate message frame. There is no limit on the number of headers and there may be no headers at all. A worker is considered suitable for the job if and only if it satisfies all of its headers.
- empty frame -- a frame which contains only an empty string and serves as a delimiter after the headers (see the sketch after this list)
- `job_url` -- URI location of archive with job configuration and submitted source code
- `result_url` -- remote URI where results will be pushed to
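The frame layout of such an **eval** message, including the empty delimiter frame, could be sketched like this (illustrative pyzmq code; the real API is PHP using the `php-zmq` binding and all values are made up):

```{.python}
import zmq

context = zmq.Context.instance()
api = context.socket(zmq.DEALER)
api.connect("tcp://127.0.0.1:9658")  # clients port of the broker

api.send_multipart([
    b"eval",
    b"job_42",
    b"env=c",            # headers, one frame each (may be none)
    b"hwgroup=group1",
    b"",                 # empty delimiter frame after the headers
    b"http://fs.example.org/submission_archives/job_42.zip",
    b"http://fs.example.org/results/job_42.zip",
])
```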
#### Commands from broker to API (all are responses to **eval** command):
- **ack** -- the first message sent back to the frontend right after the **eval** command arrives; it basically means "Hi, I am all right and am capable of receiving job requests". After sending this, the broker will try to find an acceptable worker for the request.
- **accept** -- broker is capable of routing request to a worker
- **reject** -- broker cannot handle this job (for example when the requirements
specified by the headers cannot be met). There are (rare) cases when the
broker finds that it cannot handle the job after it was confirmed. In such
cases it uses the frontend REST API to mark the job as failed.
#### Asynchronous communication between broker and API
Only a fraction of the errors that can happen during evaluation can be detected
while there is a ZeroMQ connection between the API and broker. To notify the
frontend of the rest, we need an asynchronous communication channel that can be
used by the broker when the status of a job changes (it's finished, it failed
permanently, the only worker capable of processing it disconnected...).
This functionality is supplied by the `broker-reports/` API endpoint group --
see its documentation for more details.
### File Server - Web API communication
File server has a REST API for interaction with other parts of ReCodEx. Description of communication with workers is in [File server side](#file-server-side) section. On top of that, there are other commands for interaction with the API:
- **GET /results/\<id\>.\<ext\>** -- download archive with evaluated results of job _id_
- **POST /submissions/\<id\>** -- upload new submission with identifier _id_. Expects that the body of the POST request uses file paths as keys and the content of the files as values. On successful upload returns JSON `{ "archive_path": <archive_url>, "result_path": <result_url> }` in response body. From _archive_path_ the submission can be downloaded (by worker) and corresponding evaluation results should be uploaded to _result_path_.
- **POST /tasks** -- upload new files, which will be available under names equal to the `sha1sum` of their content. Multiple files can be uploaded at once. On successful upload returns JSON `{ "result": "OK", "files": <file_list> }` in response body, where _file_list_ is a dictionary with the original file name as key and the new URL with the hashed name as value.
There are no plans yet to support deleting files from this API. This may change in time.
Web API calls these fileserver endpoints with standard HTTP requests. There are no special commands involved. There is no communication in opposite direction.
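As a sketch, the two typical calls could look like this with the Python `requests` library (hostname, job id, file names and credentials are made up; the real API is PHP):

```{.python}
import requests

FS = "https://fs.example.org"
AUTH = ("user", "pass")

# Upload a new submission; keys are file paths, values file contents.
with open("main.c", "rb") as src, open("job-config.yml", "rb") as cfg:
    r = requests.post(FS + "/submissions/42",
                      files={"main.c": src, "job-config.yml": cfg},
                      auth=AUTH)
urls = r.json()  # {"archive_path": ..., "result_path": ...}

# Later, download the evaluated results of the same job.
results = requests.get(FS + "/results/42.zip", auth=AUTH).content
```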
### Monitor - Web app communication
Monitor interacts with web application through WebSocket connection. Monitor acts as server and browsers are connecting to it. IP address and port are configurable. When a client connects to the monitor, it sends a message with the string representation of a channel id (the channel whose messages it is interested in, usually the id of the evaluated job). There can be multiple listeners per channel, and even (shortly) delayed connections will receive all messages from the very beginning.
When monitor receives **progress** message from broker there are two options:
- there is no WebSocket connection for listed channel (job id) -- message is dropped
- there is active WebSocket connection for listed channel -- message is parsed into JSON format (see below) and sent as a string to that established channel. Messages for active connections are queued, so no messages are discarded even on heavy workload.
Message JSON format is a dictionary (associative array) with the following keys (a sample subscriber sketch follows this list):
- **command** -- type of progress, one of:
- DOWNLOADED -- submission successfully fetched from fileserver
- FAILED -- something bad happened and job was not executed at all
- UPLOADED -- results are uploaded to fileserver
- STARTED -- evaluation of tasks started
- ENDED -- evaluation of all tasks finished, worker now just has to send results and clean up after execution
- ABORTED -- evaluation of job encountered internal error, job will be rescheduled to another worker
- FINISHED -- whole execution finished and worker is ready for another job execution
- TASK -- task state changed, further information will be provided -- see below
- **task_id** -- id of currently evaluated task. Present only if **command** is "TASK".
- **task_state** -- state of task with id **task_id**. Present only if **command** is "TASK". Value is one of "COMPLETED", "FAILED" and "SKIPPED".
- COMPLETED -- task was successfully executed without any error, subsequent task will be executed
- FAILED -- task ended up with some error, subsequent task will be skipped
- SKIPPED -- some of the previous dependencies failed to execute, so this task will not be executed at all
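A minimal subscriber for these messages could look like this, assuming the `websockets` library and the port from the example monitor configuration (the channel id is made up):

```{.python}
import asyncio
import json
import websockets  # the same library the monitor itself builds on

async def watch(channel_id):
    # Subscribe by sending the channel id right after connecting.
    async with websockets.connect("ws://127.0.0.1:4567") as ws:
        await ws.send(channel_id)
        async for raw in ws:
            msg = json.loads(raw)
            print(msg["command"], msg.get("task_id"), msg.get("task_state"))

asyncio.get_event_loop().run_until_complete(watch("job_42"))
```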
### Web app - Web API communication
The provided web application runs as a JavaScript client inside the user's browser. It communicates with the REST API on the server through standard HTTP requests. Documentation of the main REST API is in a separate [document](https://recodex.github.io/api/) due to its extensiveness. Results are returned as JSON payload, which is simply parsed in the web application and presented to the users.


@ -0,0 +1,437 @@
# System configuration
## Worker
Worker should have some default configuration which is applied to the worker
itself or may be used in given jobs (implicitly if something is missing, or
explicitly with special variables). This configuration should be hardcoded and
can be overridden by an explicitly declared configuration file. The format of
this configuration is YAML with a structure similar to the job configuration.
### Configuration items
Mandatory items are bold, optional italic.
- **worker-id** -- unique identification of the worker at one server. This id
  is used by the _isolate_ sandbox on Linux systems, so make sure to meet
  isolate's requirements (default is a number from 1 to 999).
- _worker-description_ -- human readable description of this worker
- **broker-uri** -- URI of the broker (hostname, IP address, including port,
...)
- _broker-ping-interval_ -- time interval how often to send ping messages to
broker. Used units are milliseconds.
- _max-broker-liveness_ -- specifies how many pings in a row can broker miss
without making the worker dead.
- _headers_ -- map of headers specifies worker's capabilities
- _env_ -- list of environmental variables which are sent to broker in init
  command
- _threads_ -- information about available threads for this worker
- **hwgroup** -- hardware group of this worker. Hardware group must specify
worker hardware and software capabilities and it is main item for broker
routing decisions.
- _working-directory_ -- where all needed files will be stored. Can be the
  same for multiple workers on one server.
- **file-managers** -- addresses and credentials to all file managers used
  (e.g. all different frontends using this worker)
- **hostname** -- URI of file manager
- _username_ -- username for http authentication (if needed)
- _password_ -- password for http authentication (if needed)
- _file-cache_ -- configuration of caching feature
- _cache-dir_ -- path to caching directory. Can be the same for multiple
workers.
- _logger_ -- settings of logging capabilities
- _file_ -- path to the logging file with name without suffix.
`/var/log/recodex/worker` item will produce `worker.log`, `worker.1.log`,
...
- _level_ -- level of logging, one of `off`, `emerg`, `alert`, `critical`,
`err`, `warn`, `notice`, `info` and `debug`
- _max-size_ -- maximal size of log file before rotating
- _rotations_ -- number of rotations kept
- _limits_ -- default sandbox limits for this worker. All items are described in
assignments section in job configuration description. If some limits are not
set in job configuration, defaults from worker config will be used. In such
case the worker's defaults will be set as the maximum for the job. Also,
limits in job configuration cannot exceed limits from worker.
### Example config file
```{.yml}
worker-id: 1
broker-uri: tcp://localhost:9657
broker-ping-interval: 10 # milliseconds
max-broker-liveness: 10
headers:
env:
- c
- cpp
threads: 2
hwgroup: "group1"
working-directory: /tmp/recodex
file-managers:
- hostname: "http://localhost:9999" # port is optional
username: "" # can be ignored in specific modules
password: "" # can be ignored in specific modules
file-cache: # only in case that there is cache module
cache-dir: "/tmp/recodex/cache"
logger:
file: "/var/log/recodex/worker" # w/o suffix - actual names will
# be worker.log, worker.1.log,...
level: "debug" # level of logging
max-size: 1048576 # 1 MB; max size of file before log rotation
rotations: 3 # number of rotations kept
limits:
time: 5 # in secs
wall-time: 6 # seconds
extra-time: 2 # seconds
stack-size: 0 # normal in KB, but 0 means no special limit
memory: 50000 # in KB
parallel: 1
disk-size: 50
disk-files: 5
environ-variable:
ISOLATE_BOX: "/box"
ISOLATE_TMP: "/tmp"
bound-directories:
- src: /tmp/recodex/eval_5
dst: /evaluate
mode: RW,NOEXEC
```
### Isolate sandbox
A new feature in version 1.3 is the possibility to limit an Isolate box to one
or more CPU or memory nodes. This functionality is provided by the _cpusets_
kernel mechanism and is now integrated into isolate. It is allowed to set only
`cpuset.cpus` and `cpuset.mems`, which should be just fine for sandbox
purposes. Further description of this kernel functionality can be found in the
manual page of _cpuset_ or in the Linux documentation in section
`linux/Documentation/cgroups/cpusets.txt`. As previously stated, these
settings can be applied to particular isolate boxes and have to be written in
the isolate configuration. The standard configuration path should be
`/usr/local/etc/isolate` but it may depend on your installation process. The
configuration of _cpuset_ in there is really simple and is described in the
example below.
```
box0.cpus = 0 # assign processor with ID 0 to isolate box with ID 0
box0.mems = 0 # assign memory node with ID 0
# if not set, linux by itself will decide where should
# the sandboxed programs run at
box2.cpus = 1-3 # assign range of processors to isolate box 2
box2.mems = 4-7 # assign range of memory nodes
box3.cpus = 1,2,3 # assign list of processors to isolate box 3
```
- **cpuset.cpus:** The cpus limitation will restrict the sandboxed program to
  the processor threads set in the configuration. On hyperthreaded processors
  this means that all virtual threads are assignable, not only the physical
  ones. The value can be a single number, a list of numbers separated by
  commas or a range with a hyphen delimiter.
- **cpuset.mems:** This value is particularly handy on NUMA systems which have
  several memory nodes. On standard desktop computers this value should always
  be zero because only one independent memory node is present. As stated in
  the `cpus` limitation, there can be a single value, a list of values
  separated by commas or a range stated with a hyphen.
## Broker
### Configuration items
Description of configurable items in broker's config. Mandatory items are bold,
optional italic.
- _clients_ -- specifies address and port to bind for clients (frontend
instance)
- _address_ -- hostname or IP address as string (`*` for any)
- _port_ -- desired port
- _workers_ -- specifies address and port to bind for workers
- _address_ -- hostname or IP address as string (`*` for any)
- _port_ -- desired port
- _max_liveness_ -- maximum amount of pings the worker can fail to send
before it is considered disconnected
- _max_request_failures_ -- maximum number of times a job can fail (due to
e.g. worker disconnect or a network error when downloading something from
the fileserver) and be assigned again
- _monitor_ -- settings of monitor service connection
- _address_ -- IP address of running monitor service
- _port_ -- desired port
- _notifier_ -- details of connection which is used in case of errors and good
to know states
- _address_ -- address where frontend API runs
- _port_ -- desired port
- _username_ -- username which can be used for HTTP authentication
- _password_ -- password which can be used for HTTP authentication
- _logger_ -- settings of logging capabilities
- _file_ -- path to the logging file with name without suffix.
`/var/log/recodex/broker` item will produce `broker.log`, `broker.1.log`,
...
- _level_ -- level of logging, one of `off`, `emerg`, `alert`, `critical`,
`err`, `warn`, `notice`, `info` and `debug`
- _max-size_ -- maximal size of log file before rotating
- _rotations_ -- number of rotations kept
### Example config file
```{.yml}
# Address and port for clients (frontend)
clients:
address: "*"
port: 9658
# Address and port for workers
workers:
address: "*"
port: 9657
max_liveness: 10
max_request_failures: 3
monitor:
address: "127.0.0.1"
port: 7894
notifier:
address: "127.0.0.1"
port: 8080
username: ""
password: ""
logger:
file: "/var/log/recodex/broker" # w/o suffix - actual names will be
# broker.log, broker.1.log, ...
level: "debug" # level of logging
max-size: 1048576 # 1 MB; max size of file before log rotation
rotations: 3 # number of rotations kept
```
## Monitor
Configuration file is located in the `monitor` subdirectory of the standard
ReCodEx configuration folder `/etc/recodex/`. It is in YAML format, like all
of the other configurations, and its format is very similar to the broker and
worker configurations.
### Configuration items
Description of configurable items; bold ones are required, italic ones are
optional.
- _websocket_uri_ -- URI of the WebSocket connection endpoint. Must be
  visible to the clients (directly or through a public proxy)
- string representation of IP address or a hostname
- port number
- _zeromq_uri_ -- URI of the ZeroMQ connection endpoint used by the broker.
  Can be hidden from the public internet.
- string representation of IP address or a hostname
- port number
- _logger_ -- settings of logging
- _file_ -- path with name of log file. Defaults to
`/var/log/recodex/monitor.log`
- _level_ -- logging level, one of "debug", "info", "warning", "error" and
"critical"
- _max-size_ -- maximum size of log file before rotation in bytes
- _rotations_ -- number of rotations kept
### Example configuration file
```{.yml}
---
websocket_uri:
- "127.0.0.1"
- 4567
zeromq_uri:
- "127.0.0.1"
- 7894
logger:
file: "/var/log/recodex/monitor.log"
level: "debug"
max-size: 1048576 # 1 MB
rotations: 3
...
```
## Cleaner
### Configuration items
- **cache-dir** -- directory which cleaner manages
- **file-age** -- age of files (in seconds) after which they are considered outdated and will be deleted
### Example configuration
```{.yml}
cache-dir: "/tmp"
file-age: "3600" # in seconds
```
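The expected behaviour can be sketched in a few lines of Python (an illustration of the semantics of the two items above, not the actual cleaner implementation):

```{.python}
import os
import time

def clean(cache_dir="/tmp", file_age=3600):
    # Remove cached files whose age exceeds file-age seconds.
    now = time.time()
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        if os.path.isfile(path) and now - os.path.getmtime(path) > file_age:
            os.remove(path)
```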
## REST API
The API can be configured in `config.neon` and `config.local.neon` files in
the `app/config` directory. The first file is predefined by authors and should
not be modified. The second one is not present and can be created by copying
the `config.local.neon.example` template in the config directory. Local
configuration has higher precedence, so it will override default values from
`config.neon`.
### Configurable items
Description of configurable items. All timeouts are in milliseconds if not
stated otherwise.
- accessManager -- configuration of access token in [JWT
  standard](https://www.rfc-editor.org/rfc/rfc7519.txt). Do **not** modify
  unless you really know what you are doing.
- fileServer -- connection to fileserver
- address -- URI of fileserver
- auth -- _username_ and _password_ for HTTP basic authentication
- timeouts -- _connection_ timeout for establishing new connection and
_request_ timeout for completing one request
- broker -- connection to broker
- address -- URI of broker
- auth -- _username_ and _password_ for broker callback authentication back
to API
- timeouts -- _ack_ timeout for the first response confirming that the broker
  received the message, _send_ timeout for how long to try sending a new job
  to the broker and _result_ timeout for how long to wait for confirmation
  whether the job can be processed or not
- monitor -- connection to monitor
- address -- URI of monitor
- CAS -- CAS external authentication
- serviceId -- visible identifier of this service
- ldapConnection -- parameters for connecting to LDAP, _hostname_,
_base_dn_, _port_, _security_ and _bindName_
- fields -- names of LDAP keys for information such as _email_, _firstName_
  and _lastName_
- emails -- common configuration for sending email (addresses and template
variables)
- apiUrl -- base URL of API server including port (for referencing pictures
in messages)
- footerUrl -- link in the message footer
- siteName -- name of frontend (ReCodEx, or KSP for unique instance for KSP
course)
- githubUrl -- URL to GitHub repository of this project
- from -- sending email address
- failures -- admin messages on errors
- emails -- additional info for sending mails, _to_ is admin mail address,
_from_ is source address, _subjectPrefix_ is prefix of mail subject
- forgottenPassword -- user messages for changing passwords
- redirectUrl -- URL of web application where the password can be changed
- tokenExpiration -- expiration timeout of temporary token (in seconds)
- emails -- additional info for sending mails, _from_ is source address and
_subjectPrefix_ is prefix of mail subject
- mail -- configuration of sending mails
- smtp -- using SMTP server, has to be "true"
- host -- address of the server
- port -- sending port (common values are 25, 465, 587)
- username -- login to the server
- password -- password to the server
- secure -- security, values are empty for no security, "ssl" or "tls"
- context -- additional parameters, depending on the used mail engine. For
  example self-signed certificates can be allowed by setting _verify_peer_
  and _verify_peer_name_ to false and _allow_self_signed_ to true under the
  _ssl_ key (see example).
Outside the parameters section of the configuration is the configuration for
Doctrine. It is an ORM framework which maps PHP objects (entities) into
database tables and rows. The configuration is simple, the only required items
are _user_, _password_ and _host_ with _dbname_, i.e. the address of the
database computer (mostly localhost) with the name of the ReCodEx database.
### Example local configuration file
```{.yml}
parameters:
accessManager:
leeway: 60
issuer: https://recodex.projekty.ms.mff.cuni.cz
audience: https://recodex.projekty.ms.mff.cuni.cz
expiration: 86400 # 24 hours in seconds
usedAlgorithm: HS256
allowedAlgorithms:
- HS256
verificationKey: "recodex-123"
fileServer:
address: http://127.0.0.1:9999
auth:
username: "user"
password: "pass"
timeouts:
connection: 500
broker:
address: tcp://127.0.0.1:9658
auth:
username: "user"
password: "pass"
timeouts:
ack: 100
send: 5000
result: 1000
monitor:
address: wss://recodex.projekty.ms.mff.cuni.cz:4443/ws
CAS:
serviceId: "cas-uk"
ldapConnection:
hostname: "ldap.cuni.cz"
base_dn: "ou=people,dc=cuni,dc=cz"
port: 389
security: SSL
bindName: "cunipersonalid"
fields:
email: "mail"
firstName: "givenName"
lastName: "sn"
emails:
apiUrl: https://recodex.projekty.ms.mff.cuni.cz:4000
footerUrl: https://recodex.projekty.ms.mff.cuni.cz
siteName: "ReCodEx"
githubUrl: https://github.com/ReCodEx
from: "ReCodEx <noreply@example.com>"
failures:
emails:
to: "Admin Name <admin@example.com>"
from: %emails.from%
subjectPrefix: "ReCodEx Failure Report - "
forgottenPassword:
redirectUrl: "https://recodex.projekty.ms.mff.cuni.cz/
forgotten-password/change"
    tokenExpiration: 600 # 10 minutes
emails:
from: %emails.from%
subjectPrefix: "ReCodEx Forgotten Password Request - "
mail:
smtp: true
host: "smtp.ps.stdin.cz"
port: 587
username: "user"
password: "pass"
secure: "tls"
context:
ssl:
verify_peer: false
verify_peer_name: false
allow_self_signed: true
doctrine:
user: "user"
password: "pass"
host: localhost
dbname: "recodex-api"
```
## Web application
### Configurable items
Description of configurable options. Bold are required values, optional ones are
in italics.
- **NODE_ENV** -- mode of the server
- **API_BASE** -- base address of API server, including port and API version
- **PORT** -- port where the app is listening
- _WEBPACK_DEV_SERVER_PORT_ -- port for webpack dev server when running in
development mode. Default one is 8081, this option might be useful when this
port is necessary for some other service.
### Example configuration file
```
NODE_ENV=production
API_BASE=https://recodex.projekty.ms.mff.cuni.cz:4000/v1
PORT=8080
```
<!---
// vim: set formatoptions=tqn flp+=\\\|^\\*\\s* textwidth=80 colorcolumn=+1:
-->

@ -88,171 +88,3 @@ both our internal login service and CAS.
An advantage of this approach is being able to control the authentication
process completely instead of just receiving session data through a global
variable.
## Installation
The web API requires a PHP runtime of version at least 7. Which one depends on the actual configuration; there is a choice between _mod_php_ inside Apache, _php-fpm_ with Apache or Nginx proxy, or running it as a standalone uWSGI script. It is common that some PHP extensions have to be installed on the system. Namely the ZeroMQ binding (`php-zmq` package or similar), the MySQL module (`php-mysqlnd` package) and the LDAP extension module for CAS authentication (`php-ldap` package). Make sure that the extensions are loaded in your `php.ini` file (`/etc/php.ini` or files in `/etc/php.d/`).
The API depends on some other projects and libraries. For managing them [Composer](https://getcomposer.org/) is used. It can be installed from system repositories or downloaded from the website, where detailed instructions are as well. Composer reads `composer.json` file in the project root and installs dependencies to the `vendor/` subdirectory. To do that, run:
```
$ composer install
```
## Configuration and usage
The API can be configured in `config.neon` and `config.local.neon` files in the `app/config` directory. The first file is predefined by authors and should not be modified. The second one is not present and can be created by copying the `config.local.neon.example` template in the config directory. Local configuration has higher precedence, so it will override default values from `config.neon`.
### Configurable items
Description of configurable items. All timeouts are in milliseconds if not stated otherwise.
- accessManager -- configuration of access token in [JWT standard](https://www.rfc-editor.org/rfc/rfc7519.txt). Do **not** modify unless you really know what you are doing.
- fileServer -- connection to fileserver
- address -- URI of fileserver
- auth -- _username_ and _password_ for HTTP basic authentication
- timeouts -- _connection_ timeout for establishing new connection and _request_ timeout for completing one request
- broker -- connection to broker
- address -- URI of broker
- auth -- _username_ and _password_ for broker callback authentication back to API
- timeouts -- _ack_ timeout for the first response confirming that the broker received the message, _send_ timeout for how long to try sending a new job to the broker and _result_ timeout for how long to wait for confirmation whether the job can be processed or not
- monitor -- connection to monitor
- address -- URI of monitor
- CAS -- CAS external authentication
- serviceId -- visible identifier of this service
- ldapConnection -- parameters for connecting to LDAP, _hostname_, _base_dn_, _port_, _security_ and _bindName_
- fields -- names of LDAP keys for information such as _email_, _firstName_ and _lastName_
- emails -- common configuration for sending email (addresses and template variables)
- apiUrl -- base URL of API server including port (for referencing pictures in messages)
- footerUrl -- link in the message footer
- siteName -- name of frontend (ReCodEx, or KSP for unique instance for KSP course)
- githubUrl -- URL to GitHub repository of this project
- from -- sending email address
- failures -- admin messages on errors
- emails -- additional info for sending mails, _to_ is admin mail address, _from_ is source address, _subjectPrefix_ is prefix of mail subject
- forgottenPassword -- user messages for changing passwords
- redirectUrl -- URL of web application where the password can be changed
- tokenExpiration -- expiration timeout of temporary token (in seconds)
- emails -- additional info for sending mails, _from_ is source address and _subjectPrefix_ is prefix of mail subject
- mail -- configuration of sending mails
- smtp -- using SMTP server, has to be "true"
- host -- address of the server
- port -- sending port (common values are 25, 465, 587)
- username -- login to the server
- password -- password to the server
- secure -- security, values are empty for no security, "ssl" or "tls"
- context -- additional parameters, depending on the used mail engine. For example self-signed certificates can be allowed by setting _verify_peer_ and _verify_peer_name_ to false and _allow_self_signed_ to true under the _ssl_ key (see example).
Outside the parameters section of the configuration is the configuration for Doctrine. It is an ORM framework which maps PHP objects (entities) into database tables and rows. The configuration is simple, the only required items are _user_, _password_ and _host_ with _dbname_, i.e. the address of the database computer (mostly localhost) with the name of the ReCodEx database.
### Example local configuration file
```{.yml}
parameters:
accessManager:
leeway: 60
issuer: https://recodex.projekty.ms.mff.cuni.cz
audience: https://recodex.projekty.ms.mff.cuni.cz
expiration: 86400 # 24 hours in seconds
usedAlgorithm: HS256
allowedAlgorithms:
- HS256
verificationKey: "recodex-123"
fileServer:
address: http://127.0.0.1:9999
auth:
username: "user"
password: "pass"
timeouts:
connection: 500
broker:
address: tcp://127.0.0.1:9658
auth:
username: "user"
password: "pass"
timeouts:
ack: 100
send: 5000
result: 1000
monitor:
address: wss://recodex.projekty.ms.mff.cuni.cz:4443/ws
CAS:
serviceId: "cas-uk"
ldapConnection:
hostname: "ldap.cuni.cz"
base_dn: "ou=people,dc=cuni,dc=cz"
port: 389
security: SSL
bindName: "cunipersonalid"
fields:
email: "mail"
firstName: "givenName"
lastName: "sn"
emails:
apiUrl: https://recodex.projekty.ms.mff.cuni.cz:4000
footerUrl: https://recodex.projekty.ms.mff.cuni.cz
siteName: "ReCodEx"
githubUrl: https://github.com/ReCodEx
from: "ReCodEx <noreply@example.com>"
failures:
emails:
to: "Admin Name <admin@example.com>"
from: %emails.from%
subjectPrefix: "ReCodEx Failure Report - "
forgottenPassword:
redirectUrl: "https://recodex.projekty.ms.mff.cuni.cz/
forgotten-password/change"
    tokenExpiration: 600 # 10 minutes
emails:
from: %emails.from%
subjectPrefix: "ReCodEx Forgotten Password Request - "
mail:
smtp: true
host: "smtp.ps.stdin.cz"
port: 587
username: "user"
password: "pass"
secure: "tls"
context:
ssl:
verify_peer: false
verify_peer_name: false
allow_self_signed: true
doctrine:
user: "user"
password: "pass"
host: localhost
dbname: "recodex-api"
```
### Database preparation
When the API is installed and configured (_doctrine_ section is sufficient here) the database schema can be generated. There is a prepared command to do that from command line:
```
$ php www/index.php orm:schema-tool:update --force
```
The API comes with some initial values, for example default user roles with proper permissions. To fill your database with these values, there is another command line command:
```
$ php www/index.php db:fill
```
Check the outputs of both commands for errors. If there are any, try to clean temporary API cache in `temp/cache/` directory and repeat the action.
### Webserver configuration
The simplest way to get started is to start the built-in PHP server in the root directory of your project:
```
$ php -S localhost:4000 -t www
```
Then visit `http://localhost:4000` in your browser to see the welcome page of API project.
For Apache or Nginx, set up a virtual host to point to the `www/` directory of the project and you should be ready to go. It is **critical** that the whole `app/`, `log/` and `temp/` directories are not accessible directly via a web browser (see [security warning](https://nette.org/security-warning)). Also it is **highly recommended** to set up an HTTPS certificate for public access to the API.
### Troubleshooting
In case of any issues first remove the Nette cache directory `temp/cache/` and try again. This solves most of the errors. If it does not help, examine API logs from `log/` directory of the API source or logs of your webserver.

@ -153,61 +153,3 @@ $ npm run exportStrings
```
This will create *JSON* files with the exported strings for the *'en'* and *'cs'* locale. If you want to export strings for more languages, you must edit the `/manageTranslations.js` script. The exported strings are placed in the `/src/locales` directory.
## Installation
Web application requires the [NodeJS](https://nodejs.org/en/) server as its runtime environment. This runtime is needed for executing JavaScript code on the server and sending pre-rendered parts of pages to clients, so the final rendering in browsers is a lot quicker and the page is accessible to search engines for indexing.
But some functionality is better handled by other full fledged web servers like *Apache* or *Nginx*, so the common practice is to use a tandem of both. *NodeJS* takes care of the basic functionality of the app while the other server (Apache) is set up as a reverse proxy providing additional functionality like SSL encryption, load balancing or caching of static files. The recommended setup contains both NodeJS and one of the Apache and Nginx web servers for the reasons discussed above.
Stable versions of the 4th and 6th series of the NodeJS server are sufficient; using at least the 6th series is highly recommended. Please check for the most recent version of the packages in your distribution's repositories; they are often outdated. However, there are third party repositories for all main Linux distributions.
The app depends on several libraries and components, all of them listed in the `package.json` file in the source repository. Dependencies are managed with the node package manager (`npm`), which may come with the NodeJS installation or can be installed separately. To fetch and install all dependencies run:
```
$ npm install
```
For easy production usage there is an additional package for managing NodeJS processes, `pm2`. This tool can run your application as a daemon, monitor occupied resources, gather logs and provide simple console interface for managing app's state. To install it globally into your system run:
```
# npm install pm2 -g
```
## Configuration and usage
The application can be run in two modes, development and production. Development mode uses only client rendering and tracks code changes with rebuilds of the application in real time. In production mode the compilation (transpile to _ES5_ standard using *Babel* and bundle into single file using *webpack*) has to be done separately prior to running. The scripts for compilation are provided as additional `npm` commands.
- Development mode can be used for local testing of the app. This mode uses the webpack dev server, so all code runs on the client and there is no server side rendering available. Starting it is a simple command, the default address is http://localhost:8080.
```
$ npm run dev
```
- Production mode is mostly used on the servers. It provides all features such as server side rendering. This can be run via:
```
$ npm run build
$ npm start
```
Both modes can be configured to use different ports or set base address of used API server. This can be configured in `.env` file in root of the repository. There is `.env-sample` file which can be just copied and altered.
The production mode can also be run as a daemon controlled by the `pm2` tool. First the web application has to be built and then the server javascript file can run as a daemon.
```
$ npm run build
$ pm2 start bin/server.js
```
The `pm2` tool has several options, most notably _status_, _stop_, _restart_ and _logs_. Further description is available on project [website](http://pm2.keymetrics.io).
#### Configurable items
Description of configurable options. Bold are required values, optional ones are in italics.
- **NODE_ENV** -- mode of the server
- **API_BASE** -- base address of API server, including port and API version
- **PORT** -- port where the app is listening
- _WEBPACK_DEV_SERVER_PORT_ -- port for webpack dev server when running in development mode. Default one is 8081, this option might be useful when this port is necessary for some other service.
#### Example configuration file
```
NODE_ENV=production
API_BASE=https://recodex.projekty.ms.mff.cuni.cz:4000/v1
PORT=8080
```

@ -82,245 +82,6 @@ Isolate is executed in separate Linux process created by `fork` and `exec` syste
Sandbox in general has to be a command line application taking parameters with arguments, standard input or file. Outputs should be written to file or standard output. There are no other requirements, worker design is very versatile and can be adapted to different needs.
## Installation
### Dependencies
Worker specific requirements are written in this section. It covers only basic requirements, additional runtimes or tools may be needed depending on type of use. The package names are for CentOS if not specified otherwise.
- ZeroMQ in version at least 4.0, packages `zeromq` and `zeromq-devel` (`libzmq3-dev` on Debian)
- YAML-CPP library, `yaml-cpp` and `yaml-cpp-devel` (`libyaml-cpp0.5v5` and `libyaml-cpp-dev` on Debian)
- libcurl library `libcurl-devel` (`libcurl4-gnutls-dev` on Debian)
- libarchive library as optional dependency. Installing will speed up build process, otherwise libarchive is built from source during installation. Package name is `libarchive` and `libarchive-devel` (`libarchive-dev` on Debian)
**Install Isolate from source**
First, we need to compile sandbox Isolate from source and install it. Current worker is tested against version 1.3, so this version needs to be checked out. Assume that we keep source code in `/opt/src` dir. For building man page you need to have package `asciidoc` installed.
```
$ cd /opt/src
$ git clone https://github.com/ioi/isolate.git
$ cd isolate
$ git checkout v1.3
$ make
# make install && make install-doc
```
For proper work Isolate depends on several advanced features of the Linux kernel. Make sure that your kernel is compiled with `CONFIG_PID_NS`, `CONFIG_IPC_NS`, `CONFIG_NET_NS`, `CONFIG_CPUSETS`, `CONFIG_CGROUP_CPUACCT`, `CONFIG_MEMCG`. If your machine has swap enabled, also check `CONFIG_MEMCG_SWAP`. The flags your kernel was compiled with can be found in the `/boot` directory, in a file named `config-` followed by your kernel version. Red Hat based distributions should have these enabled by default; for Debian you may want to add the parameters `cgroup_enable=memory swapaccount=1` to the kernel command line, which can be set through the `GRUB_CMDLINE_LINUX_DEFAULT` value in the `/etc/default/grub` file.
For better reproducibility of results, some kernel parameters can be tweaked:
- Disable address space randomization. Create file `/etc/sysctl.d/10-recodex.conf` with content `kernel.randomize_va_space=0`. Changes will take effect after restart or run `sysctl kernel.randomize_va_space=0` command.
- Disable dynamic CPU frequency scaling. This requires setting the cpufreq scaling governor to _performance_.
### Clone worker source code repository
```
$ git clone https://github.com/ReCodEx/worker.git
$ git submodule update --init
```
### Install worker on Linux
It is supposed that your current working directory is the one with the cloned worker source codes.
- Prepare environment running `mkdir build && cd build`
- Build sources by `cmake ..` following by `make`
- Build binary package by `make package` (may require root permissions).
Note that `rpm` and `deb` packages are built at the same time. You may need to have the `rpmbuild` command (usually in the `rpmbuild` or `rpm` package) or edit the CPACK_GENERATOR variable in the _CMakeLists.txt_ file in the root of the source code tree.
- Install generated package through your package manager (`yum`, `dnf`, `dpkg`).
The worker installation process is composed of following steps:
- create config file `/etc/recodex/worker/config-1.yml`
- create systemd unit file `/etc/systemd/system/recodex-worker@.service`
- put main binary to `/usr/bin/recodex-worker`
- put judges binaries to `/usr/bin/` directory
- create system user and group `recodex` with `/sbin/nologin` shell (if not already existing)
- create log directory `/var/log/recodex`
- set ownership of config (`/etc/recodex`) and log (`/var/log/recodex`) directories to `recodex` user and group
_Note:_ If you do not want to generate binary packages, you can just install the project with `make install` (as root). But installation through your distribution's package manager is the preferred way to keep your system clean and manageable in the long term.
### Install worker on Windows
From the beginning we were determined to support the Windows operating system, on which some of the workers may run (especially for projects in the C# programming language). Support for Windows is quite hard and time consuming and there were several problems during the development. To ensure the capability of compilation on Windows we set up a CI for Windows named [Appveyor](http://www.appveyor.com/). However, installation should be easy due to the provided installation script.
There are only two additional dependencies needed, **Windows 7 and higher** and **Visual Studio 2015+**. Provided simple installation batch script should do all the work on Windows machine. Officially only VS2015 and 32-bit compilation is supported, because of hardcoded compile options in installation script. If different VS or different platform is needed, the script should be changed to appropriate values, which is simple and straightforward.
Mentioned script is placed in *install* directory alongside supportive scripts for UNIX systems and is named *win-build.cmd*. Provided script will do almost all the work connected with building and dependency resolving (using **NuGet** package manager and `msbuild` building system). Script should be run under 32-bit version of _Developer Command Prompt for VS2015_ and from *install* directory.
Building and installing of worker is then quite simple, script has command line parameters which can be used to specify what will be done:
- *-build* -- The default option if none is specified. Builds worker and its tests; all is saved in the *build* folder and subfolders.
- *-clean* -- Cleanup of downloaded NuGet packages and built application/libraries.
- *-test* -- Build worker and run tests on compiled test cases.
- *-package* -- Generation of clickable installation using cpack and [NSIS](http://nsis.sourceforge.net/) (has to be installed on machine to get this to work).
```
install> win-build.cmd # same as: win-build.cmd -build
install> win-build.cmd -clean
install> win-build.cmd -test
install> win-build.cmd -package
```
All built binaries and cmake temporary files can be found in the *build*
folder; typically there will be a *Release* subfolder which will contain the
compiled application with all needed DLLs. Once the clickable installation
binary is created, it can be found in the *build* folder under the name
*recodex-worker-VERSION-win32.exe*. A sample screenshot can be found in the following picture.
![NSIS Installation](https://github.com/ReCodEx/wiki/blob/master/images/nsis_installation.png)
## Configuration and usage
Following text describes how to set up and run **worker** program. It is supposed to have required binaries installed. Also, using systemd is recommended for best user experience, but it is not required. Almost all modern Linux distributions are using systemd nowadays.
### Default worker configuration
Worker should have some default configuration which is applied to the worker itself or may be used in given jobs (implicitly if something is missing, or explicitly with special variables). This configuration should be hardcoded and can be overridden by an explicitly declared configuration file. The format of this configuration is YAML with a structure similar to the job configuration.
#### Configuration items
Mandatory items are bold, optional italic.
- **worker-id** -- unique identification of the worker at one server. This id is used by the _isolate_ sandbox on Linux systems, so make sure to meet isolate's requirements (default is a number from 1 to 999).
- _worker-description_ -- human readable description of this worker
- **broker-uri** -- URI of the broker (hostname, IP address, including port, ...)
- _broker-ping-interval_ -- time interval how often to send ping messages to broker. Used units are milliseconds.
- _max-broker-liveness_ -- specifies how many pings in a row can broker miss without making the worker dead.
- _headers_ -- map of headers specifies worker's capabilities
- _env_ -- list of environmental variables which are sent to broker in init command
- _threads_ -- information about available threads for this worker
- **hwgroup** -- hardware group of this worker. Hardware group must specify worker hardware and software capabilities and it is main item for broker routing decisions.
- _working-directory_ -- where all needed files will be stored. Can be the same for multiple workers on one server.
- **file-managers** -- addresses and credentials to all file managers used (e.g. all different frontends using this worker)
- **hostname** -- URI of file manager
- _username_ -- username for http authentication (if needed)
- _password_ -- password for http authentication (if needed)
- _file-cache_ -- configuration of caching feature
- _cache-dir_ -- path to caching directory. Can be the same for multiple workers.
- _logger_ -- settings of logging capabilities
- _file_ -- path to the logging file with name without suffix. `/var/log/recodex/worker` item will produce `worker.log`, `worker.1.log`, ...
- _level_ -- level of logging, one of `off`, `emerg`, `alert`, `critical`, `err`, `warn`, `notice`, `info` and `debug`
- _max-size_ -- maximum size of the log file before rotation
- _rotations_ -- number of rotations kept
- _limits_ -- default sandbox limits for this worker. All items are described in the assignments section of the job configuration description. If some limits are not set in the job configuration, the defaults from the worker configuration are used; in that case the worker's defaults also act as the maximum for the job. Limits in the job configuration cannot exceed the worker's limits.
#### Example config file
```{.yml}
worker-id: 1
broker-uri: tcp://localhost:9657
broker-ping-interval: 10 # milliseconds
max-broker-liveness: 10
headers:
env:
- c
- cpp
threads: 2
hwgroup: "group1"
working-directory: /tmp/recodex
file-managers:
- hostname: "http://localhost:9999" # port is optional
username: "" # can be ignored in specific modules
password: "" # can be ignored in specific modules
file-cache: # only in case that there is cache module
cache-dir: "/tmp/recodex/cache"
logger:
file: "/var/log/recodex/worker" # w/o suffix - actual names will
# be worker.log, worker.1.log,...
level: "debug" # level of logging
max-size: 1048576 # 1 MB; max size of file before log rotation
rotations: 3 # number of rotations kept
limits:
time: 5 # in secs
wall-time: 6 # seconds
extra-time: 2 # seconds
stack-size: 0 # normal in KB, but 0 means no special limit
memory: 50000 # in KB
parallel: 1
disk-size: 50
disk-files: 5
environ-variable:
ISOLATE_BOX: "/box"
ISOLATE_TMP: "/tmp"
bound-directories:
- src: /tmp/recodex/eval_5
dst: /evaluate
mode: RW,NOEXEC
```
### Running the worker
A systemd unit file is distributed with the worker to simplify its launch. It
integrates the worker nicely into your Linux system and allows you to run it
automatically on system startup. It is possible to have more than one worker on
a server, so the provided unit file is templated. Each instance of the
worker unit has a unique string identifier, which is used for managing that
instance through systemd. By default, only one worker instance is ready to use
after installation and its ID is "1".
- Starting the worker with ID "1" can be done this way:
```
# systemctl start recodex-worker@1.service
```
Check whether the worker is running with
```
# systemctl status recodex-worker@1.service
```
You should see an "active (running)" message.
- The worker can be stopped or restarted using the `systemctl stop` and `systemctl restart` commands, respectively.
- If you want the worker to start automatically on system startup, run:
```
# systemctl enable recodex-worker@1.service
```
For further information about using systemd, please refer to the systemd documentation.
### Adding a new worker
To add a new worker, you need to do a few steps:
- Make up a unique string ID.
- Copy the default configuration file `/etc/recodex/worker/config-1.yml` to the same directory and name it `config-<your_unique_ID>.yml`.
- Edit that config file to fit your needs. Note that you must at least change the _worker-id_ and _logger file_ values to be unique.
- Run the new instance using
```
# systemctl start recodex-worker@<your_unique_ID>.service
```
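If the new instance should also start automatically on boot, it can be enabled the same way as instance "1" above:
```
# systemctl enable recodex-worker@<your_unique_ID>.service
```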
## Sandboxes
### Isolate
Isolate is used as the one and only sandbox for Linux-based operating systems. The project is hosted on [GitHub](https://github.com/ioi/isolate) and more about its installation and setup can be found in the [installation](#installation) section. Isolate uses Linux kernel features for sandboxing, namely _kernel namespaces_ and _cgroups_, and thus its security depends on them. Similar functionality can now be partially achieved with systemd.
From the very beginning of the ReCodEx project it was clear that the Isolate sandbox would be used for the Linux environment. There is no suitable general-purpose sandbox on the Windows platform, so the main operating system of the whole backend should be Linux-based. The set of operations supported by Isolate seems reasonable for any sandbox, so most of its functionality is accessible from the job configuration. As there is no other sandbox, the naming often reflects Isolate's names. However, the worker is prepared to run on Windows too, so integrating other sandboxes (as libraries or command-line tools) is possible.
Isolate provides a wide range of functionality which can be used to limit resources or even cut off particular resources from the sandboxed program. There are of course basics like limiting CPU time and memory consumption, but there is also wall-time (the human perception of time) and extra-time, an extra limit added on top of the other time limits to increase the chance that the sandboxed program exits successfully. Other features include limiting stack size and redirecting stdin, stdout or stderr from/to a file. Also worth mentioning is limiting the number of processes/threads which can be created and defining environment variables which are passed to the sandboxed program.
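For a concrete picture, a direct isolate invocation exercising these features might look like the following sketch (flag names come from the isolate man page; the box ID, program and file names are made up):
```
$ isolate --init --box-id=1
$ # limit CPU, wall and extra time (seconds), memory (KB) and process count,
$ # pass an environment variable and redirect stdin/stdout from/to files
$ isolate --run --box-id=1 --time=5 --wall-time=6 --extra-time=2 \
    --mem=50000 --processes=1 --env=HOME=/box \
    --stdin=input.txt --stdout=output.txt -- ./solution
$ isolate --cleanup --box-id=1
```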
Filesystem handling is a chapter of its own. Isolate uses the mount kernel namespace to create a "virtual" filesystem which is mounted for the sandboxed program. By default only a few read-only files/directories are mapped into the sandbox (described in the Isolate man page). This can of course be changed by providing additional directories as isolate parameters. By default directories are mapped as read-only, but Isolate has a few access options which can be set on a mount point.
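As a small sketch of these mount-point rules, the bound directory from the example worker configuration above could be expressed with isolate's `--dir` option (syntax per the isolate man page; the box ID and program name are made up):
```
$ # bind /tmp/recodex/eval_5 into the box as /evaluate,
$ # writable but with execution of binaries disallowed
$ isolate --run --box-id=1 --dir=evaluate=/tmp/recodex/eval_5:rw:noexec -- ./solution
```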
#### Limiting isolate boxes to particular CPUs or memory nodes
A new feature in version 1.3 is the possibility to limit an Isolate box to one or more CPUs or memory nodes. This functionality is provided by the _cpusets_ kernel mechanism and is now integrated into isolate. Only `cpuset.cpus` and `cpuset.mems` can be set, which should be just fine for sandbox purposes. As this is kernel functionality, a further description can be found in the manual page of _cpuset_ or in the Linux documentation in `linux/Documentation/cgroups/cpusets.txt`. As previously stated, these settings are applied to particular isolate boxes and have to be written in the isolate configuration. The standard configuration path should be `/usr/local/etc/isolate`, but it may depend on your installation process. The _cpuset_ configuration there is really simple and is described in the example below.
```
box0.cpus = 0 # assign processor with ID 0 to isolate box with ID 0
box0.mems = 0 # assign memory node with ID 0
# if not set, linux by itself will decide where should
# the sandboxed programs run at
box2.cpus = 1-3 # assign range of processors to isolate box 2
box2.mems = 4-7 # assign range of memory nodes
box3.cpus = 1,2,3 # assign list of processors to isolate box 3
```
- **cpuset.cpus:** The CPU limitation restricts the sandboxed program to the processor threads set in the configuration. On hyper-threaded processors this means that all virtual threads are assignable, not only the physical ones. The value can be a single number, a comma-separated list of numbers, or a hyphen-delimited range.
- **cpuset.mems:** This value is particularly handy on NUMA systems, which have several memory nodes. On standard desktop computers this value should always be zero, because only one independent memory node is present. As with the `cpus` limitation, the value can be a single number, a comma-separated list, or a hyphen-delimited range.
### WrapSharp
WrapSharp is a sandbox for C# programs, itself written in C#. We wrote it as a proof-of-concept sandbox for use in the Windows environment. However, it is not properly tested and not yet integrated into the worker. A security audit should be done before using it in production. After that, with just a little integration effort, there can be a running sandbox for C# programs on Windows systems.
## Cleaner
### Description
There is a bit of a catch with the cleaner service: to work properly, the server filesystem has to keep last access timestamps, which are often disabled for performance reasons.
Another possibility seems to be updating the last modified timestamp when accessing the file. This timestamp is used in most major filesystems, so there are fewer compatibility issues than with the last access timestamp. The modified timestamp then has to be updated by the workers on each access, for example using the `touch` command or similar. The final decision between these two approaches will be made after practical experience with running the production system.
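For illustration, a worker bumping the modified timestamp of a cached file could look like this (the cached file name is a made-up example under the cache directory from the worker configuration above):
```
$ touch /tmp/recodex/cache/solution_archive.tar.gz          # update mtime on access
$ stat -c '%y' /tmp/recodex/cache/solution_archive.tar.gz   # verify the new timestamp
```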
### Installation
To install and use the cleaner, it is necessary to have Python 3 with the `pip` package manager installed.
- The cleaner's dependencies have to be installed:
```
$ pip install -r requirements.txt
```
- On RPM-based distributions you can build and install a binary package like this:
```
$ python setup.py bdist_rpm --post-install ./cleaner/install/postinst
# yum install ./dist/recodex-cleaner-<version>-1.noarch.rpm
```
- Other Linux distributions can install the cleaner directly:
```
$ python setup.py install --install-scripts /usr/bin
# ./cleaner/install/postinst
```
- For Windows installation, do the following:
- start `cmd` with administrator permissions
- run installation with
```
> python setup.py install --install-scripts \
"C:\Program Files\ReCodEx\cleaner"
```
where the path specified with `--install-scripts` can be changed
- copy the configuration file alongside the installed executable using
```
> copy install\config.yml \
"C:\Program Files\ReCodEx\cleaner\config.yml"
```
### Configuration and usage
#### Configuration items
- **cache-dir** -- directory which the cleaner manages
- **file-age** -- file age in seconds; older files are considered outdated and will be deleted
#### Example configuration
```{.yml}
cache-dir: "/tmp"
file-age: "3600" # in seconds
```
#### Usage
As stated before, the cleaner should be run periodically. On Linux systems this can be done by the built-in `cron` service or, if `systemd` is present, the cleaner itself provides a `*.timer` file which can be used for scheduling through `systemd`. On Windows systems the internal task scheduler should be used.
- Running the cleaner from the command line is fairly simple:
```
$ recodex-cleaner -c /etc/recodex/cleaner
```
- Start the cleaner timer using systemd (see the note below this list for enabling it at boot):
```
$ systemctl start recodex-cleaner.timer
```
- Add the cleaner to the Linux cron service using the following crontab line:
```
0 0 * * * /usr/bin/recodex-cleaner -c /etc/recodex/cleaner/config.yml
```
- Add the cleaner to the Windows scheduler service with the following command:
```
> schtasks /create /sc daily /tn "ReCodEx Cleaner" /tr \
"\"C:\Program Files\ReCodEx\cleaner\recodex-cleaner.exe\" \
-c \"C:\Program Files\ReCodEx\cleaner\config.yml\""
```
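To have the systemd timer from the list above also activated automatically on boot, it can additionally be enabled in the usual systemd way:
```
$ systemctl enable recodex-cleaner.timer
```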