# Worker
## Description
The worker's job is to securely execute submitted assignments and possibly
evaluate results against model solutions provided by the submitter. The worker is logically
divided into two parts:
- **Listener** - communicates with broker through
[ZeroMQ](http://zeromq.org/). On startup, it introduces itself to the broker.
Then it receives new jobs, passes them to the **evaluator** part and sends
back results and progress reports.
- **Evaluator** - gets jobs from the **listener** part, evaluates them (possibly
in sandbox) and notifies the other part when the evaluation ends. **Evaluator**
also communicates with fileserver, downloads supplementary files and
uploads detailed results.
These parts run in separate threads of the same process and communicate through a ZeroMQ in-process socket. An alternative approach would be a shared memory region with exclusive access, but messaging is generally accepted to be safer. Shared memory has to be used very carefully because of race conditions when reading and writing concurrently. Also, messages inside the worker are small, so copying data between threads adds no significant overhead. This two-threaded design allows the worker to keep sending `ping` messages even while it is processing a job.
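The message-passing idea can be illustrated with a minimal Python sketch. It is not the worker's code (the worker uses ZeroMQ in-process sockets); here the standard `queue.Queue` stands in for the socket, and the function names and message fields are hypothetical.

```python
import queue
import threading


def evaluator(jobs: queue.Queue, results: queue.Queue) -> None:
    """Evaluator thread: process jobs until the None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        # Hypothetical "evaluation"; a real worker would run the sandbox here.
        results.put({"job_id": job["job_id"], "status": "OK"})


def run_listener(job_ids):
    """Listener side: hand jobs over and collect results as messages."""
    jobs, results = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=evaluator, args=(jobs, results))
    thread.start()
    for job_id in job_ids:
        jobs.put({"job_id": job_id})  # small message, copied between threads
    jobs.put(None)  # sentinel: no more jobs
    thread.join()
    return [results.get() for _ in job_ids]
```

Because the two sides share only messages, the listener thread stays free to do other work (such as sending pings) while the evaluator is busy.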
After receiving an evaluation request, the worker has to:
- download the archive containing submitted source files and configuration file
- download any supplementary files based on the configuration file, such as test
inputs or helper programs (this is done on demand, using a `fetch` command
in the assignment configuration)
- evaluate the submission according to the job configuration
- during evaluation, progress messages can be sent back to the broker
- upload the results of the evaluation to the fileserver
- notify broker that the evaluation finished
### Header matching
Every worker belongs to exactly one **hardware group** and has a set of **headers**.
These properties help the broker decide which worker is suitable for processing
a request.
The hardware group is a string identifier used to group worker machines with
similar hardware configuration, for example "i7-4560-quad-ssd". It is
important for assignments where running times are compared to those of reference
solutions (we have to make sure that both programs run on similar hardware).
The headers are a set of key-value pairs that describe the worker's
capabilities -- which runtime environments are installed, how many threads
the worker can run or whether it measures time precisely.
This information is sent to the broker on startup using the `init` command.
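As an illustration only, the suitability check could look like the following Python sketch. The matching rules are an assumption (same hardware group, and every requested header value offered by the worker); the broker's real rules may differ, and the function name and dictionary layout are hypothetical.

```python
def worker_matches(worker: dict, request: dict) -> bool:
    """Return True if the worker can serve the request (assumed rules)."""
    if worker["hwgroup"] != request["hwgroup"]:
        return False
    for key, wanted in request.get("headers", {}).items():
        offered = worker["headers"].get(key)
        # A header may hold a single value or a list of values (e.g. env).
        offered_values = offered if isinstance(offered, list) else [offered]
        if wanted not in offered_values:
            return False
    return True
```

For example, a worker in hardware group `group1` with headers `env: [c, cpp]` and `threads: "2"` would match a request for `env: cpp` in `group1`, but not a request for `env: java`.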
## Architecture
The picture below shows the internal architecture of the worker: its classes with their private variables and public functions.
[comment]: TODO: FIX THIS: ![Worker class diagram](https://rawgit.com/ReCodEx/wiki/master/images/worker_class_diagram.svg)
## Installation
### Dependencies
Worker-specific requirements are listed in this section. It covers only the basic requirements; additional runtimes or tools may be needed depending on the type of use. The package names are for CentOS unless specified otherwise.
- ZeroMQ in version at least 4.0, packages `zeromq` and `zeromq-devel` (`libzmq3-dev` on Debian)
- YAML-CPP library, `yaml-cpp` and `yaml-cpp-devel` (`libyaml-cpp0.5v5` and `libyaml-cpp-dev` on Debian)
- libcurl library, package `libcurl-devel` (`libcurl4-gnutls-dev` on Debian)
- libarchive library as an optional dependency. Installing it speeds up the build process; otherwise libarchive is built from source during installation. Package names are `libarchive` and `libarchive-devel` (`libarchive-dev` on Debian)
**Install Isolate from source**
First, we need to compile the Isolate sandbox from source and install it. The current worker is tested against version 1.3, so this version needs to be checked out. We assume that the source code is kept in the `/opt/src` directory. Building the man page requires the `asciidoc` package.
```
$ cd /opt/src
$ git clone https://github.com/ioi/isolate.git
$ cd isolate
$ git checkout v1.3
$ make
# make install && make install-doc
```
To work properly, Isolate depends on several advanced features of the Linux kernel. Make sure that your kernel is compiled with `CONFIG_PID_NS`, `CONFIG_IPC_NS`, `CONFIG_NET_NS`, `CONFIG_CPUSETS`, `CONFIG_CGROUP_CPUACCT`, `CONFIG_MEMCG`. If your machine has swap enabled, also check `CONFIG_MEMCG_SWAP`. The flags your kernel was compiled with can be found in the `/boot` directory, in the file named `config-` followed by the version of your kernel. Red Hat based distributions should have these enabled by default; on Debian you may want to add the parameters `cgroup_enable=memory swapaccount=1` to the kernel command line, which can be done by adding them to the `GRUB_CMDLINE_LINUX_DEFAULT` value in the `/etc/default/grub` file.
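The check of the kernel flags can be automated. The following Python sketch is only a convenience helper, not part of the worker; the function name is hypothetical, and it treats a flag as enabled only when the config contains `FLAG=y`.

```python
REQUIRED_FLAGS = [
    "CONFIG_PID_NS", "CONFIG_IPC_NS", "CONFIG_NET_NS",
    "CONFIG_CPUSETS", "CONFIG_CGROUP_CPUACCT", "CONFIG_MEMCG",
]


def missing_flags(config_text: str, required=REQUIRED_FLAGS) -> list:
    """Return the required flags that are not set to 'y' in a kernel config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_NET_NS is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [flag for flag in required if flag not in enabled]
```

On a typical system the text would be read from the file `/boot/config-` followed by the kernel version (for example the value returned by `platform.release()`); an empty result means all listed flags are enabled.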
For better reproducibility of results, some kernel parameters can be tweaked:
- Disable address space randomization. Create the file `/etc/sysctl.d/10-recodex.conf` with the content `kernel.randomize_va_space=0`. The change takes effect after a restart, or immediately by running the `sysctl kernel.randomize_va_space=0` command.
- Disable dynamic CPU frequency scaling. This requires setting the cpufreq scaling governor to _performance_.
### Clone worker source code repository
```
$ git clone https://github.com/ReCodEx/worker.git
$ cd worker
$ git submodule update --init
```
### Install worker on Linux
It is assumed that your current working directory is the one with the cloned worker source code.
- Prepare the environment by running `mkdir build && cd build`
- Build the sources with `cmake ..` followed by `make`
- Build a binary package with `make package` (may require root permissions).
Note that `rpm` and `deb` packages are built at the same time. You may need the `rpmbuild` command (usually in the `rpmbuild` or `rpm` package), or you can edit the CPACK_GENERATOR variable in the _CMakeLists.txt_ file in the root of the source tree.
- Install the generated package through your package manager (`yum`, `dnf`, `dpkg`).
The worker installation process consists of the following steps:
- create config file `/etc/recodex/worker/config-1.yml`
- create systemd unit file `/etc/systemd/system/recodex-worker@.service`
- put main binary to `/usr/bin/recodex-worker`
- put judge binaries to `/usr/bin/recodex-judge-normal`, `/usr/bin/recodex-judge-shuffle` and `/usr/bin/recodex-judge-filter`
- create system user and group `recodex` with `/sbin/nologin` shell (if not already existing)
- create log directory `/var/log/recodex`
- set ownership of config (`/etc/recodex`) and log (`/var/log/recodex`) directories to `recodex` user and group
_Note:_ If you do not want to generate binary packages, you can install the project with `make install` (as root). However, installing through your distribution's package manager is the preferred way to keep your system clean and manageable in the long term.
### Install worker on Windows
From the beginning we were determined to support the Windows operating system, on which some of the workers may run (especially for projects in the C# programming language). Supporting Windows is quite hard and time consuming, and there were several problems during development. To make sure the worker keeps compiling on Windows, we set up continuous integration for Windows on [Appveyor](http://www.appveyor.com/). Installation itself should be easy thanks to the provided installation script.
There are only two additional requirements: **Windows 7 or higher** and **Visual Studio 2015+**. The provided installation batch script should do all the work on a Windows machine. Officially, only VS2015 and 32-bit compilation are supported, because of compile options hardcoded in the installation script. If a different VS version or platform is needed, the script has to be changed to the appropriate values, which is straightforward.
The script is named *win-build.cmd* and is placed in the *install* directory alongside supporting scripts for UNIX systems. It does almost all the work connected with building and dependency resolution (using the **NuGet** package manager and the `msbuild` build system). The script should be run from the *install* directory in the 32-bit version of the _Developer Command Prompt for VS2015_.
Building and installing the worker is then quite simple. The script has command line parameters which specify what will be done:
- *-build* -- The default option if none is specified. Builds the worker and its tests; everything is saved in the *build* folder and its subfolders.
- *-clean* -- Cleans up downloaded NuGet packages and built application/libraries.
- *-test* -- Builds the worker and runs the compiled tests.
- *-package* -- Generates a clickable installer using cpack and [NSIS](http://nsis.sourceforge.net/) (which has to be installed on the machine for this to work).
```
install> win-build.cmd # same as: win-build.cmd -build
install> win-build.cmd -clean
install> win-build.cmd -test
install> win-build.cmd -package
```
All built binaries and cmake temporary files can be found in the *build* folder;
typically there will be a *Release* subfolder which contains the compiled
application with all needed DLLs. Once the clickable installer is
created, it can be found in the *build* folder under the name
*recodex-worker-VERSION-win32.exe*. A sample screenshot is shown in the following picture.
![NSIS Installation](https://github.com/ReCodEx/wiki/blob/master/images/nsis_installation.png)
## Configuration and usage
The following text describes how to set up and run the **worker** program. It assumes that the required binaries are installed. Using systemd is recommended for the best user experience, but it is not required. Almost all modern Linux distributions use systemd nowadays.
### Default worker configuration
The worker should have a default configuration which is applied to the worker itself or may be used in given jobs (implicitly if something is missing, or explicitly with special variables). This configuration should be hardcoded and can be overridden by an explicitly declared configuration file. The format of this configuration is YAML, with a structure similar to the job configuration.
#### Configuration items
Mandatory items are bold, optional ones italic.
- **worker-id** - unique identification of the worker on one server. This id is used by the _isolate_ sandbox on Linux systems, so make sure it meets isolate's requirements (by default a number from 1 to 999).
- **broker-uri** - URI of the broker (hostname or IP address, including port, ...)
- _broker-ping-interval_ - how often ping messages are sent to the broker, in milliseconds.
- _max-broker-liveness_ - specifies how many pings in a row the broker can miss without making the worker dead.
- _headers_ - map of headers specifying the worker's capabilities
    - _env_ - list of environment variables which are sent to the broker in the `init` command
    - _threads_ - information about available threads for this worker
- **hwgroup** - hardware group of this worker. The hardware group must describe the worker's hardware and software capabilities, and it is the main item for broker routing decisions.
- _working-directory_ - where all needed files will be stored. Can be the same for multiple workers on one server.
- **file-managers** - addresses and credentials of all file managers used (e.g. all different frontends using this worker)
    - **hostname** - URI of the file manager
    - _username_ - username for HTTP authentication (if needed)
    - _password_ - password for HTTP authentication (if needed)
- _file-cache_ - configuration of the caching feature
    - _cache-dir_ - path to the caching directory. Can be the same for multiple workers.
- _logger_ - settings of logging capabilities
    - _file_ - path to the log file, with the name given without suffix. `/var/log/recodex/worker` will produce `worker.log`, `worker.1.log`, ...
    - _level_ - level of logging, one of `off`, `emerg`, `alert`, `critical`, `err`, `warn`, `notice`, `info` and `debug`
    - _max-size_ - maximal size of the log file before rotating
    - _rotations_ - number of rotations kept
- _limits_ - default sandbox limits for this worker. All items are described in the assignments section of the job configuration description. If some limits are not set in the job configuration, defaults from the worker config are used. In that case the worker's defaults are also set as the maximum for the job. Also, limits in the job configuration cannot exceed the limits of the worker.
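The rule for combining worker and job limits can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, only plain numeric limits are handled (special values such as `stack-size: 0` meaning "no limit" are not), and raising an error for an exceeding limit is an assumption, since the text above does not say how the worker reacts.

```python
def effective_limits(worker_limits: dict, job_limits: dict) -> dict:
    """Merge job limits over worker defaults; worker values act as maxima."""
    result = dict(worker_limits)  # start from the worker's defaults
    for name, value in job_limits.items():
        maximum = worker_limits.get(name)
        if maximum is not None and value > maximum:
            # Assumed reaction: reject the job configuration.
            raise ValueError(
                f"limit {name}={value} exceeds worker maximum {maximum}"
            )
        result[name] = value
    return result
```

With worker limits `time: 5` and `memory: 50000`, a job that sets only `time: 2` ends up with `time: 2` and `memory: 50000`, while a job asking for `time: 10` is rejected.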
#### Example config file
```{.yml}
worker-id: 1
broker-uri: tcp://localhost:9657
broker-ping-interval: 10  # milliseconds
max-broker-liveness: 10
headers:
    env:
        - c
        - cpp
    threads: 2
hwgroup: "group1"
working-directory: /tmp/recodex
file-managers:
    - hostname: "http://localhost:9999"  # port is optional
      username: ""  # can be ignored in specific modules
      password: ""  # can be ignored in specific modules
file-cache:  # only in case that there is cache module
    cache-dir: "/tmp/recodex/cache"
logger:
    file: "/var/log/recodex/worker"  # w/o suffix - actual names will
                                     # be worker.log, worker.1.log,...
    level: "debug"  # level of logging
    max-size: 1048576  # 1 MB; max size of file before log rotation
    rotations: 3  # number of rotations kept
limits:
    time: 5  # in secs
    wall-time: 6  # seconds
    extra-time: 2  # seconds
    stack-size: 0  # normal in KB, but 0 means no special limit
    memory: 50000  # in KB
    parallel: 1
    disk-size: 50
    disk-files: 5
    environ-variable:
        ISOLATE_BOX: "/box"
        ISOLATE_TMP: "/tmp"
    bound-directories:
        - src: /tmp/recodex/eval_5
          dst: /evaluate
          mode: RW,NOEXEC
```
### Running the worker
A systemd unit file is distributed with the worker to simplify its launch. It
integrates worker nicely into your Linux system and allows you to run it
automatically on system startup. It is possible to have more than one worker on
every server, so the provided unit file is templated. Each instance of the
worker unit has a unique string identifier, which is used for managing that
instance through systemd. By default, only one worker instance is ready to use
after installation and its ID is "1".
- Starting worker with id "1" can be done this way:
```
# systemctl start recodex-worker@1.service
```
Check with
```
# systemctl status recodex-worker@1.service
```
if the worker is running. You should see "active (running)" message.
- Worker can be stopped or restarted accordingly using the `systemctl stop` and `systemctl restart` commands.
- If you want to run worker after system startup, run:
```
# systemctl enable recodex-worker@1.service
```
For further information about using systemd please refer to systemd documentation.
### Adding new worker
To add a new worker you need to do a few steps:
- Make up a unique string ID.
- Copy default configuration file `/etc/recodex/worker/config-1.yml` to the same directory and name it `config-<your_unique_ID>.yml`
- Edit that config file to fit your needs. Note that you must at least change _worker-id_ and _logger file_ values to be unique.
- Run new instance using
```
# systemctl start recodex-worker@<your_unique_ID>.service
```
## Sandboxes
### Isolate
Isolate is used as the one and only sandbox for Linux-based operating systems. The project's home can be found on [GitHub](https://github.com/ioi/isolate), and more about its installation and setup can be found in the [installation](#installation) section. Isolate uses Linux kernel features for sandboxing, namely _kernel namespaces_ and _cgroups_, so its security depends on them. Similar functionality can now be partially achieved with systemd.
From the very beginning of the ReCodEx project it was clear that the Isolate sandbox would be used for the Linux environment. There is no suitable general purpose sandbox on the Windows platform, so the main operating system of the whole backend should be Linux-based. The set of operations supported by Isolate seems reasonable for any sandbox, so most of its functionality is accessible from the job configuration. As there is no other sandbox, naming often reflects Isolate's names. However, the worker is prepared to run on Windows too, so integrating other sandboxes (as libraries or command line tools) is possible.
As a sandbox, Isolate provides a wide range of functionality which can be used to limit resources or even cut off particular resources from the sandboxed program. There are of course basics like limiting CPU time and memory consumption, but there is also wall-time (time as perceived by a human) and extra-time, an extra allowance added to the other time limits to increase the chance that the sandboxed program exits successfully. Other features include limiting the stack size and redirecting stdin, stdout or stderr from/to a file. Also worth mentioning are limiting the number of processes/threads which can be created and defining environment variables which are passed to the sandboxed program.
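To make the mapping from limits to sandbox options more concrete, here is a hedged Python sketch that builds an `isolate` command line from a limits dictionary shaped like the worker configuration. The function name and the mapping itself are assumptions made for illustration (the worker's real invocation may differ, and cgroup-related options are left out); the option names are taken from the Isolate man page and should be checked against the installed version.

```python
def isolate_run_args(box_id: int, limits: dict, program: list) -> list:
    """Build an argument list for `isolate --run` (illustrative mapping)."""
    args = ["isolate", f"--box-id={box_id}"]
    # Assumed mapping from configuration keys to Isolate options.
    simple = {
        "time": "--time",            # CPU time in seconds
        "wall-time": "--wall-time",  # wall-clock time in seconds
        "extra-time": "--extra-time",
        "memory": "--mem",           # in KB
        "parallel": "--processes",   # number of processes/threads
    }
    for key, flag in simple.items():
        if key in limits:
            args.append(f"{flag}={limits[key]}")
    if limits.get("stack-size"):  # 0 means no special limit, so skip it
        args.append(f"--stack={limits['stack-size']}")
    for name, value in limits.get("environ-variable", {}).items():
        args.append(f"--env={name}={value}")
    return args + ["--run", "--"] + list(program)
```

The resulting list could be passed to `subprocess.run`; building a list instead of a shell string avoids quoting problems with paths and values.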
Filesystem handling is a chapter of its own. Isolate uses the mount kernel namespace to create a "virtual" filesystem which is mounted for the sandboxed program. By default, only a few read-only files/directories are mapped into the sandbox (described in the Isolate man page). This can of course be changed by providing additional folders as Isolate parameters. By default folders are mapped read-only, but Isolate has a few access options which can be set for a mount point.
#### Limit isolate boxes to particular cpu or memory node
A new feature in version 1.3 is the possibility to limit an Isolate box to one or more CPUs or memory nodes. This functionality is provided by the _cpusets_ kernel mechanism and is now integrated in Isolate. Only `cpuset.cpus` and `cpuset.mems` can be set, which should be just fine for sandbox purposes. As this is kernel functionality, further description can be found in the manual page of _cpuset_ or in the Linux documentation in `linux/Documentation/cgroups/cpusets.txt`. As previously stated, these settings can be applied to particular Isolate boxes and have to be written in the Isolate configuration. The standard configuration path should be `/usr/local/etc/isolate`, but it may depend on your installation process. Configuration of _cpuset_ there is really simple and is described in the example below.
```
box0.cpus = 0 # assign processor with ID 0 to isolate box with ID 0
box0.mems = 0 # assign memory node with ID 0
# if not set, linux by itself will decide where should
# the sandboxed programs run at
box2.cpus = 1-3 # assign range of processors to isolate box 2
box2.mems = 4-7 # assign range of memory nodes
box3.cpus = 1,2,3 # assign list of processors to isolate box 3
```
- **cpuset.cpus:** The CPU limitation restricts the sandboxed program to the processor threads set in the configuration. On hyperthreaded processors this means that all virtual threads are assignable, not only the physical ones. The value can be a single number, a list of numbers separated by commas, or a range with a hyphen delimiter.
- **cpuset.mems:** This value is particularly handy on NUMA systems, which have several memory nodes. On standard desktop computers this value should always be zero, because only one independent memory node is present. As with the `cpus` limitation, the value can be a single number, a list of values separated by commas, or a range with a hyphen.
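The three value formats can be illustrated with a small Python parser. It is only a sketch for checking configuration values by hand, not code used by Isolate; the function name is hypothetical, and the parser also accepts a mix of numbers and ranges, which goes beyond what is described above.

```python
def parse_cpuset(value: str) -> list:
    """Expand a cpuset value such as "0", "1,2,3" or "1-3" into sorted ids."""
    ids = set()
    for part in value.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid range: {part}")
            ids.update(range(start, end + 1))  # ranges are inclusive
        else:
            ids.add(int(part))
    return sorted(ids)
```

For instance, `parse_cpuset("1-3")` and `parse_cpuset("1,2,3")` both describe processors 1, 2 and 3.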
### WrapSharp
WrapSharp is a sandbox for C# programs, itself written in C#. We wrote it as a proof-of-concept sandbox for use in the Windows environment. However, it is not properly tested or integrated into the worker yet, and a security audit should be done before using it in production. After that, with just a little effort integrating it into the worker, there can be a running sandbox for C# programs on Windows.
## Cleaner
### Description
Cleaner is an integral part of the worker which manages its cache folder, mainly by deleting outdated files. Every cleaner instance maintains one cache folder, which can be used by multiple workers. This means that on one server there can be numerous worker instances sharing the same cache folder, but there should be only one cleaner.
Cleaner is written in Python and is used as a simple script which does its job and exits, so it has to be run periodically (for example by cron). For the cleaner to work properly, a suitable interval has to be used. A 24 hour interval is recommended and should be sufficient.
#### Last access timestamp
There is a catch with the cleaner service: to work properly, the server filesystem has to have the last access timestamp enabled. Cleaner checks these timestamps and decides based on them whether a file will be deleted; the last write or creation timestamps are not enough to reflect the real usage and need of a particular file. The last access timestamp feature is a bit controversial (more on this subject can be found [here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime)) and it is not enabled by default on conventional filesystems. On Linux this can be solved by adding the `strictatime` option to the `fstab` file. On Windows the following command has to be executed (as administrator): `fsutil behavior set disablelastaccess 0`.
Another possibility is to update the last modified timestamp when accessing the file. This timestamp is used in most major filesystems, so there are fewer compatibility issues than with the last access timestamp. The modified timestamp then must be updated by the workers at each access, for example using the `touch` command or similar. The final decision on which of these ways is better will be made after practical experience with running a production system.
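The last access check can be sketched in a few lines of Python. This is not the cleaner's actual source, only a minimal illustration under assumptions: the function name is hypothetical, only regular files directly inside the cache directory are considered, and the age is compared against `st_atime`.

```python
import os
import time


def clean_cache(cache_dir: str, file_age: int, now: float = None) -> list:
    """Delete files in cache_dir not accessed for more than file_age seconds."""
    now = time.time() if now is None else now
    with os.scandir(cache_dir) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue  # skip directories and symlinks
        last_access = entry.stat(follow_symlinks=False).st_atime
        if now - last_access > file_age:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Switching to the second approach described above would only mean reading `st_mtime` instead of `st_atime`, provided the workers update the modification time on each access.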
### Installation
To install and use the cleaner, it is necessary to have Python 3 with the `pip` package manager installed.
- Dependencies of the cleaner have to be installed:
```
$ pip install -r requirements.txt
```
- RPM-based distributions can build and install a binary package. This can be done like this:
```
$ python setup.py bdist_rpm --post-install ./cleaner/install/postinst
# yum install ./dist/recodex-cleaner-<version>-1.noarch.rpm
```
- Other Linux distributions can install the cleaner directly:
```
$ python setup.py install --install-scripts /usr/bin
# ./cleaner/install/postinst
```
- For Windows installation, do the following:
- start `cmd` with administrator permissions
- run the installation with
```
> python setup.py install --install-scripts "C:\Program Files\ReCodEx\cleaner"
```
where the path specified with `--install-scripts` can be changed
- copy the configuration file alongside the installed executable using
```
> copy install\config.yml "C:\Program Files\ReCodEx\cleaner\config.yml"
```
### Configuration and usage
#### Configuration items
- **cache-dir** -- directory which cleaner manages
- **file-age** -- age in seconds after which files are considered outdated and will be deleted
#### Example configuration
```{.yml}
cache-dir: "/tmp"
file-age: "3600" # in seconds
```
#### Usage
As stated before, the cleaner should be run periodically. On Linux systems this can be done by the built-in `cron` service or, if `systemd` is present, by the `*.timer` file which the cleaner provides for scheduling from `systemd`. On Windows systems the internal scheduler should be used.
- Running the cleaner from the command line is fairly simple:
```
$ recodex-cleaner -c /etc/recodex/cleaner
```
- Enable cleaner service using systemd:
```
$ systemctl start recodex-cleaner.timer
```
- Add cleaner to linux cron service using following configuration line:
```
0 0 * * * /usr/bin/recodex-cleaner -c /etc/recodex/cleaner/config.yml
```
- Add the cleaner to the Windows scheduler service with the following command (entered on a single line):
```
> schtasks /create /sc daily /tn "ReCodEx Cleaner" /tr "\"C:\Program Files\ReCodEx\cleaner\recodex-cleaner.exe\" -c \"C:\Program Files\ReCodEx\cleaner\config.yml\""
```