# Worker

## Description

**Worker's** main role is to securely execute a given submission and possibly _evaluate_ the results against model solutions provided by the submitter. **Worker** is logically divided into two parts:

- **Listener** - listens and communicates with **Broker** through [ZeroMQ](http://zeromq.org/). It receives new jobs, communicates with the **Evaluator** part and sends back results or progress.
- **Evaluator** - gets jobs to evaluate from the **Listener** part, evaluates them (possibly in a sandbox) and notifies the other part that the evaluation has ended. This part also communicates with **Fileserver**, downloads needed files and uploads detailed results.

After getting an evaluation request, **Worker** has to:

- Download the archive containing the submitted source files and the configuration file
- Download any supplementary files based on the configuration file, such as test
  inputs or helper programs (this is done on demand, using a `fetch` command
  in the assignment configuration)
- Evaluate the submission according to the job configuration
- During evaluation, progress states can be sent back to **Broker**
- Upload the results of the evaluation to the **Fileserver**
- Notify **Broker** that the evaluation finished

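The order of the steps above can be sketched in a few lines of Python. This is only an illustration and not the worker's actual implementation: every callable passed in (`download_archive`, `fetch_file`, `evaluate`, `upload_results`, `notify_broker`) is a hypothetical stand-in, and the `fetch` key and the `"FINISHED"` state name are assumptions made for the example.

```python
from typing import Callable, Iterable, List


def process_job(
    job_id: str,
    download_archive: Callable[[str], dict],
    fetch_file: Callable[[str], None],
    evaluate: Callable[[dict], Iterable[str]],
    upload_results: Callable[[str, List[str]], None],
    notify_broker: Callable[[str, str], None],
) -> List[str]:
    """Run the listed steps in order; all callables are hypothetical stand-ins."""
    # 1. Archive with the submitted sources and the configuration file.
    config = download_archive(job_id)
    # 2. Supplementary files, fetched on demand ("fetch" key is an assumption).
    for name in config.get("fetch", []):
        fetch_file(name)
    # 3. Evaluation; each yielded progress state is forwarded to the Broker.
    progress: List[str] = []
    for state in evaluate(config):
        notify_broker(job_id, state)
        progress.append(state)
    # 4. Detailed results go to the Fileserver.
    upload_results(job_id, progress)
    # 5. Final notification ("FINISHED" is a made-up state name).
    notify_broker(job_id, "FINISHED")
    return progress
```

Passing the collaborators in as callables keeps the sketch independent of ZeroMQ and HTTP, so the ordering can be checked without a running **Broker** or **Fileserver**.
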
## Architecture

The picture below shows the overall internal architecture of the worker, including its classes with their private variables and public functions. A vector version of this picture is available [here](https://github.com/ReCodEx/GlobalWiki/raw/master/images/Worker_Internal_Architecture.pdf).

![Internal Worker architecture](https://github.com/ReCodEx/GlobalWiki/blob/master/images/Worker_Internal_Architecture.png)

## Installation

## Configuration and usage

The following text describes how to set up and run the **worker** program. It assumes that the required binaries are already installed; for instructions see the [[Installation|Worker#installation]] section. Using systemd is recommended for the best user experience, but it is not required. Almost all modern Linux distributions use systemd now.

Installation of the **worker** program performs the following steps on your computer:

- create config file `/etc/recodex/worker/config-1.yml`
- create _systemd_ unit file `/etc/systemd/system/recodex-worker@.service`
- put main binary to `/usr/bin/recodex-worker`
- put judge binaries to `/usr/bin/recodex-judge-normal`, `/usr/bin/recodex-judge-shuffle` and `/usr/bin/recodex-judge-filter`
- create system user and group `recodex` with `/sbin/nologin` shell (if not existing)
- create log directory `/var/log/recodex`
- set ownership of config (`/etc/recodex`) and log (`/var/log/recodex`) directories to `recodex` user and group

### Default worker configuration

The worker should have a default configuration that applies to the worker itself and may also be used in given jobs (implicitly if something is missing, or explicitly through special variables). This configuration should be hardcoded and can be overridden by an explicitly declared configuration file. The format of this configuration is YAML, like in the job config.

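One way to picture hardcoded defaults being overridden by a configuration file is a recursive merge of two mappings. The sketch below is an assumption about how such a merge could work, not a description of the worker's code; the values in `DEFAULTS` are invented for the example, and the configuration file is represented as an already parsed `dict`.

```python
import copy

# Hypothetical hardcoded defaults; the real default values are not stated here.
DEFAULTS = {
    "broker-ping-interval": 1000,
    "max-broker-liveness": 4,
    "logger": {"level": "info", "rotations": 3},
}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with overrides applied.

    Nested mappings are merged key by key; any other value is replaced.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

With this approach a file that only sets `logger.level` keeps the default `rotations`, while the `DEFAULTS` mapping itself is never modified.
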
#### Configuration items

Mandatory items are bold, optional ones italic.

- **worker-id** - unique identification of the worker on one server. This id is used by the _isolate_ sandbox on Linux systems, so make sure it meets isolate's requirements (by default a number from 1 to 999).
- **broker-uri** - URI of the broker (hostname or IP address, including port, ...)
- _broker-ping-interval_ - how often ping messages are sent to the broker, in milliseconds.
- _max-broker-liveness_ - how many pings in a row the broker can miss before the worker considers it dead.
- _headers_ - headers specifying the worker's capabilities
    - _env_ - map of environmental variables
    - _threads_ - information about available threads for this worker
- **hwgroup** - hardware group of this worker. The hardware group must specify the worker's hardware and software capabilities and it is the main item for broker routing decisions.
- _working-directory_ - directory where all needed files will be stored. Can be the same for multiple workers on one server.
- **file-managers** - addresses and credentials of all file managers used (e.g. all the different frontends using this worker)
    - **hostname** - URI of the file manager
    - _username_ - username for HTTP authentication (if needed)
    - _password_ - password for HTTP authentication (if needed)
- _file-cache_ - configuration of the caching feature
    - _cache-dir_ - path to the caching directory. Can be the same for multiple workers.
- _logger_ - settings of logging capabilities
    - _file_ - path to the logging file, with the name given without suffix. The value `/var/log/recodex/worker` will produce `worker.log`, `worker.1.log`, ...
    - _level_ - level of logging, one of `off`, `emerg`, `alert`, `critical`, `err`, `warn`, `notice`, `info` and `debug`
    - _max-size_ - maximal size of the log file before rotating
    - _rotations_ - number of rotations kept
- _limits_ - default sandbox limits for this worker. All items are described on the [assignments page](https://github.com/ReCodEx/GlobalWiki/wiki/Assignments-overview#configuration-items).

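The mandatory items and the allowed logging levels listed above lend themselves to a simple sanity check before a worker is started. The following is a minimal sketch under the assumption that the configuration has already been parsed into a `dict`; `check_config` is a hypothetical helper and is not part of the worker.

```python
# Mandatory top-level items and logging levels, taken from the list above.
MANDATORY = ("worker-id", "broker-uri", "hwgroup", "file-managers")
LOG_LEVELS = {"off", "emerg", "alert", "critical", "err", "warn", "notice", "info", "debug"}


def check_config(config: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing mandatory item: {key}" for key in MANDATORY if key not in config]
    # Every file manager entry needs at least a hostname.
    for index, manager in enumerate(config.get("file-managers", [])):
        if "hostname" not in manager:
            problems.append(f"file-managers[{index}] has no hostname")
    # The logger level is optional, but if present it must be a known one.
    level = config.get("logger", {}).get("level")
    if level is not None and level not in LOG_LEVELS:
        problems.append(f"unknown logger level: {level}")
    return problems
```

Returning a list of messages instead of raising on the first error lets all problems in a config file be reported at once.
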
#### Example config file

```{.yml}
worker-id: 1
broker-uri: tcp://localhost:9657
broker-ping-interval: 10  # milliseconds
max-broker-liveness: 10
headers:
    env:
        - c
        - python
    threads: 2
hwgroup: "group1"
working-directory: /tmp/recodex
file-managers:
    - hostname: "http://localhost:9999"  # port is optional
      username: ""  # can be ignored in specific modules
      password: ""  # can be ignored in specific modules
file-cache:  # only in case that there is cache module
    cache-dir: "/tmp/recodex/cache"
logger:
    file: "/var/log/recodex/worker"  # w/o suffix - actual names will be worker.log, worker.1.log, ...
    level: "debug"  # level of logging
    max-size: 1048576  # 1 MB; max size of file before log rotation
    rotations: 3  # number of rotations kept
limits:
    time: 5  # in secs
    wall-time: 6  # seconds
    extra-time: 2  # seconds
    stack-size: 0  # normal in KB, but 0 means no special limit
    memory: 50000  # in KB
    parallel: 1  # time and memory limits are merged
    disk-size: 50
    disk-files: 5
    environ-variable:
        ISOLATE_BOX: "/box"
        ISOLATE_TMP: "/tmp"
    bound-directories:
        - src: /tmp/recodex/eval_5
          dst: /evaluate
          mode: RW,NOEXEC
```

### Running worker

A systemd unit file is included for easy and comfortable management of the worker. It integrates the worker nicely into your Linux system and allows you, for example, to run the worker automatically after system startup. More than one worker is expected to run on each server, so the provided unit file is templated. Each worker instance has a unique string identifier, which is used for managing that instance through systemd. By default, only one worker instance is ready to use after installation. Its ID is "1".

- Starting the worker with ID "1" can be done this way:
```
# systemctl start recodex-worker@1.service
```
Check whether the worker is running with
```
# systemctl status recodex-worker@1.service
```
You should see an "active (running)" message.

- The worker can be stopped or restarted accordingly using the `systemctl stop` and `systemctl restart` commands.
- If you want to run the worker after system startup, run:
```
# systemctl enable recodex-worker@1.service
```
For further information about using systemd please refer to the systemd documentation.

### Adding new worker

To add a new worker you need to do a few steps:

- Choose a unique string ID for the new worker.
- Copy the default configuration file `/etc/recodex/worker/config-1.yml` to the same directory and name it `config-<your_unique_ID>.yml`
- Edit that config file to fit your needs. Note that you must at least change the _worker-id_ and _logger file_ values to be unique.
- Run the new instance using
```
# systemctl start recodex-worker@<your_unique_ID>.service
```

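The names involved in the steps above all follow from the chosen ID, which the following sketch makes explicit. It only computes paths and names and does not touch the system. The helper `new_worker_plan` and the `worker-<ID>` log file naming are assumptions made for this example; any unique log file name would do.

```python
from pathlib import PurePosixPath

# Directory of the worker config files, as created by the installation.
CONFIG_DIR = PurePosixPath("/etc/recodex/worker")


def new_worker_plan(worker_id: str, log_dir: str = "/var/log/recodex") -> dict:
    """Collect the names a new worker instance needs for a given unique ID."""
    return {
        # Default config to copy from and the new file to create.
        "source_config": str(CONFIG_DIR / "config-1.yml"),
        "target_config": str(CONFIG_DIR / f"config-{worker_id}.yml"),
        # Values that must be made unique in the copied config file.
        # The "worker-<ID>" log name is a hypothetical convention.
        "overrides": {
            "worker-id": worker_id,
            "logger": {"file": str(PurePosixPath(log_dir) / f"worker-{worker_id}")},
        },
        # Instance name of the templated systemd unit.
        "unit": f"recodex-worker@{worker_id}.service",
    }
```

For the ID `2` this yields the config file `/etc/recodex/worker/config-2.yml` and the unit `recodex-worker@2.service`, matching the naming pattern used in the steps above.
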
## Cleaner

TODO