# Worker

## Description

The worker's job is to securely execute submitted assignments and possibly 
evaluate results against model solutions provided by the exercise author. After 
receiving an evaluation request, the worker has to:

- download the archive containing the submitted source files and the configuration file
- download any supplementary files based on the configuration file, such as test 
  inputs or helper programs (this is done on demand, using a `fetch` command
  in the assignment configuration)
- evaluate the submission according to the job configuration
- optionally send progress messages back to the broker during the evaluation
- upload the results of the evaluation to the fileserver
- notify the broker that the evaluation has finished

### Header matching

Every worker belongs to exactly one **hardware group** and has a set of **headers**. 
These properties help the broker decide which worker is suitable for processing 
a request. 

The hardware group is a string identifier used to group worker machines with 
a similar hardware configuration, for example "i7-4560-quad-ssd". It is 
important for assignments where running times are compared to those of reference 
solutions (we have to make sure that both programs run on similar hardware).

The headers are a set of key-value pairs that describe the worker's 
capabilities -- which runtime environments are installed, how many threads 
the worker can run, or whether it measures time precisely.

This information is sent to the broker on startup using the `init` command.
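
The matching described above can be illustrated with a minimal sketch. The `Worker` class, the `is_suitable` function and the exact-match rule for header values are assumptions made for illustration only; the broker's real matching rules may differ (for example, numeric headers could be compared differently).

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical broker-side record of a registered worker."""
    hwgroup: str
    headers: dict = field(default_factory=dict)


def is_suitable(worker, hwgroup, required_headers):
    """Return True if the worker belongs to the requested hardware group
    and offers every requested header value (assumed exact matching)."""
    if worker.hwgroup != hwgroup:
        return False
    for key, wanted in required_headers.items():
        offered = worker.headers.get(key)
        # A header may hold several values, e.g. a list of environments.
        if not isinstance(offered, (list, set, tuple)):
            offered = [offered]
        if wanted not in offered:
            return False
    return True


worker = Worker(hwgroup="group1", headers={"env": ["c", "cpp"], "threads": 2})
print(is_suitable(worker, "group1", {"env": "cpp"}))   # True
print(is_suitable(worker, "group1", {"env": "java"}))  # False
```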


## Architecture

### Internal communication

The worker is logically divided into three parts:

- **Listener** -- communicates with the broker through 
  [ZeroMQ](http://zeromq.org/). On startup, it introduces itself to the broker. 
  Then it receives new jobs, passes them to the **evaluator** part and sends 
  back results and progress reports.
- **Evaluator** -- gets jobs from the **listener** part, evaluates them (possibly 
  in a sandbox) and notifies the listener when the evaluation ends. The **evaluator** 
  also communicates with the fileserver, downloads supplementary files and 
  uploads detailed results.
- **Progress callback** -- receives information about the progress of an 
  evaluation from the evaluator and forwards it to the broker.

These parts run in separate threads of the same process and communicate through
ZeroMQ in-process sockets. An alternative approach would be a shared memory 
region with exclusive access, but messaging is generally considered safer: shared 
memory has to be used very carefully because of race conditions when 
reading and writing concurrently. Also, messages inside the worker are small, so 
copying data between threads adds little overhead. This multi-threaded 
design allows the worker to keep sending `ping` messages even while it is 
processing a job.
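
The message-passing idea can be sketched in a few lines. Note that this sketch uses Python's standard `queue.Queue` as a stand-in for ZeroMQ in-process sockets, and the function names and message format are hypothetical; it only shows two threads exchanging small messages instead of sharing memory.

```python
import queue
import threading


def evaluator(jobs, results):
    """Evaluator thread: take jobs from the listener and report back."""
    while True:
        job_id = jobs.get()
        if job_id is None:  # sentinel value: shut the thread down
            break
        # ... the real evaluation would happen here ...
        results.put(("done", job_id))


def run_listener(job_ids):
    """Listener side: hand jobs over and collect the evaluator's messages."""
    jobs, results = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=evaluator, args=(jobs, results))
    thread.start()
    for job_id in job_ids:
        jobs.put(job_id)
    jobs.put(None)
    thread.join()
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected
```

Because the listener thread never blocks on the evaluation itself, it would remain free to do other work, such as sending `ping` messages, while a job is being processed.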

### File management

The messages sent by the broker to assign jobs to workers are rather simple - 
they don't contain any files, only a URL of an archive with a job configuration. 
When processing the job, it may also be necessary to fetch supplementary files 
such as helper scripts or test inputs and outputs.

Supplementary files are addressed using hashes of their content, which allows 
simple caching. Requested files are downloaded into the cache on demand. 
This mechanism is hidden from the job evaluator, which depends on a 
`file_manager_interface` instance. Because the filesystem cache can be shared 
among several workers, cleaning is implemented by the separate Cleaner 
program, which should be set up to run periodically.
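
A minimal sketch of such a content-addressed cache follows. The class name `CachingFileManager`, the `download` callable and the use of SHA-1 are assumptions for illustration; the text above does not say which hash function is used or how the real `file_manager_interface` looks.

```python
import hashlib
import os
import tempfile


class CachingFileManager:
    """Hypothetical file manager: looks into a local cache directory first
    and calls `download(content_hash)` only on a cache miss."""

    def __init__(self, cache_dir, download):
        self.cache_dir = cache_dir
        self.download = download  # callable: hash string -> bytes

    def get_file(self, content_hash):
        path = os.path.join(self.cache_dir, content_hash)
        if not os.path.exists(path):
            data = self.download(content_hash)
            # The file name is the hash of the content, so it can be verified.
            if hashlib.sha1(data).hexdigest() != content_hash:
                raise ValueError("downloaded content does not match its hash")
            # Write to a temporary file and rename it, so that other workers
            # sharing the cache never see a partially written file.
            fd, tmp_path = tempfile.mkstemp(dir=self.cache_dir)
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(data)
            os.replace(tmp_path, path)
        return path
```

Since a file's name is derived from its content, a cached file never becomes stale; it can only become unused, which is why periodic cleaning is sufficient.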

### Running student submissions

Student submissions are executed inside a sandboxing environment to prevent damage to the host system and to restrict the amount of resources used. Currently, only the Isolate sandbox is supported by the worker, but the list of supported sandboxes can be extended easily.

Isolate is executed in a separate Linux process created by the `fork` and `exec` system calls. Communication between the processes is performed through an unnamed pipe with redirection of the standard input and output descriptors. To guard against a failure of Isolate itself, there is an additional safety measure -- the whole sandbox is killed if it does not end within `(time + 300) * 1.2` seconds, where `time` is the original maximum time allowed for the task. However, Isolate should always terminate in time by itself, so this additional safety measure should never be needed.
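
The safety timeout can be expressed as a short sketch. The function names are hypothetical, and `subprocess.run` is used here only as a convenient illustration of the same guard; the worker itself uses `fork` and `exec` directly.

```python
import subprocess


def sandbox_kill_timeout(time_limit):
    """Seconds after which the whole sandbox process is killed,
    following the formula (time + 300) * 1.2."""
    return (time_limit + 300) * 1.2


def run_guarded(command, time_limit):
    """Run `command` and kill it if it outlives the safety timeout."""
    try:
        return subprocess.run(command, capture_output=True,
                              timeout=sandbox_kill_timeout(time_limit))
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child at this point.
        return None
```

For a task with a 5 second limit, the guard would fire after roughly 366 seconds, far later than any limit Isolate enforces on its own.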

In general, a sandbox has to be a command line application that takes its parameters as arguments, from standard input or from a file. Outputs should be written to a file or to standard output. There are no other requirements; the worker design is very versatile and can be adapted to different needs.


## Configuration and usage

The following text describes how to set up and run the **worker** program. It assumes that the required binaries are already installed. Using systemd is recommended for the best user experience, but it is not required. Almost all modern Linux distributions use systemd nowadays.

### Default worker configuration

The worker should have a default configuration which is applied to the worker itself or may be used in given jobs (implicitly if something is missing, or explicitly with special variables). This configuration should be hardcoded and can be overridden by an explicitly declared configuration file. The format of this configuration is YAML with a structure similar to the job configuration.

#### Configuration items

Mandatory items are in bold, optional ones in italics.

- **worker-id** -- unique identification of the worker on one server. This id is used by the _isolate_ sandbox on Linux systems, so make sure it meets isolate's requirements (by default a number from 1 to 999).
- _worker-description_ -- human-readable description of this worker
- **broker-uri** -- URI of the broker (hostname or IP address, including port, ...)
- _broker-ping-interval_ -- how often to send ping messages to the broker, in milliseconds
- _max-broker-liveness_ -- how many pings in a row the broker can miss without the worker being marked as dead
- _headers_ -- map of headers which specify the worker's capabilities
    - _env_ -- list of runtime environments (for example `c` or `cpp`) which are sent to the broker in the `init` command
    - _threads_ -- information about available threads for this worker
- **hwgroup** -- hardware group of this worker. The hardware group must describe the worker's hardware and software capabilities and it is the main item for broker routing decisions.
- _working-directory_ -- directory where all needed files will be stored. Can be the same for multiple workers on one server.
- **file-managers** -- addresses and credentials of all file managers used (e.g. all different frontends using this worker)
    - **hostname** -- URI of the file manager
    - _username_ -- username for HTTP authentication (if needed)
    - _password_ -- password for HTTP authentication (if needed)
- _file-cache_ -- configuration of the caching feature
    - _cache-dir_ -- path to the caching directory. Can be the same for multiple workers.
- _logger_ -- logging settings
    - _file_ -- path to the log file, with the file name given without suffix. `/var/log/recodex/worker` will produce `worker.log`, `worker.1.log`, ...
    - _level_ -- level of logging, one of `off`, `emerg`, `alert`, `critical`, `err`, `warn`, `notice`, `info` and `debug`
    - _max-size_ -- maximum size of the log file before rotation
    - _rotations_ -- number of rotations kept
- _limits_ -- default sandbox limits for this worker. All items are described in the assignments section of the job configuration description. If some limits are not set in the job configuration, the defaults from the worker configuration are used. The worker's limits also act as maximums: limits in the job configuration cannot exceed them.
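
The rule for _limits_ can be illustrated with a minimal sketch. The function name `effective_limits` is hypothetical, only simple numeric limits are handled, and raising an error for a limit above the worker's maximum is an assumption; the text above does not say how the worker reacts in that case.

```python
def effective_limits(job_limits, worker_limits):
    """Combine limits from a job configuration with the worker's defaults."""
    merged = dict(worker_limits)  # start from the worker's defaults
    for name, value in job_limits.items():
        maximum = worker_limits.get(name)
        if maximum is not None and value > maximum:
            # Assumed behaviour: reject jobs that ask for more than allowed.
            raise ValueError(
                f"limit {name!r}={value} exceeds worker maximum {maximum}")
        merged[name] = value
    return merged


worker_limits = {"time": 5, "memory": 50000}
print(effective_limits({"time": 2}, worker_limits))
# {'time': 2, 'memory': 50000}
```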

#### Example config file

```{.yml}
worker-id: 1
broker-uri: tcp://localhost:9657
broker-ping-interval: 10  # milliseconds
max-broker-liveness: 10
headers:
    env:
        - c
        - cpp
    threads: 2
hwgroup: "group1"
working-directory: /tmp/recodex
file-managers:
    - hostname: "http://localhost:9999"  # port is optional
      username: ""  # can be ignored in specific modules
      password: ""  # can be ignored in specific modules
file-cache:  # only in case that there is cache module
    cache-dir: "/tmp/recodex/cache"
logger:
    file: "/var/log/recodex/worker"  # w/o suffix - actual names will
                                     # be worker.log, worker.1.log,...
    level: "debug"  # level of logging
    max-size: 1048576  # 1 MB; max size of file before log rotation
    rotations: 3  # number of rotations kept
limits:
    time: 5  # in secs
    wall-time: 6  # seconds
    extra-time: 2  # seconds
    stack-size: 0  # normal in KB, but 0 means no special limit
    memory: 50000  # in KB
    parallel: 1
    disk-size: 50
    disk-files: 5
    environ-variable:
        ISOLATE_BOX: "/box"
        ISOLATE_TMP: "/tmp"
    bound-directories:
        - src: /tmp/recodex/eval_5
          dst: /evaluate
          mode: RW,NOEXEC
```
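
A configuration like the one above could be checked for its mandatory items with a sketch such as the following. It assumes the YAML file has already been parsed into a Python dictionary (for example by a YAML library); the function name and the format of the reported names are hypothetical.

```python
MANDATORY_KEYS = ("worker-id", "broker-uri", "hwgroup", "file-managers")


def missing_mandatory_items(config):
    """Return the names of mandatory items absent from a parsed config."""
    missing = [key for key in MANDATORY_KEYS if key not in config]
    # Every file manager entry needs at least a hostname.
    for index, manager in enumerate(config.get("file-managers", [])):
        if "hostname" not in manager:
            missing.append(f"file-managers[{index}].hostname")
    return missing


config = {
    "worker-id": 1,
    "broker-uri": "tcp://localhost:9657",
    "hwgroup": "group1",
    "file-managers": [{"hostname": "http://localhost:9999"}],
}
print(missing_mandatory_items(config))  # []
```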


## Sandboxes

### Isolate

Isolate is the one and only sandbox used for Linux-based operating systems. The project is hosted on [GitHub](https://github.com/ioi/isolate), and more about its installation and setup can be found in the [installation](#installation) section. Isolate uses Linux kernel features for sandboxing, namely _kernel namespaces_ and _cgroups_, and thus its security depends on them. Similar functionality can now be partially achieved with systemd.

From the very beginning of the ReCodEx project it was clear that the Isolate sandbox would be used for the Linux environment. There is no suitable general-purpose sandbox for the Windows platform, so the main operating system of the whole backend should be Linux-based. The set of operations supported by Isolate seems reasonable for every sandbox, so most of its functionality is accessible from the job configuration. As there is no other sandbox, the naming often reflects Isolate's names. However, the worker is prepared to run on Windows too, so integrating other sandboxes (as libraries or command line tools) is possible.

As a sandbox, Isolate provides a wide range of functionality which can be used to limit resources or even cut off particular resources from the sandboxed program. There are of course basics like limiting CPU time and memory consumption, but there is also wall-time (time as perceived by a human) and extra-time, an extra allowance added to the other time limits to increase the chance that the sandboxed program exits successfully. Other features include limiting the stack size and redirecting stdin, stdout or stderr from/to a file. Also worth mentioning are limiting the number of processes/threads which can be created and defining environment variables which are passed to the sandboxed program.

Filesystem handling is a chapter of its own. Isolate uses the mount kernel namespace to create a "virtual" filesystem which is mounted in the sandboxed program. By default, only a few read-only files/directories are mapped into the sandbox (they are described in the Isolate man page). This can of course be changed by passing additional directories as Isolate parameters. Directories are mapped read-only by default, but Isolate offers a few access options which can be set for a mount point.

#### Limit isolate boxes to particular cpu or memory node

A new feature in version 1.3 is the possibility to limit an Isolate box to one or more CPUs or memory nodes. This functionality is provided by the _cpusets_ kernel mechanism and is now integrated in Isolate. Only `cpuset.cpus` and `cpuset.mems` can be set, which should be just fine for sandbox purposes. As this is kernel functionality, further description can be found in the manual page of _cpuset_ or in the Linux documentation in `linux/Documentation/cgroups/cpusets.txt`. These settings can be applied to particular Isolate boxes and have to be written in the Isolate configuration. The standard configuration path should be `/usr/local/etc/isolate`, but it may depend on your installation process. Configuring _cpuset_ there is really simple and is shown in the example below.

```
box0.cpus = 0  # assign processor with ID 0 to isolate box with ID 0
box0.mems = 0  # assign memory node with ID 0
# if not set, linux by itself will decide where should
# the sandboxed programs run at
box2.cpus = 1-3  # assign range of processors to isolate box 2
box2.mems = 4-7  # assign range of memory nodes 
box3.cpus = 1,2,3  # assign list of processors to isolate box 3
```

- **cpuset.cpus:** The CPU limitation restricts the sandboxed program to the processor threads set in the configuration. On hyperthreaded processors this means that all virtual threads are assignable, not only the physical ones. The value can be a single number, a list of numbers separated by commas, or a range with a hyphen delimiter.
- **cpuset.mems:** This value is particularly handy on NUMA systems, which have several memory nodes. On standard desktop computers this value should always be zero because only one independent memory node is present. As with the `cpus` limitation, the value can be a single number, a list of numbers separated by commas, or a range with a hyphen.
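
The three value formats can be made concrete with a small parser. This is only an illustrative sketch with a hypothetical function name; Isolate and the kernel do their own parsing.

```python
def parse_cpuset(value):
    """Expand a cpuset value such as "0", "1-3" or "1,2,3"
    into a sorted list of CPU or memory node IDs."""
    ids = set()
    for part in value.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid range: {part!r}")
            ids.update(range(first, last + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


print(parse_cpuset("0"))      # [0]
print(parse_cpuset("1-3"))    # [1, 2, 3]
print(parse_cpuset("1,2,3"))  # [1, 2, 3]
```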

### WrapSharp

WrapSharp is a sandbox for C# programs which is itself written in C#. We have written it as a proof-of-concept sandbox for the Windows environment. However, it is not properly tested and not integrated into the worker yet. A security audit should be done before using it in production. After that, with just a little effort spent on integrating it into the worker, there can be a working sandbox for C# programs on Windows.


## Cleaner

### Description

Cleaner is an integral part of the worker which manages its cache folder, mainly by deleting outdated files. Every cleaner instance maintains one cache folder, which can be used by multiple workers. This means that on one server there can be numerous worker instances sharing the same cache folder, but there should be only one cleaner.

Cleaner is written in Python and is used as a simple script which just does its job and exits, so it has to be run periodically by cron. For the cleaner to work properly, a suitable cron interval has to be chosen. A 24-hour interval is recommended and should be sufficient.

#### Last access timestamp

There is a catch with the cleaner service: to work properly, the server filesystem has to have the last access timestamp enabled. Cleaner checks these timestamps and decides based on them whether a file will be deleted; the last write or creation timestamps are not enough to reflect the real usage of and need for a particular file. The last access timestamp feature is a bit controversial (more on this subject can be found [here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime)) and it is not enabled by default on conventional filesystems. On Linux this can be solved by adding the `strictatime` option to the `fstab` file. On Windows the following command has to be executed (as administrator): `fsutil behavior set disablelastaccess 0`.

Another possibility is to update the last modified timestamp when accessing the file. This timestamp is supported by most major filesystems, so there are fewer compatibility issues than with the last access timestamp. The modified timestamp must then be updated by the workers at each access, for example using the `touch` command or similar. The final decision on which of these ways is better will be made after practical experience with running a production system.

### Configuration and usage

#### Configuration items
- **cache-dir** -- directory which the cleaner manages
- **file-age** -- age in seconds after which files are considered outdated and will be deleted

#### Example configuration
```{.yml}
cache-dir: "/tmp"
file-age: "3600"  # in seconds
```
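
The core of such a cleaner can be sketched as follows. This is not the actual cleaner script, only a minimal illustration under the assumption that files are judged by their last access timestamp; the function name `clean_cache` and its `now` parameter are hypothetical, while `cache_dir` and `file_age` correspond to the configuration items above.

```python
import os
import time


def clean_cache(cache_dir, file_age, now=None):
    """Delete regular files in `cache_dir` which have not been accessed
    for more than `file_age` seconds. Returns the deleted file names."""
    now = time.time() if now is None else now
    removed = []
    # Materialize the listing first so that deleting does not disturb it.
    for entry in list(os.scandir(cache_dir)):
        if not entry.is_file(follow_symlinks=False):
            continue
        last_access = entry.stat(follow_symlinks=False).st_atime
        if now - last_access > file_age:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

With the example configuration, a cron job would call something like `clean_cache("/tmp", 3600)` on each run.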