From 753e1177cff12e54b6db9eb4ca98b4f9a1adcd00 Mon Sep 17 00:00:00 2001 From: Petr Stefan Date: Thu, 19 Jan 2017 18:28:58 +0100 Subject: [PATCH] System configuration --- Broker.md | 60 ------ Coding-style.md | 122 ----------- Monitor.md | 40 ---- System-configuration.md | 437 ++++++++++++++++++++++++++++++++++++++++ Web-API.md | 127 ------------ Web-application.md | 17 -- Worker.md | 127 ------------ 7 files changed, 437 insertions(+), 493 deletions(-) delete mode 100644 Coding-style.md create mode 100644 System-configuration.md diff --git a/Broker.md b/Broker.md index b71df5c..77b95b2 100644 --- a/Broker.md +++ b/Broker.md @@ -70,63 +70,3 @@ forwarded to the frontend. The same goes for external failures. Jobs that fail internally cannot be reassigned, because the "new" broker does not know their headers -- they are reported as failed immediately. - -## Configuration and usage -Following text describes how to set up and run broker program. It is supposed to have required binaries installed. Also, using systemd is recommended for best user experience, but it is not required. Almost all modern Linux distributions are using systemd now. - -### Default broker configuration - -#### Configuration items - -Description of configurable items in broker's config. Mandatory items are bold, optional italic. - -- _clients_ -- specifies address and port to bind for clients (frontend instance) - - _address_ -- hostname or IP address as string (`*` for any) - - _port_ -- desired port -- _workers_ -- specifies address and port to bind for workers - - _address_ -- hostname or IP address as string (`*` for any) - - _port_ -- desired port - - _max_liveness_ -- maximum amount of pings the worker can fail to send before it is considered disconnected - - _max_request_failures_ -- maximum number of times a job can fail (due to e.g. 
worker disconnect or a network error when downloading something from the fileserver) and be assigned again -- _monitor_ -- settings of monitor service connection - - _address_ -- IP address of running monitor service - - _port_ -- desired port -- _notifier_ -- details of connection which is used in case of errors and good to know states - - _address_ -- address where frontend API runs - - _port_ -- desired port - - _username_ -- username which can be used for HTTP authentication - - _password_ -- password which can be used for HTTP authentication -- _logger_ -- settings of logging capabilities - - _file_ -- path to the logging file with name without suffix. `/var/log/recodex/broker` item will produce `broker.log`, `broker.1.log`, ... - - _level_ -- level of logging, one of `off`, `emerg`, `alert`, `critical`, `err`, `warn`, `notice`, `info` and `debug` - - _max-size_ -- maximal size of log file before rotating - - _rotations_ -- number of rotation kept - -#### Example config file - -```{.yml} -# Address and port for clients (frontend) -clients: - address: "*" - port: 9658 # Address and port for workers -workers: - address: "*" - port: 9657 - max_liveness: 10 - max_request_failures: 3 -monitor: - address: "127.0.0.1" - port: 7894 -notifier: - address: "127.0.0.1" - port: 8080 - username: "" - password: "" -logger: - file: "/var/log/recodex/broker" # w/o suffix - actual names will be - # broker.log, broker.1.log, ... - level: "debug" # level of logging - max-size: 1048576 # 1 MB; max size of file before log rotation - rotations: 3 # number of rotations kept -``` - diff --git a/Coding-style.md b/Coding-style.md deleted file mode 100644 index 20d87eb..0000000 --- a/Coding-style.md +++ /dev/null @@ -1,122 +0,0 @@ -# Coding style - -Every project should have some consistent coding style in which all contributors write. Bellow you can find our conventions on which we agreed on and which we try to keep. 
- -## C++ - -**NOTE, that C++ projects have set code linter (`cmake-format`) with custom format. To reformat code run `make format` inside `build` directory of the project (probably not working on Windows).** For quick introduction into our format, see following paragraphs. - -In C++ is written worker and broker. Generally it is used underscore style with all small letters. Inspired by [Google C++ style guide](https://google.github.io/styleguide/cppguide.html). If something is not defined than naming/formatting can be arbitrary, but should be similar to bellow-defined behaviour. - -### Naming convention -* For source codes use all lower case with underscores not dashes. Header files should end with `.h` and C++ files with `.cpp`. -* Typenames are all in lower case with underscores between words. This is applicable to classes, structs, typedefs, enums and type template parameters. -* Variable names can be divided on local variables and class members. Local variables are all lower case with underscores between words. Class members have in addition trailing underscore on the end (struct data members do not have underscore on the end). -* Constants are just like any other variables and do not have any specifics. -* All function names are again all lower case with underscores between words. -* Namespaces if there are ones they should have lower case and underscores. -* Macros are classical and should have all capitals and underscores. -* Comments can be two types documentational and ordinery ones in code. Documentation should start with `/**` and end with `*/`, convention inside them is javadoc documentation format. Classical comments in code are one liners which starts with `//` and end with the end of the line. - -### Formatting convention -* Line length is not explicitly defined, but should be reasonable. -* All files should use UTF-8 character set. -* For code indentation tabs (`\t`) are used. 
-* Function declaration/definition: return type should be on the same line as the rest of the declaration, if line is too long, than particular parameters are placed on new line. Opening parenthesis of function should be placed on new line bellow declaration. Its possible to write small function which can be on only one line. Between parameter and comma should be one space. -``` -int run(int id, string msg); - -void print_hello_world() -{ - std::cout << "Hello world" << std::endl; - return; -} - -int get_five() { return 5; } -``` -* Lambda expressions: same formatting as classical functions -``` -auto hello = [](int x) { std::cout << "hello_" << x << std::endl; } -``` -* Function calls: basically same as function header definition. -* Condition: after if, or else there always have to be one space in front of opening bracket and again one space after closing condition bracket (and in front of opening parenthesis). If and else always should be on separate lines. Inside condition there should not be any pointless spaces. -``` -if (x == 5) { - std::cout << "Exactly five!" << std::endl; -} else if (x < 5 && y > 5) { - std::cout << "Whoa, that is weird format!" << std::endl; -} else { - std::cout << "I dont know what is this!" << std::endl; -} -``` -* For and while cycles: basically same rules as for if condition. -* Try-catch blocks: again same rules as for if conditions. Closing parentheses of try block should be on the same line as catch block. -``` -try { - int a = 5 / 0; -} catch (...) { - std::cout << "Division by zero" << std::endl; -} -``` -* Switch: again basics are the same as for if condition. Case statements should not be indented and case body should be intended with 1 tab. -``` -switch (switched) { -case 0: // no tab indent - ... // 1 tab indent - break; -case 1: - ... - break; -default: - exit(1); -} -``` -* Pointers and references: no spaces between period or arrow in accessing type member. No spaces after asterisk or ampersand. 
In declaration of pointer or reference format should be that asterisk or ampersand is adjacent to name of the variable not type. -``` -number = *ptr; -ptr = &val; -number = ptr->number; -number = val_ref.number; - -int *i; -int &j; - -// bad format bellow -int* i; -int * i; -``` -* Boolean expression: long boolean expression should be divided into more lines. The division point should always be after logical operators. -``` -if (i > 10 && - j < 10 && - k > 20) { - std::cout << "Were here!" << std::endl; -} -``` -* Return values should not be generally wrapped with parentheses, only if needed. -* Preprocessor directives start with `#` and always should start at the beginning of the line. -* Classes: sections aka. public, protected, private should have same indentation as the class start itself. Opening parenthesis of class should be on the same line as class name. -``` -class my_class { -public: - void class_function(); -private: - int class_member_; -}; -``` -* Operators: around all binary operators there always should be spaces. -``` -int x = 5; -x = x * 5 / 5; -x = x + 5 * (10 - 5); -``` - -## Python - -Python code should correspond to [PEP 8](https://www.python.org/dev/peps/pep-0008/) style. - -## PHP -TODO: - -## JavaScript -TODO: \ No newline at end of file diff --git a/Monitor.md b/Monitor.md index 0d26e56..58f3e37 100644 --- a/Monitor.md +++ b/Monitor.md @@ -28,43 +28,3 @@ There can be multiple receivers to one channel id. Each one has separate _asynci Messages from client's queue are sent through corresponding WebSocket connection via main event loop as soon as possible. This approach with separate queue per connection is easy to implement and guarantees reliability and order of message delivery. - -## Configuration and usage - -### Configuration -Configuration file is located in subdirectory `monitor` of standard ReCodEx configuration folder `/etc/recodex/`. It is in YAML format as all of the other configurations. 
Format is very similar to configurations of broker or workers. - -### Configuration items - -Description of configurable items, bold ones are required, italics ones are optional. - -- _websocket_uri_ -- URI where is the endpoint of websocket connection. Must be visible to the clients (directly or through public proxy) - - string representation of IP address or a hostname - - port number -- _zeromq_uri_ -- URI where is the endpoint of zeromq connection from broker. Could be hidden from public internet. - - string representation of IP address or a hostname - - port number -- _logger_ -- settings of logging - - _file_ -- path with name of log file. Defaults to `/var/log/recodex/monitor.log` - - _level_ -- logging level, one of "debug", "info", "warning", "error" and "critical" - - _max-size_ -- maximum size of log file before rotation in bytes - - _rotations_ -- number of rotations kept - -### Example configuration file - -```{.yml} ---- -websocket_uri: - - "127.0.0.1" - - 4567 -zeromq_uri: - - "127.0.0.1" - - 7894 -logger: - file: "/var/log/recodex/monitor.log" - level: "debug" - max-size: 1048576 # 1 MB - rotations: 3 -... -``` - diff --git a/System-configuration.md b/System-configuration.md new file mode 100644 index 0000000..5476013 --- /dev/null +++ b/System-configuration.md @@ -0,0 +1,437 @@ +# System configuration + +## Worker + +Worker should have some default configuration which is applied to worker itself +or may be used in given jobs (implicitly if something is missing, or explicitly +with special variables). This configuration should be hardcoded and can be +rewritten by explicitly declared configuration file. Format of this +configuration is yaml with similar structure to job configuration. + +### Configuration items + +Mandatory items are bold, optional italic. + +- **worker-id** -- unique identification of worker at one server. 
This id is + used by _isolate_ sandbox on Linux systems, so make sure to meet isolate's + requirements (default is number from 1 to 999). +- _worker-description_ -- human readable description of this worker +- **broker-uri** -- URI of the broker (hostname, IP address, including port, + ...) +- _broker-ping-interval_ -- time interval how often to send ping messages to + broker. Used units are milliseconds. +- _max-broker-liveness_ -- specifies how many pings in a row the broker can miss + without making the worker dead. +- _headers_ -- map of headers specifying the worker's capabilities + - _env_ -- list of environmental variables which are sent to broker in init + command + - _threads_ -- information about available threads for this worker +- **hwgroup** -- hardware group of this worker. Hardware group must specify + worker hardware and software capabilities and it is the main item for broker + routing decisions. +- _working-directory_ -- directory where all needed files will be stored. Can be the same + for multiple workers on one server. +- **file-managers** -- addresses and credentials to all file managers used (e.g. + all different frontends using this worker) + - **hostname** -- URI of file manager + - _username_ -- username for http authentication (if needed) + - _password_ -- password for http authentication (if needed) +- _file-cache_ -- configuration of caching feature + - _cache-dir_ -- path to caching directory. Can be the same for multiple + workers. +- _logger_ -- settings of logging capabilities + - _file_ -- path to the logging file with name without suffix. + `/var/log/recodex/worker` item will produce `worker.log`, `worker.1.log`, + ... + - _level_ -- level of logging, one of `off`, `emerg`, `alert`, `critical`, + `err`, `warn`, `notice`, `info` and `debug` + - _max-size_ -- maximal size of log file before rotating + - _rotations_ -- number of rotations kept +- _limits_ -- default sandbox limits for this worker.
All items are described in + assignments section in job configuration description. If some limits are not + set in job configuration, defaults from worker config will be used. In such + case the worker's defaults will be set as the maximum for the job. Also, + limits in job configuration cannot exceed limits from worker. + +### Example config file + +```{.yml} +worker-id: 1 +broker-uri: tcp://localhost:9657 +broker-ping-interval: 10 # milliseconds +max-broker-liveness: 10 +headers: + env: + - c + - cpp + threads: 2 +hwgroup: "group1" +working-directory: /tmp/recodex +file-managers: + - hostname: "http://localhost:9999" # port is optional + username: "" # can be ignored in specific modules + password: "" # can be ignored in specific modules +file-cache: # only in case that there is cache module + cache-dir: "/tmp/recodex/cache" +logger: + file: "/var/log/recodex/worker" # w/o suffix - actual names will + # be worker.log, worker.1.log,... + level: "debug" # level of logging + max-size: 1048576 # 1 MB; max size of file before log rotation + rotations: 3 # number of rotations kept +limits: + time: 5 # seconds + wall-time: 6 # seconds + extra-time: 2 # seconds + stack-size: 0 # normal in KB, but 0 means no special limit + memory: 50000 # in KB + parallel: 1 + disk-size: 50 + disk-files: 5 + environ-variable: + ISOLATE_BOX: "/box" + ISOLATE_TMP: "/tmp" + bound-directories: + - src: /tmp/recodex/eval_5 + dst: /evaluate + mode: RW,NOEXEC +``` + +### Isolate sandbox + +A new feature in version 1.3 is the possibility of limiting an Isolate box to one or more +CPU or memory nodes. This functionality is provided by the _cpusets_ kernel mechanism +and is now integrated in isolate. Only `cpuset.cpus` and +`cpuset.mems` can be set, which should be sufficient for sandbox purposes. As this is kernel +functionality, further description can be found in the manual page of _cpuset_ or in +the Linux documentation in `linux/Documentation/cgroups/cpusets.txt`.
As +previously stated, these settings can be applied to particular isolate boxes and +have to be written in the isolate configuration. The standard configuration path should +be `/usr/local/etc/isolate` but it may depend on your installation process. +Configuration of _cpuset_ there is simple and is described in the example +below. + +``` +box0.cpus = 0 # assign processor with ID 0 to isolate box with ID 0 +box0.mems = 0 # assign memory node with ID 0 +# if not set, linux by itself will decide where should +# the sandboxed programs run at +box2.cpus = 1-3 # assign range of processors to isolate box 2 +box2.mems = 4-7 # assign range of memory nodes +box3.cpus = 1,2,3 # assign list of processors to isolate box 3 +``` + +- **cpuset.cpus:** Cpus limitation will restrict the sandboxed program only to + processor threads set in configuration. On hyperthreaded processors this means + that all virtual threads are assignable, not only the physical ones. Value can + be represented by a single number, a list of numbers separated by commas or a range + with hyphen delimiter. +- **cpuset.mems:** This value is particularly handy on NUMA systems which have + several memory nodes. On standard desktop computers this value should always + be zero because only one independent memory node is present. As stated in + `cpus` limitation, there can be a single value, a list of values separated by commas + or a range stated with hyphen. + +## Broker + +### Configuration items + +Description of configurable items in broker's config. Mandatory items are bold, +optional italic.
+ +- _clients_ -- specifies address and port to bind for clients (frontend + instance) + - _address_ -- hostname or IP address as string (`*` for any) + - _port_ -- desired port +- _workers_ -- specifies address and port to bind for workers + - _address_ -- hostname or IP address as string (`*` for any) + - _port_ -- desired port + - _max_liveness_ -- maximum amount of pings the worker can fail to send + before it is considered disconnected + - _max_request_failures_ -- maximum number of times a job can fail (due to + e.g. worker disconnect or a network error when downloading something from + the fileserver) and be assigned again +- _monitor_ -- settings of monitor service connection + - _address_ -- IP address of running monitor service + - _port_ -- desired port +- _notifier_ -- details of connection which is used in case of errors and good + to know states + - _address_ -- address where frontend API runs + - _port_ -- desired port + - _username_ -- username which can be used for HTTP authentication + - _password_ -- password which can be used for HTTP authentication +- _logger_ -- settings of logging capabilities + - _file_ -- path to the logging file with name without suffix. + `/var/log/recodex/broker` item will produce `broker.log`, `broker.1.log`, + ... + - _level_ -- level of logging, one of `off`, `emerg`, `alert`, `critical`, + `err`, `warn`, `notice`, `info` and `debug` + - _max-size_ -- maximal size of log file before rotating + - _rotations_ -- number of rotation kept + +### Example config file + +```{.yml} +# Address and port for clients (frontend) +clients: + address: "*" + port: 9658 +# Address and port for workers +workers: + address: "*" + port: 9657 + max_liveness: 10 + max_request_failures: 3 +monitor: + address: "127.0.0.1" + port: 7894 +notifier: + address: "127.0.0.1" + port: 8080 + username: "" + password: "" +logger: + file: "/var/log/recodex/broker" # w/o suffix - actual names will be + # broker.log, broker.1.log, ... 
+ level: "debug" # level of logging + max-size: 1048576 # 1 MB; max size of file before log rotation + rotations: 3 # number of rotations kept +``` + +## Monitor + +Configuration file is located in subdirectory `monitor` of standard ReCodEx +configuration folder `/etc/recodex/`. It is in YAML format as all of the other +configurations. Format is very similar to configurations of broker or workers. + +### Configuration items + +Description of configurable items, bold ones are required, italics ones are +optional. + +- _websocket_uri_ -- URI where is the endpoint of websocket connection. Must be + visible to the clients (directly or through public proxy) + - string representation of IP address or a hostname + - port number +- _zeromq_uri_ -- URI where is the endpoint of zeromq connection from broker. + Could be hidden from public internet. + - string representation of IP address or a hostname + - port number +- _logger_ -- settings of logging + - _file_ -- path with name of log file. Defaults to + `/var/log/recodex/monitor.log` + - _level_ -- logging level, one of "debug", "info", "warning", "error" and + "critical" + - _max-size_ -- maximum size of log file before rotation in bytes + - _rotations_ -- number of rotations kept + +### Example configuration file + +```{.yml} +--- +websocket_uri: + - "127.0.0.1" + - 4567 +zeromq_uri: + - "127.0.0.1" + - 7894 +logger: + file: "/var/log/recodex/monitor.log" + level: "debug" + max-size: 1048576 # 1 MB + rotations: 3 +... +``` + +## Cleaner + +### Configuration items +- **cache-dir** -- directory which cleaner manages +- **file-age** -- file age in seconds which are considered outdated and will be deleted + +### Example configuration +```{.yml} +cache-dir: "/tmp" +file-age: "3600" # in seconds +``` + +## REST API + +The API can be configured in `config.neon` and `config.local.neon` files in +`app/config` directory. The first file is predefined by authors and should not +be modified. 
The second one is not present and can be created by copying the +`config.local.neon.example` template in the config directory. Local +configuration has higher precedence, so it will override default values from +`config.neon`. + +### Configurable items + +Description of configurable items. All timeouts are in milliseconds if not +stated otherwise. + +- accessManager -- configuration of access token in [JWT + standard](https://www.rfc-editor.org/rfc/rfc7519.txt). Do **not** modify + unless you really know what you are doing. +- fileServer -- connection to fileserver + - address -- URI of fileserver + - auth -- _username_ and _password_ for HTTP basic authentication + - timeouts -- _connection_ timeout for establishing new connection and + _request_ timeout for completing one request +- broker -- connection to broker + - address -- URI of broker + - auth -- _username_ and _password_ for broker callback authentication back + to API + - timeouts -- _ack_ timeout for first response that broker receives the + message, _send_ timeout for how long to try sending a new job to the broker and + _result_ timeout for how long to wait for confirmation if job can be processed + or not +- monitor -- connection to monitor + - address -- URI of monitor +- CAS -- CAS external authentication + - serviceId -- visible identifier of this service + - ldapConnection -- parameters for connecting to LDAP, _hostname_, + _base_dn_, _port_, _security_ and _bindName_ + - fields -- names of LDAP keys for information such as _email_, _firstName_ and + _lastName_ +- emails -- common configuration for sending email (addresses and template + variables) + - apiUrl -- base URL of API server including port (for referencing pictures + in messages) + - footerUrl -- link in the message footer + - siteName -- name of frontend (ReCodEx, or KSP for unique instance for KSP + course) + - githubUrl -- URL to GitHub repository of this project + - from -- sending email address +- failures -- admin messages on errors + - emails --
additional info for sending mails, _to_ is admin mail address, + _from_ is source address, _subjectPrefix_ is prefix of mail subject +- forgottenPassword -- user messages for changing passwords + - redirectUrl -- URL of web application where the password can be changed + - tokenExpiration -- expiration timeout of temporary token (in seconds) + - emails -- additional info for sending mails, _from_ is source address and + _subjectPrefix_ is prefix of mail subject +- mail -- configuration of sending mails + - smtp -- using SMTP server, has to be "true" + - host -- address of the server + - port -- sending port (common values are 25, 465, 587) + - username -- login to the server + - password -- password to the server + - secure -- security, values are empty for no security, "ssl" or "tls" + - context -- additional parameters, depending on used mail engine. For + example, self-signed certificates can be allowed by setting _verify_peer_ and + _verify_peer_name_ to false and _allow_self_signed_ to true under the _ssl_ + key (see example). + +The configuration for Doctrine lies outside the parameters section. +Doctrine is an ORM framework which maps PHP objects (entities) into database tables and +rows. The configuration is simple; the only required items are _user_, _password_ +and _host_ with _dbname_, i.e. address of database computer (mostly localhost) +with name of ReCodEx database.
+ +### Example local configuration file + +```{.yml} +parameters: + accessManager: + leeway: 60 + issuer: https://recodex.projekty.ms.mff.cuni.cz + audience: https://recodex.projekty.ms.mff.cuni.cz + expiration: 86400 # 24 hours in seconds + usedAlgorithm: HS256 + allowedAlgorithms: + - HS256 + verificationKey: "recodex-123" + fileServer: + address: http://127.0.0.1:9999 + auth: + username: "user" + password: "pass" + timeouts: + connection: 500 + broker: + address: tcp://127.0.0.1:9658 + auth: + username: "user" + password: "pass" + timeouts: + ack: 100 + send: 5000 + result: 1000 + monitor: + address: wss://recodex.projekty.ms.mff.cuni.cz:4443/ws + CAS: + serviceId: "cas-uk" + ldapConnection: + hostname: "ldap.cuni.cz" + base_dn: "ou=people,dc=cuni,dc=cz" + port: 389 + security: SSL + bindName: "cunipersonalid" + fields: + email: "mail" + firstName: "givenName" + lastName: "sn" + emails: + apiUrl: https://recodex.projekty.ms.mff.cuni.cz:4000 + footerUrl: https://recodex.projekty.ms.mff.cuni.cz + siteName: "ReCodEx" + githubUrl: https://github.com/ReCodEx + from: "ReCodEx " + failures: + emails: + to: "Admin Name " + from: %emails.from% + subjectPrefix: "ReCodEx Failure Report - " + forgottenPassword: + redirectUrl: "https://recodex.projekty.ms.mff.cuni.cz/ + forgotten-password/change" + tokenExpiration: 600 # 10 minutes + emails: + from: %emails.from% + subjectPrefix: "ReCodEx Forgotten Password Request - " + mail: + smtp: true + host: "smtp.ps.stdin.cz" + port: 587 + username: "user" + password: "pass" + secure: "tls" + context: + ssl: + verify_peer: false + verify_peer_name: false + allow_self_signed: true +doctrine: + user: "user" + password: "pass" + host: localhost + dbname: "recodex-api" +``` + +## Web application + +### Configurable items + +Description of configurable options. Bold are required values, optional ones are +in italics.
+ +- **NODE_ENV** -- mode of the server +- **API_BASE** -- base address of API server, including port and API version +- **PORT** -- port where the app is listening +- _WEBPACK_DEV_SERVER_PORT_ -- port for webpack dev server when running in + development mode. Default one is 8081, this option might be useful when this + port is necessary for some other service. + +### Example configuration file + +``` +NODE_ENV=production +API_BASE=https://recodex.projekty.ms.mff.cuni.cz:4000/v1 +PORT=8080 +``` + + + + diff --git a/Web-API.md b/Web-API.md index e02aad5..7202516 100644 --- a/Web-API.md +++ b/Web-API.md @@ -88,130 +88,3 @@ both our internal login service and CAS. An advantage of this approach is being able control the authentication process completely instead of just receiving session data through a global variable. - -## Configuration and usage - -The API can be configured in `config.neon` and `config.local.neon` files in `app/config` directory. The first file is predefined by authors and should not be modified. The second one is not present and could be created by copying `config.local.neon.example` template in the config directory. Local configuration have higher precedence, so it will override default values from `config.neon`. - -### Configurable items - -Description of configurable items. All timeouts are in milliseconds if not stated otherwise. - -- accessManager -- configuration of access token in [JWT standard](https://www.rfc-editor.org/rfc/rfc7519.txt). Do **not** modify unless you really know what are you doing. 
-- fileServer -- connection to fileserver - - address -- URI of fileserver - - auth -- _username_ and _password_ for HTTP basic authentication - - timeouts -- _connection_ timeout for establishing new connection and _request_ timeout for completing one request -- broker -- connection to broker - - address -- URI of broker - - auth -- _username_ and _password_ for broker callback authentication back to API - - timeouts -- _ack_ timeout for first response that broker receives the message, _send_ timeout how long try to send new job to the broker and _result_ timeout how long to wait for confirmation if job can be processed or not -- monitor -- connection to monitor - - address -- URI of monitor -- CAS -- CAS external authentication - - serviceId -- visible identifier of this service - - ldapConnection -- parameters for connecting to LDAP, _hostname_, _base_dn_, _port_, _security_ and _bindName_ - - fields -- names of LDAP keys for informations as _email_, _firstName_ and _lastName_ -- emails -- common configuration for sending email (addresses and template variables) - - apiUrl -- base URL of API server including port (for referencing pictures in messages) - - footerUrl -- link in the message footer - - siteName -- name of frontend (ReCodEx, or KSP for unique instance for KSP course) - - githubUrl -- URL to GitHub repository of this project - - from -- sending email address -- failures -- admin messages on errors - - emails -- additional info for sending mails, _to_ is admin mail address, _from_ is source address, _subjectPrefix_ is prefix of mail subject -- forgottenPassword -- user messages for changing passwords - - redirectUrl -- URL of web application where the password can be changed - - tokenExpiration -- expiration timeout of temporary token (in seconds) - - emails -- additional info for sending mails, _from_ is source address and _subjectPrefix_ is prefix of mail subject -- mail -- configuration of sending mails - - smtp -- using SMTP server, have to be 
"true" - - host -- address of the server - - port -- sending port (common values are 25, 465, 587) - - username -- login to the server - - password -- password to the server - - secure -- security, values are empty for no security, "ssl" or "tls" - - context -- additional parameters, depending on used mail engine. For examle self-signed certificates can be allowed as _verify_peer_ and _verify_peer_name_ to false and _allow_self_signed_ to true under _ssl_ key (see example). - -Outside the parameters section of configuration is configuration for Doctrine. It is ORM framework which maps PHP objects (entities) into database tables and rows. The configuration is simple, required items are only _user_, _password_ and _host_ with _dbname_, i.e. address of database computer (mostly localhost) with name of ReCodEx database. - -### Example local configuration file - -```{.yml} -parameters: - accessManager: - leeway: 60 - issuer: https://recodex.projekty.ms.mff.cuni.cz - audience: https://recodex.projekty.ms.mff.cuni.cz - expiration: 86400 # 24 hours in seconds - usedAlgorithm: HS256 - allowedAlgorithms: - - HS256 - verificationKey: "recodex-123" - fileServer: - address: http://127.0.0.1:9999 - auth: - username: "user" - password: "pass" - timeouts: - connection: 500 - broker: - address: tcp://127.0.0.1:9658 - auth: - username: "user" - password: "pass" - timeouts: - ack: 100 - send: 5000 - result: 1000 - monitor: - address: wss://recodex.projekty.ms.mff.cuni.cz:4443/ws - CAS: - serviceId: "cas-uk" - ldapConnection: - hostname: "ldap.cuni.cz" - base_dn: "ou=people,dc=cuni,dc=cz" - port: 389 - security: SSL - bindName: "cunipersonalid" - fields: - email: "mail" - firstName: "givenName" - lastName: "sn" - emails: - apiUrl: https://recodex.projekty.ms.mff.cuni.cz:4000 - footerUrl: https://recodex.projekty.ms.mff.cuni.cz - siteName: "ReCodEx" - githubUrl: https://github.com/ReCodEx - from: "ReCodEx " - failures: - emails: - to: "Admin Name " - from: %emails.from% - 
subjectPrefix: "ReCodEx Failure Report - " - forgottenPassword: - redirectUrl: "https://recodex.projekty.ms.mff.cuni.cz/ - forgotten-password/change" - tokenExpiration: 600 # 10 minues - emails: - from: %emails.from% - subjectPrefix: "ReCodEx Forgotten Password Request - " - mail: - smtp: true - host: "smtp.ps.stdin.cz" - port: 587 - username: "user" - password: "pass" - secure: "tls" - context: - ssl: - verify_peer: false - verify_peer_name: false - allow_self_signed: true -doctrine: - user: "user" - password: "pass" - host: localhost - dbname: "recodex-api" -``` - diff --git a/Web-application.md b/Web-application.md index 0162660..84e2eb9 100644 --- a/Web-application.md +++ b/Web-application.md @@ -153,20 +153,3 @@ $ npm run exportStrings ``` This will create *JSON* files with the exported strings for the *'en'* and *'cs'* locale. If you want to export strings for more languages, you must edit the `/manageTranslations.js` script. The exported strings are placed in the `/src/locales` directory. - -## Configurable items - -Description of configurable options. Bold are required values, optional ones are in italics. - -- **NODE_ENV** -- mode of the server -- **API_BASE** -- base address of API server, including port and API version -- **PORT** -- port where the app is listening -- _WEBPACK_DEV_SERVER_PORT_ -- port for webpack dev server when running in development mode. Default one is 8081, this option might be useful when this port is necessary for some other service. - -#### Example configuration file -``` -NODE_ENV=production -API_BASE=https://recodex.projekty.ms.mff.cuni.cz:4000/v1 -PORT=8080 -``` - diff --git a/Worker.md b/Worker.md index c5a5746..1e55aa8 100644 --- a/Worker.md +++ b/Worker.md @@ -82,121 +82,6 @@ Isolate is executed in separate Linux process created by `fork` and `exec` syste Sandbox in general has to be command line application taking parameters with arguments, standard input or file. Outputs should be written to file or standard output. 
There are no other requirements, worker design is very versatile and can be adapted to different needs. -## Configuration and usage - -Following text describes how to set up and run **worker** program. It assumes the required binaries are installed. Also, using systemd is recommended for best user experience, but it is not required. Almost all modern Linux distributions are using systemd nowadays. - -### Default worker configuration - -Worker should have some default configuration which is applied to worker itself or may be used in given jobs (implicitly if something is missing, or explicitly with special variables). This configuration should be hardcoded and can be rewritten by explicitly declared configuration file. Format of this configuration is yaml with similar structure to job configuration. - -#### Configuration items - -Mandatory items are bold, optional italic. - -- **worker-id** -- unique identification of worker at one server. This id is used by _isolate_ sandbox on linux systems, so make sure to meet isolate's requirements (default is number from 1 to 999). -- _worker-description_ -- human readable description of this worker -- **broker-uri** -- URI of the broker (hostname, IP address, including port, ...) -- _broker-ping-interval_ -- time interval how often to send ping messages to broker. Used units are milliseconds. -- _max-broker-liveness_ -- specifies how many pings in a row can broker miss without making the worker dead. -- _headers_ -- map of headers specifies worker's capabilities - - _env_ -- list of environmental variables which are sent to broker in init command - - _threads_ -- information about available threads for this worker -- **hwgroup** -- hardware group of this worker. Hardware group must specify worker hardware and software capabilities and it is main item for broker routing decisions. -- _working-directory_ -- directory where all needed files will be stored. Can be the same for multiple workers on one server.
-- **file-managers** -- addresses and credentials to all file managers used (e.g. all different frontends using this worker) - - **hostname** -- URI of file manager - - _username_ -- username for http authentication (if needed) - - _password_ -- password for http authentication (if needed) -- _file-cache_ -- configuration of caching feature - - _cache-dir_ -- path to caching directory. Can be the same for multiple workers. -- _logger_ -- settings of logging capabilities - - _file_ -- path to the logging file with name without suffix. `/var/log/recodex/worker` item will produce `worker.log`, `worker.1.log`, ... - - _level_ -- level of logging, one of `off`, `emerg`, `alert`, `critical`, `err`, `warn`, `notice`, `info` and `debug` - - _max-size_ -- maximal size of log file before rotating - - _rotations_ -- number of rotations kept -- _limits_ -- default sandbox limits for this worker. All items are described in assignments section in job configuration description. If some limits are not set in job configuration, defaults from worker config will be used. In such case the worker's defaults will be set as the maximum for the job. Also, limits in job configuration cannot exceed limits from worker. - -#### Example config file - -```{.yml} -worker-id: 1 -broker-uri: tcp://localhost:9657 -broker-ping-interval: 10 # milliseconds -max-broker-liveness: 10 -headers: - env: - - c - - cpp - threads: 2 -hwgroup: "group1" -working-directory: /tmp/recodex -file-managers: - - hostname: "http://localhost:9999" # port is optional - username: "" # can be ignored in specific modules - password: "" # can be ignored in specific modules -file-cache: # only in case that there is cache module - cache-dir: "/tmp/recodex/cache" -logger: - file: "/var/log/recodex/worker" # w/o suffix - actual names will - # be worker.log, worker.1.log,...
- level: "debug" # level of logging - max-size: 1048576 # 1 MB; max size of file before log rotation - rotations: 3 # number of rotations kept -limits: - time: 5 # in secs - wall-time: 6 # seconds - extra-time: 2 # seconds - stack-size: 0 # normal in KB, but 0 means no special limit - memory: 50000 # in KB - parallel: 1 - disk-size: 50 - disk-files: 5 - environ-variable: - ISOLATE_BOX: "/box" - ISOLATE_TMP: "/tmp" - bound-directories: - - src: /tmp/recodex/eval_5 - dst: /evaluate - mode: RW,NOEXEC -``` - - - -## Sandboxes - -### Isolate - -Isolate is used as the one and only sandbox for linux-based operating systems. The project's home can be found at [GitHub](https://github.com/ioi/isolate) and more about its installation and setup can be found in the [installation](#installation) section. Isolate uses linux kernel features for sandboxing and thus its security depends on them, namely _kernel namespaces_ and _cgroups_ are used. Similar functionality can now be partially achieved with systemd. - -From the very beginning of the ReCodEx project it was clear that the Isolate sandbox would be used for the Linux environment. There is no suitable general purpose sandbox on Windows platform, so the main operating system of the whole backend should be linux-based. Set of supported operations in Isolate seems reasonable for every sandbox, so most of its functionality is accessible from job configuration. As there is no other sandbox, naming often reflects Isolate's names. However worker is prepared to run on Windows too, so integrating with other sandboxes (as libraries or commandline tools) is possible. - -Isolate as a sandbox provides a wide range of functionality which can be used to limit resources or even cut off particular resources from the sandboxed program.
There are of course basics like limiting cpu-time and memory consumption, but there are also wall-time (human perception of time) and extra-time, which is an extra limit added to other time limits to increase the chance of the sandboxed program exiting successfully. Other features include limiting stack-size and redirection of stdin, stdout or stderr from/to a file. Also worth mentioning is defining the number of processes/threads which can be created or defining environment variables which are passed to the sandboxed program. - -Filesystem handling is a chapter of its own. Isolate uses mount kernel namespace to create "virtual" filesystem which will be mounted in sandboxed program. By default there are only few read-only files/directories mapped into sandbox (described in Isolate man-page). This can of course be changed by providing additional folders as isolate parameters. By default folders are mapped as read-only but Isolate has a few access options which can be set for a mount point. - -#### Limit isolate boxes to particular cpu or memory node - -A new feature in version 1.3 is the possibility to limit an Isolate box to one or more cpu or memory nodes. This functionality is provided by _cpusets_ kernel mechanism and is now integrated in isolate. It is allowed to set only `cpuset.cpus` and `cpuset.mems` which should be just fine for sandbox purposes. As this is kernel functionality, further description can be found in the manual page of _cpuset_ or in Linux documentation in section `linux/Documentation/cgroups/cpusets.txt`. As previously stated these settings can be applied to particular isolate boxes and have to be written in isolate configuration. Standard configuration path should be `/usr/local/etc/isolate` but it may depend on your installation process. Configuration of _cpuset_ in there is really simple and is described in example below.
- ``` -box0.cpus = 0 # assign processor with ID 0 to isolate box with ID 0 -box0.mems = 0 # assign memory node with ID 0 -# if not set, linux by itself will decide where should -# the sandboxed programs run at -box2.cpus = 1-3 # assign range of processors to isolate box 2 -box2.mems = 4-7 # assign range of memory nodes -box3.cpus = 1,2,3 # assign list of processors to isolate box 3 -``` - -- **cpuset.cpus:** Cpus limitation will restrict sandboxed program only to processor threads set in configuration. On hyperthreaded processors this means that all virtual threads are assignable, not only the physical ones. Value can be represented by single number, list of numbers separated by commas or range with hyphen delimiter. -- **cpuset.mems:** This value is particularly handy on NUMA systems which have several memory nodes. On standard desktop computers this value should always be zero because only one independent memory node is present. As stated in `cpus` limitation there can be single value, list of values separated by comma or range stated with hyphen. - -### WrapSharp - -WrapSharp is a sandbox for C# programs which is itself written in C#. We have written it as a proof of concept sandbox for use in Windows environment. However, it is not properly tested and integrated to the worker yet. Security audit should be done before using in production. After that, with just a little bit of effort integrating into worker there can be a running sandbox for C# programs on Windows system. - - ## Cleaner ### Description @@ -211,15 +96,3 @@ There is a bit of catch with cleaner service, to work properly, server filesyste Another possibility seems to be to update last modified timestamp when accessing the file. This timestamp is used in most major filesystems, so there are less issues with compatibility than last access timestamp. The modified timestamp then must be updated by workers at each access, for example using `touch` command or similar.
The final decision on which of these ways is better will be made after practical experience of running a production system. -### Configuration and usage - -#### Configuration items -- **cache-dir** -- directory which cleaner manages -- **file-age** -- age in seconds after which files are considered outdated and will be deleted - -#### Example configuration -```{.yml} -cache-dir: "/tmp" -file-age: "3600" # in seconds -``` -
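
The two cleaner options in the removed section above (`cache-dir` and `file-age`) suggest a simple cleanup pass. The following is only a minimal sketch of that idea, assuming files are judged by their last-modified timestamp (one of the two approaches discussed in the cleaner description). The function names `find_outdated` and `clean` are hypothetical and are not taken from the ReCodEx cleaner itself.

```python
import os
import time


def find_outdated(cache_dir, file_age, now=None):
    """Return sorted paths of regular files in cache_dir older than file_age seconds.

    Assumption: age is measured from the last-modified timestamp (st_mtime).
    """
    now = time.time() if now is None else now
    outdated = []
    for entry in os.scandir(cache_dir):
        # Skip subdirectories and symlinks; only plain cached files are considered.
        if not entry.is_file(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > file_age:
            outdated.append(entry.path)
    return sorted(outdated)


def clean(cache_dir, file_age, now=None):
    """Delete outdated files and return the list of removed paths."""
    removed = find_outdated(cache_dir, file_age, now)
    for path in removed:
        os.remove(path)
    return removed
```

With the example configuration, a call such as `clean("/tmp", 3600)` would remove regular files directly in `/tmp` that were last modified more than an hour ago. Note that `file-age` is quoted as a string in the example YAML, so a real loader would have to convert it to a number first.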