From b3dfd6b3b206d0c161d83f41a6f17b9f82fb27a2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Martin=20Kruli=C5=A1?= Date: Sun, 23 Aug 2020 14:29:53 +0200 Subject: [PATCH] Updated Installation (markdown) --- Installation.md | 509 +++++++++++++++++++++++++++++++----------------- 1 file changed, 325 insertions(+), 184 deletions(-) diff --git a/Installation.md b/Installation.md index 97d32ad..e2f6571 100644 --- a/Installation.md +++ b/Installation.md @@ -1,185 +1,326 @@ -# Installation - -Installation of whole ReCodEx solution is a very complex process. It is -recommended to have good unix skills with basic knowledge of project -architecture. - -There are a lot of different GNU/Linux distributions with different package -management, naming convention and version release policies. So it is impossible -to cover all of the possible variants. We picked one distribution, which is -fully supported by automatic installation script, but there are also steps for -manual installation of all components which should work on most of the Linux -distributions. - -The distribution of our choice is CentOS, currently in version 7. It is a well -known server distribution, derived from enterprise distribution from Red Hat, so -it is very stable and widely used system with long term support. There are -[EPEL](https://fedoraproject.org/wiki/EPEL) additional repositories from Fedora -project, which adds newer versions of some packages into CentOS, which allows us -to use current environment. Also, _rpm_ packages are much easier to build than -_deb_ packages (for example from Python sources). - -The big rival of CentOS in server distributions field is Debian. We are running -one instance of ReCodEx on Debian too. You need to use _testing_ repositories to -use some decent package versions. It is easy to mess your system easily, so -create file `/etc/apt/apt.conf` with content of `APT::Default-Release -"stable";`. 
After you add testing repos to `/etc/apt/sources.list`, you can -install packages from there like `$ sudo apt-get -t testing install gcc`. - -Some components are also capable of running in Windows environment. However -setting up Windows OS is a little bit of pain and it is not supposed to run -ReCodEx in this way. Only worker component may be needed to run on Windows, so -we are providing clickable installer including dependencies. Just for info, all -components should be able to run on Windows, only broker was not tested and may -require small tweaks to work properly. - -## Security - -One of the most important aspects of ReCodEx instance is security. It is crucial -to keep gathered data safe and not to allow unauthorized users modify restricted -pieces of information. Here is a small list of recommendations to keep running -ReCodEx instance safe. - -- Secure MySQL installation. The installation script does not do any security - actions, so please run at least `mysql_secure_installation` script on database - computer. -- Get HTTPS certificate and set it in Apache for web application and API. - Monitor should be proxied through the web server too with valid certificate. - You can get free DV certificate from [Let's - Encrypt](https://letsencrypt.org/). Do not forget to set up automatic - renewing! -- Hide broker, workers and fileserver behind firewall, private subnet or IPsec - tunnel. They are not required to be reached from public internet, so it is - better keep them isolated. -- Keep your server updated and well configured. For automatic installation of - security updates on CentOS system refer to `yum-cron` package. Configure SSH - and Apache to use only strong ciphers, some recommendations can be found - [here](https://bettercrypto.org/static/applied-crypto-hardening.pdf). -- Do not put actually used credentials on web, for example do not commit your - passwords (in Ansible variables file) on GitHub. -- Regularly check logs for anomalies. 
- -## Ansible installer - -DEPRECATED - Ansible installer is no longer working! - -For automatic installation is used a set of Ansible scripts. Ansible is one of -the best known and used tools for automatic server management. It is required -only to have SSH access to the server and Ansible installed on the client -machine. For further reading is supposed basic Ansible knowledge. For more info -check their [documentation](http://docs.ansible.com/ansible/intro.html). - -All Ansible scripts are located in _utils_ repository, _installation_ -[directory](https://github.com/ReCodEx/utils/tree/master/installation). Ansible -files are pretty self-describing, they can be also use as template for -installation to different systems. Before installation itself it is required to -edit two files -- set addresses of hosts and values of some variables. - -### Hosts configuration - -First, it is needed to set IP addresses of your computers. Common practice is to -have multiple files with definitions, one for development, another for -production for example. Example configuration is in _development_ file. Each -component of ReCodEx project can be installed on different server. Hosts can be -specified as hostnames or IP addresses, optionally with port of SSH after colon. - -Shorten example of hosts config: - -``` -[workers] -127.0.0.1:22 - -[broker] -127.0.0.1:22 - -[all:children] -workers -broker -``` - -### Variables - -Configurable variables are saved in _group_vars/all.yml_ file. Syntax is basic -key-value pair per line, separated by colon. Values with brief description: - -- _source_dir_ -- Directory, where to store all sources from GitHub. Defaults - `/opt/recodex`. -- _mysql_root_password_ -- Password of root user of MySQL database. Will be set - after installation and saved to `/root/.my.cnf` file. -- _mysql_recodex_username_ -- MySQL username for ReCodEx API access. -- _mysql_recodex_password_ -- Password for the user above. -- _admin_email_ -- Email of administrator. 
Used when configuring Apache - webserver. -- _recodex_hostname_ -- Hostname where the API and web app will be accessible. - For example "recodex.projekty.ms.mff.cuni.cz". -- _webapp_node_addr_ -- IP address of NodeJS server running web app. Defaults to - "127.0.0.1" and should not be changed. -- _webapp_node_port_ -- Port to above. -- _webapp_public_addr_ -- Public address, where web server for web app will - listen. Defaults to "*". -- _webapp_public_port_ -- Port to above. -- _webapp_firewall_ -- Open port for web app in firewall, values "yes" or "no". -- _webapi_public_endpoint_ -- Public URL when the API will be running, for - example "https://recodex.projekty.ms.mff.cuni.cz:4000/v1". -- _webapi_public_addr_ -- Public address, where web server for API will listen. - Defaults to "*". -- _webapi_public_port_ -- Port to above. -- _webapi_firewall_ -- Open port for API in firewall, values "yes" or "no". -- _database_firewall_ -- Open port for database in firewall, values "yes" or - "no". -- _broker_to_webapi_addr_ -- Address, where API can reach broker. Private one is - recommended. -- _broker_to_webapi_port_ -- Port to above. -- _broker_firewall_api_ -- Open above port in firewall, "yes" or "no". -- _broker_to_workers_addr_ -- Address, where workers can reach broker. Private - one is recommended. -- _broker_to_workers_port_ -- Port to above. -- _broker_firewall_workers_ -- Open above port in firewall, "yes" or "no". -- _broker_notifier_address_ -- URL (on API), where broker will send - notifications, for example - "https://recodex.projekty.ms.mff.cuni.cz/v1/broker-reports". 
-- _broker_notifier_port_ -- Port to above, should be the same as for API itself - (_webapi_public_port_) -- _broker_notifier_username_ -- Username for HTTP Authentication for reports -- _broker_notifier_password_ -- Password for HTTP Authentication for reports -- _monitor_websocket_addr_ -- Address, where WebSocket connection from monitor - will be available -- _monitor_websocket_port_ -- Port to above. -- _monitor_firewall_websocket_ -- Open above port in firewall, "yes" or "no". -- _monitor_zeromq_addr_ -- Address, where monitor will be available on ZeroMQ - socket for broker to receive reports. -- _monitor_zeromq_port_ -- Port to above. -- _monitor_firewall_zeromq_ -- Open above port in firewall, "yes" or "no". -- _fileserver_addr_ -- Address, where fileserver will serve files. -- _fileserver_port_ -- Port to above. -- _fileserver_firewall_ -- Open above port in firewall, "yes" or "no". -- _fileserver_username_ -- Username for HTTP Authentication for access the - fileserver. -- _fileserver_password_ -- Password for HTTP Authentication for access the - fileserver. -- _worker_cache_dir_ -- File cache storage for workers. Defaults to - "/tmp/recodex/cache". -- _worker_cache_age_ -- How long hold fetched files in worker cache, in seconds. -- _isolate_version_ -- Git tag of Isolate version worker depends on. - -### Installation itself - -With your computers installed with CentOS and configuration modified it is time -to run the installation. - -``` -$ ansible-playbook -i development recodex.yml -``` - -This command installs all components of ReCodEx onto machines listed in -_development_ file. It is possible to install only specified parts of project, -just use component's YAML file instead of _recodex.yml_. - -Ansible expects to have password-less access to the remote machines. If you have -not such setup, use options `--ask-pass` and `--ask-become-pass`. 
- - - +This tutorial describes a complete installation and configuration of ReCodEx on a +CentOS 8 system as it was conducted in August 2020. Details of this description +may vary for other systems (RPM packages are released for CentOS and Fedora +only) and may also change in the future. + +We are trying to keep this tutorial accessible to Linux newcomers, but some admin skills are +still required. + +For more details about the individual modules, please see their readme pages. + +* [Web Application](https://github.com/ReCodEx/web-app) +* [Core and REST API](https://github.com/ReCodEx/api) +* [Broker](https://github.com/ReCodEx/broker) +* [Monitor](https://github.com/ReCodEx/monitor) +* [File Server](https://github.com/ReCodEx/fileserver) +* [Worker](https://github.com/ReCodEx/worker) +* [Cleaner](https://github.com/ReCodEx/cleaner) + + +## Prerequisites + +Before we get started, make sure that you are using a file system that supports +ACLs. If you are doing a fresh install of a modern distro, it should not be a problem. +Filesystems like `xfs` and `zfs` always use ACLs. Older filesystems like `ext4` +use ACLs unless you explicitly disable them. However, if you are using a more +obscure filesystem, make sure ACLs are in place (ReCodEx will work without ACLs, but +all recodex-core CLI commands would have to be executed under the `apache` user). + +After a minimal installation of CentOS 8 (with the EPEL and Power Tools repos enabled), +install the following: +* Apache 2.4 (httpd service), configure it, install SSL certificates, and open + the firewall for HTTPS +* MariaDB (10.3 or newer); it is also recommended to secure the DB properly + (set root password etc.) 
+* PHP 7.3 or newer +* Node.js 12.x or newer (14.x recommended) + +A few tips for installing the web server and the database: +``` +# dnf install httpd mariadb-server +# mysql_secure_installation +``` + +ReCodEx also needs its own database and database user. You can create them by running `mysql -uroot -p` and then executing these SQL statements: +``` +CREATE DATABASE `recodex`; +CREATE USER 'recodex'@'localhost' IDENTIFIED BY 'someSecretPasswordYouNeedToSetYourself'; +GRANT ALL PRIVILEGES ON `recodex`.* TO 'recodex'@'localhost'; +``` + +Configuring the PHP repository: +``` +# dnf install dnf-utils http://rpms.remirepo.net/enterprise/remi-release-8.rpm +# dnf module enable php:remi-7.3 +``` + +Installing Node.js: +``` +# dnf -y install curl +# curl -sL https://rpm.nodesource.com/setup_14.x | bash - +# dnf -y install nodejs +``` + +You may check that the right version is installed by typing `node -v`, which should +report the Node.js version (starting with `v14`). + + + +## Install ReCodEx RPM Packages + +Although individual components may run on different servers, typical deployments +would install everything on a single server or split only the worker(s) to +a separate server (e.g., when ReCodEx is deployed in a VM and workers need to run +on bare metal so they can provide more accurate measurements). + +The main part of ReCodEx is installed as follows: +``` +# dnf copr enable semai/ReCodEx +# dnf install recodex-core recodex-web recodex-broker recodex-monitor recodex-fileserver +``` + +The worker (and its utility cleaner) is installed in the same way: +``` +# dnf install recodex-worker recodex-cleaner +``` + +**Please note that if you install the worker on another server, it is strongly +recommended to secure the connection between these two servers (by VPN or IPsec +tunnel).** + +If successful, the following systemd services are now available: +* `recodex-web` +* `recodex-broker` +* `recodex-monitor` +* `recodex-fileserver` + +The core API runs under the web server (does not need a custom service) and workers +will be covered separately later. 
+ +None of these services should be running right after installation. They need to +be **configured first** and then you can start and enable them (replace `<service-name>` with one of the services listed above): + +``` +systemctl start <service-name> +systemctl enable <service-name> +systemctl status <service-name> +``` + +The last command should show the status of the service, which should be running. + + +## Configure The Installation + +### Fileserver + +The fileserver runs under `mod_wsgi` in Apache, so its configuration is in +`/etc/httpd/conf.d/010-fileserver.conf`. There should be no need to edit this +config file. You only need to set the HTTP authentication credentials and match +them with the credentials in the core module config (`fileServer` > `auth` structure). + +The credentials are in `/etc/httpd/recodex_htpasswd`. You may set them using +`htpasswd`, an Apache CLI tool for generating auth config files. Do not forget +to restart your web server after you are done: +``` +# systemctl restart httpd +``` + +### Broker + +Broker configuration is in `/etc/recodex/broker/config.yml`. The most important +thing you need to set up is the communication route to the API (see the `notifier` +structure). The `address` must point to the API URL (as configured on the HTTP server) +suffixed with `/v1/broker-reports`. The `username` and `password` must match the +credentials in the `broker` > `auth` structure in the core module configuration. + +By default, the broker listens on all interfaces (represented by `*` in `address` +fields) both for clients and for workers. Based on your deployment, it might +be advisable to restrict listening to specific addresses, namely to +`127.0.0.1` if the core/workers run on the same machine. If you choose to change +the ports, do not forget to change them in the core and worker configurations as +well. + + +### Monitor + +Monitor configuration is in `/etc/recodex/monitor/config.yml`. Make sure the +`zeromq_uri` (address and port) matches the configuration of the broker (`monitor` +structure). + +The `websocket_uri` sets the listening address and port where the web socket server +runs. 
Make sure the web server redirects all WebSocket requests here. For Apache +with the proxy module, add the following lines to the site configuration (assuming +you did not change the configuration of the monitor): + +``` +ProxyPass /ws/ ws://127.0.0.1:4567/ +ProxyPass /ws ws://127.0.0.1:4567/ +``` + +### Core (API) + +Before getting started, make sure the core directory is duly configured in your +web server and the server is capable of executing PHP in this directory. + +The core module has its configuration stored in `/etc/recodex/core-api/config.local.neon`. +This is perhaps the most important config, so let us go through it step by step. +Furthermore, it is necessary to invoke the `/opt/recodex-core/cleaner` script after +every modification of the config file. The cleaner will purge internal Nette +caches with pre-generated PHP files (so they can be regenerated automatically +again). + +You may refer to already set parameters by using references. E.g., `%webapp.address%` +used in a string will actually insert the `webapp` > `address` parameter value (which +you need to set up in the first step of the following list). + +1. Set the URLs of the web application and the API (`webapp` > `address` and `api` > `address`). +The API URL must correspond to the external address as perceived by clients (i.e., +the web application), especially when you use proxies or `mod_rewrite` in your +setup. + +2. Set up the access manager. The `issuer` and `audience` should match your web app +domain. Furthermore, the `verificationKey` should be set to a long (preferably +random) secret string. This string is used to sign and verify security tokens. +By the way, if you need to invalidate all ReCodEx security tokens at once, just modify +this string (that will effectively sign everybody out, so everyone will need to +go through the login process again). + +3. Configure the `fileServer` connection. Under normal circumstances, you just need +to fill in the credentials you have stored in `/etc/httpd/recodex_htpasswd`. + +4. 
In the `broker` structure, `auth` must hold credentials that match those set +in the broker configuration (`notifier` structure) and the `address` must provide +a URL using the TCP protocol that points to the clients interface of the broker. + +5. Monitor `address` must be set to the external address where the monitor is +listening for web sockets. If you used the proxy pass as suggested in the Monitor +configuration, the address should be something like +`wss://your.recodex.domain:443/ws`. The 443 port makes sure the initial +handshake is done over HTTPS by Apache. + +6. Set up generated URLs in notification `emails`. The `footerUrl` should be the +base URL of the web application. The `from` parameter configures the `From:` +field set in all notification mails. The `defaultAdminTo` should be a string or +an array of strings with email addresses where the error notifications will be +sent. Error emails may contain sensitive information, so it is highly recommended +to send them to actual administrators of ReCodEx only. On the other hand, it is +a good idea to have more than one administrator to reduce the chance of +overlooking these failures. + +7. Set your SMTP configuration in the `mail` structure. SMTP is necessary so +the API can send notification emails. You may temporarily use ReCodEx without +emails (setting `emails` > `debugMode` to `true`), but emails are required for +key features like resetting a forgotten password. + +8. Although this is the last step, it is perhaps the most important one. Fill in +the database credentials of the `recodex` user (which you were supposed to +create at the very beginning) into the `doctrine` configuration (the Doctrine framework +is responsible for the database interface in the core module). + +There are many more configuration parameters. For inspiration, you may take a +look in the `config.neon` file, but always remember to edit `config.local.neon`, +which works as an override of `config.neon`. The `config.neon` file may be +updated in future releases. 
However, the list + +Finally, you need to set up the database. Switch to the `/opt/recodex-core` directory +(that is **important**) and execute +``` +# su -c 'php www/index.php migrations:migrate' recodex +``` +You may see warnings that some migrations did not execute any SQL statements, +which is all right since there are no data in the DB yet. However, the whole +migration process must not end with an error. + +**Note that the migration should also be executed after every `recodex-core` +package upgrade!** + + +### Web application frontend + +The web application is configured in `/etc/recodex/web-app/env.json`. +The most important thing to configure is `API_BASE`, the external URL of the API +(as configured in Apache and in the previous step). The +`PERSISTENT_TOKENS_KEY_PREFIX` might also be important if you are running multiple +installations of ReCodEx on a single domain (this prefix is used for local storage +and cookies, so the data of different instances are prefixed with different +keys). + +If you want to enable public registration to ReCodEx (use with caution), set +`ALLOW_NORMAL_REGISTRATION` to `true` and do not forget to enable this feature in +the API (`localRegistration` > `enabled`). + +The web application runs locally on a Node.js server. The port is also configured +in `env.json`. If you use Apache as your HTTP frontend, you may need to set up +a proxy for your web application: + +``` +ProxyPass / http://127.0.0.1:8080/ +ProxyPassReverse / http://127.0.0.1:8080/ +``` + + +### Finalization + +When all components are working together, consider switching the `logger` > `level` +from `debug` to `info` for broker and monitor (and do not forget to restart the services). 
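
The restart mentioned above can be sketched as follows. This is only an illustration, assuming the systemd unit names listed earlier (`recodex-broker`, `recodex-monitor`) and a root shell. The `SYSTEMCTL` variable and the `restart_logging_services` function are hypothetical helpers that are not part of ReCodEx; they exist so the commands can be previewed (e.g., with `SYSTEMCTL=echo`) before touching a live system.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ReCodEx): restart the services
# whose logger level was changed in their config files.
# SYSTEMCTL is an illustrative override; the real systemctl is the default.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

restart_logging_services() {
    for svc in recodex-broker recodex-monitor; do
        # Stop at the first failure so a broken config is noticed right away.
        "$SYSTEMCTL" restart "$svc" || return 1
    done
}
```

Calling `restart_logging_services` as root restarts both services; running `systemctl status` on each of them afterwards shows whether it came back up with the new configuration.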
+ +It is also recommended that you fill the database with the initial data: +``` +# su -c 'php www/index.php db:fill init' recodex +``` + +After executing the fill command, the database will contain: +* Instance with administrator registered as local account with credentials user + name: `admin@admin.com`, password: `admin` +* Runtime environments which ReCodEx can handle +* Default single hardware group which might be used for workers +* Pipelines for runtime environments which can be used when building exercises + +To modify the data further, you might want to set up a database administration +tool. We ship [Adminer](https://www.adminer.org/) along with the core +module, so it should be directly available under `your-api-url/adminer`. If you +want to disable it, configure your HTTP server to deny access to the +`www/adminer` folder. You may use [phpMyAdmin](https://www.phpmyadmin.net/) as +an alternative. + +Finally, there are several commands that should be executed periodically. +All commands are executed similarly to the db commands we used earlier: +``` +php /opt/recodex-core/www/index.php command:name +``` + +The important commands are +* `notifications:assignment-deadlines` will send emails to students (who + actually allowed this in their configurations) about approaching deadlines. + This is the most important command as it is directly related to ReCodEx operations. + It is recommended to run this command every night. 
+* `db:cleanup:uploads` will remove old uploaded files +* `db:cleanup:localized-texts` will remove old texts of groups and exercises +* `db:cleanup:exercise-configs` will remove old exercise configs +* `db:cleanup:pipeline-configs` will remove old pipeline configs +* `users:remove-inactive` will soft-delete and anonymize users who are deemed + inactive (i.e., they have not verified their credentials for a period of time + that is set in the core module config) +* `notifications:general-stats` will send an email to all administrators with + brief statistics about ReCodEx usage. It is recommended to run this command + once a week (e.g., one night over the weekend). + +The frequency of cleanup commands depends on your system utilization. In +intensively utilized instances, it might be prudent to call cleanups at least +once a week. In case of mostly idle instances, cleanup once per month or even +once per year may be sufficient. + +One option is to create a script (or multiple scripts) and schedule their +execution in crontab. Do not forget to run these commands as the `recodex` user. +For example, adding the following line to `/etc/crontab` will execute your +cleanup script every day at 3 AM under the `recodex` user: +``` +0 3 * * * recodex /path/to/your/cleanup/script.sh +``` + +--- + +## Set-up Workers and Environments + +To be documented yet...