# Installation

Installing the whole ReCodEx solution is a complex process. Good Unix skills and a basic knowledge of the project architecture are recommended. GNU/Linux distributions differ in package management, naming conventions and release policies, so it is impossible to cover every variant. We picked one distribution that is fully supported by the automatic installation script. There are also steps for manual installation of all components, which should work on most Linux distributions.

The distribution of our choice is CentOS, currently in version 7. It is a well-known server distribution derived from the enterprise distribution from Red Hat, so it is a stable and widely used system with long-term support. The [EPEL](https://fedoraproject.org/wiki/EPEL) additional repositories from the Fedora project add newer versions of some packages to CentOS, which lets us use a current environment. Also, _rpm_ packages are much easier to build than _deb_ packages (for example from Python sources).

The main rival of CentOS among server distributions is Debian. We run one instance of ReCodEx on Debian too. You need the _testing_ repositories to get reasonably recent package versions. Mixing repositories can easily break your system, so create the file `/etc/apt/apt.conf` with the content `APT::Default-Release "stable";`. After you add the testing repositories to `/etc/apt/sources.list`, you can install packages from them, for example `$ sudo apt-get -t testing install gcc`.

Some components can also run in a Windows environment. However, setting up Windows is somewhat painful and ReCodEx is not intended to be run this way. Only the worker component may need to run on Windows, so we provide a clickable installer including dependencies. For the record, all components should be able to run on Windows; only the broker was not tested and may require small tweaks to work properly.
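
As a rough sketch of the Debian setup described above, the two files could look like the following. The mirror URL and the `main` component are assumptions for illustration only; use the entries that match your own mirror and release.

Content of `/etc/apt/apt.conf`:

```
// Keep "stable" as the default release, so that packages from testing
// are installed only when explicitly requested with "-t testing".
APT::Default-Release "stable";
```

Additional entry in `/etc/apt/sources.list`:

```
# Assumed mirror; the existing stable entries stay in place.
deb http://deb.debian.org/debian testing main
```

After editing the files, run `sudo apt-get update` so that the package lists from the testing repository are available.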

## Ansible installer

DEPRECATED - Ansible installer is no longer working!

Automatic installation uses a set of Ansible scripts. Ansible is one of the best known and most widely used tools for automatic server management. It only requires SSH access to the server and Ansible installed on the client machine. The rest of this section assumes basic Ansible knowledge. For more information, check the Ansible [documentation](http://docs.ansible.com/ansible/intro.html).

All Ansible scripts are located in the _utils_ repository, in the _installation_ [directory](https://github.com/ReCodEx/utils/tree/master/installation). The Ansible files are fairly self-describing and can also be used as templates for installation on different systems. Before the installation itself, two files must be edited to set the addresses of the hosts and the values of some variables.

### Hosts configuration

First, set the IP addresses of your computers. Common practice is to keep multiple definition files, for example one for development and another for production. An example configuration is in the _development_ file. Each component of the ReCodEx project can be installed on a different server. Hosts can be specified as hostnames or IP addresses, optionally followed by a colon and the SSH port.

Shortened example of a hosts configuration:

```
[workers]
127.0.0.1:22

[broker]
127.0.0.1:22

[all:children]
workers
broker
```

### Variables

Configurable variables are stored in the _group_vars/all.yml_ file. The syntax is one key-value pair per line, separated by a colon. Values with brief descriptions:

- _source_dir_ -- Directory where all sources from GitHub are stored. Defaults to `/opt/recodex`.
- _mysql_root_password_ -- Password of the root user of the MySQL database. It will be set after installation and saved to the `/root/.my.cnf` file.
- _mysql_recodex_username_ -- MySQL username for ReCodEx API access.
- _mysql_recodex_password_ -- Password for the user above.
- _admin_email_ -- Email of the administrator. Used when configuring the Apache web server.
- _recodex_hostname_ -- Hostname where the API and web app will be accessible. For example "recodex.projekty.ms.mff.cuni.cz".
- _webapp_node_addr_ -- IP address of the NodeJS server running the web app. Defaults to "127.0.0.1" and should not be changed.
- _webapp_node_port_ -- Port for the above.
- _webapp_public_addr_ -- Public address where the web server for the web app will listen. Defaults to "*".
- _webapp_public_port_ -- Port for the above.
- _webapp_firewall_ -- Open the port for the web app in the firewall, values "yes" or "no".
- _webapi_public_endpoint_ -- Public URL where the API will be running, for example "https://recodex.projekty.ms.mff.cuni.cz:4000/v1".
- _webapi_public_addr_ -- Public address where the web server for the API will listen. Defaults to "*".
- _webapi_public_port_ -- Port for the above.
- _webapi_firewall_ -- Open the port for the API in the firewall, values "yes" or "no".
- _database_firewall_ -- Open the port for the database in the firewall, values "yes" or "no".
- _broker_to_webapi_addr_ -- Address where the API can reach the broker. A private one is recommended.
- _broker_to_webapi_port_ -- Port for the above.
- _broker_firewall_api_ -- Open the above port in the firewall, "yes" or "no".
- _broker_to_workers_addr_ -- Address where workers can reach the broker. A private one is recommended.
- _broker_to_workers_port_ -- Port for the above.
- _broker_firewall_workers_ -- Open the above port in the firewall, "yes" or "no".
- _broker_notifier_address_ -- URL (on the API) where the broker will send notifications, for example "https://recodex.projekty.ms.mff.cuni.cz/v1/broker-reports".
- _broker_notifier_port_ -- Port for the above, should be the same as for the API itself (_webapi_public_port_).
- _broker_notifier_username_ -- Username for HTTP Authentication for reports.
- _broker_notifier_password_ -- Password for HTTP Authentication for reports.
- _monitor_websocket_addr_ -- Address where the WebSocket connection from the monitor will be available.
- _monitor_websocket_port_ -- Port for the above.
- _monitor_firewall_websocket_ -- Open the above port in the firewall, "yes" or "no".
- _monitor_zeromq_addr_ -- Address where the monitor will be available on a ZeroMQ socket for receiving reports from the broker.
- _monitor_zeromq_port_ -- Port for the above.
- _monitor_firewall_zeromq_ -- Open the above port in the firewall, "yes" or "no".
- _fileserver_addr_ -- Address where the fileserver will serve files.
- _fileserver_port_ -- Port for the above.
- _fileserver_firewall_ -- Open the above port in the firewall, "yes" or "no".
- _fileserver_username_ -- Username for HTTP Authentication for access to the fileserver.
- _fileserver_password_ -- Password for HTTP Authentication for access to the fileserver.
- _worker_cache_dir_ -- File cache storage for workers. Defaults to "/tmp/recodex/cache".
- _worker_cache_age_ -- How long fetched files are kept in the worker cache, in seconds.
- _isolate_version_ -- Git tag of the Isolate version the worker depends on.

### Installation itself

With CentOS installed on your computers and the configuration modified, it is time to run the installation.

```
$ ansible-playbook -i development recodex.yml
```

This command installs all components of ReCodEx onto the machines listed in the _development_ file. To install only selected parts of the project, use the component's YAML file instead of _recodex.yml_.

Ansible expects password-less access to the remote machines. If you do not have such a setup, use the options `--ask-pass` and `--ask-become-pass`.

## Manual installation

### Monitor

The monitor requires several packages.
All of them are listed in the _requirements.txt_ file in the repository and can be installed with the `pip` package manager:

```
$ pip install -r requirements.txt
```

**Description of dependencies:**

- zmq -- binding to the ZeroMQ framework
- websockets -- framework for communication over WebSockets
- asyncio -- library for fast asynchronous operations
- pyyaml -- parsing YAML configuration files
- argparse -- parsing command line arguments

Installation provides the following files:

- `/usr/bin/recodex-monitor` -- simple startup script located in PATH
- `/etc/recodex/monitor/config.yml` -- configuration file
- `/etc/systemd/system/recodex-monitor.service` -- systemd startup script
- code files are installed in a location that depends on your system settings, typically `/usr/lib/python3.5/site-packages/monitor/` or similar

The systemd script runs the monitor binary as a dedicated _recodex_ user, so the `postinst` script creates a user and group of this name. Ownership of the configuration file is also granted to that user.

- RPM distributions can build and install a binary package. This can be done as follows:
    - run the command

      ```
      $ python3 setup.py bdist_rpm --post-install ./install/postinst
      ```

      to generate a binary `.rpm` package, or download a precompiled one from the releases tab of the monitor GitHub repository (it is an architecture-independent package)
    - install the package using

      ```
      # yum install ./dist/recodex-monitor--1.noarch.rpm
      ```

- Other Linux distributions can install the monitor directly:

  ```
  $ python3 setup.py install --install-scripts /usr/bin
  # ./install/postinst
  ```

#### Usage

The preferred way to start the monitor as a service is via systemd, as with the other parts of the ReCodEx solution.

- Running the monitor is fairly simple:

  ```
  # systemctl start recodex-monitor.service
  ```

- The current state can be obtained by

  ```
  # systemctl status recodex-monitor.service
  ```

  You should see a green **Active (running)**.
- Setting up the monitor to start on system startup:

  ```
  # systemctl enable recodex-monitor.service
  ```

Alternatively, the monitor can be started directly from the command line by specifying the path to the configuration file. Note that this command does not start the monitor as a daemon.

```
$ recodex-monitor -c /etc/recodex/monitor/config.yml
```

## Security

One of the most important aspects of a ReCodEx instance is security. It is crucial to keep the gathered data safe and to prevent unauthorized users from modifying restricted information. Here is a short list of recommendations for keeping a running ReCodEx instance safe.

- Secure the MySQL installation. The installation script does not perform any security actions, so please run at least the `mysql_secure_installation` script on the database computer.
- Get an HTTPS certificate and set it up in Apache for the web application and the API. The monitor should also be proxied through the web server with a valid certificate. You can get a free DV certificate from [Let's Encrypt](https://letsencrypt.org/). Do not forget to set up automatic renewal!
- Hide the broker, workers and fileserver behind a firewall, private subnet or IPsec tunnel. They do not need to be reachable from the public internet, so it is better to keep them isolated.
- Keep your server updated and well configured. For automatic installation of security updates on a CentOS system, refer to the `yum-cron` package. Configure SSH and Apache to use only strong ciphers; some recommendations can be found [here](https://bettercrypto.org/static/applied-crypto-hardening.pdf).
- Do not publish credentials that are actually in use. For example, do not commit your passwords (in the Ansible variables file) to GitHub.
- Regularly check logs for anomalies.
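
For the automatic updates mentioned in the list above, a minimal sketch of the relevant part of `/etc/yum/yum-cron.conf` on CentOS 7 might look like this. The chosen values are an assumption about a reasonable policy, not settings taken from the ReCodEx installation scripts, so review them against your own update policy before use.

```
[commands]
# Which updates to consider; "default" means all available updates.
update_cmd = default
# Download updates when they become available.
download_updates = yes
# Install the downloaded updates automatically.
apply_updates = yes
```

The `yum-cron` service also has to be enabled and started for this configuration to take effect.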