This tutorial describes a complete installation and configuration of ReCodEx on CentOS 8 system as it was conducted in August 2020. Details of this description may vary for other systems (RPM packages are released for CentOS and Fedora only) and possibly also change in the future.

We are trying to keep this tutorial accessible to Linux newcomers, but some admin skills are still required.

For more details about the individual modules, please see their readme pages.

Btw. there used to be a File Server module as well, but we got the file management integrated in Core module.

Prerequisites

Before we get started, make sure that you are using a file system that supports ACLs. If you are doing a fresh install of a modern distro, it should not be a problem. Filesystems like xfs and zfs always use ACLs. Older filesystems like ext4 use ACLs unless you explicitly disable them. However, if you are using a more obscure FS, make sure ACLs are in place (ReCodEx will work without ACLs, but all recodex-core CLI commands would have to be executed under the apache user).

After minimal installation of CentOS 8 (with enabled EPEL and Power Tools repos) install the following:

  • Apache 2.4 (httpd service), configure it, install SSL certificates, and open firewall for HTTPS
  • MariaDB (10.3 or newer); it is also recommended to secure the DB properly (set root password etc.)
  • PHP 7.3 (actually this version needs to be exact, we have not fixed issues of 7.4 yet)
  • Node.js 12.x or newer (14.x recommended)

A few tips for installing the database:

# dnf install httpd mariadb-server
# mysql_secure_installation

You will also need a database and a dedicated database user for ReCodEx. You can create them by running mysql -uroot -p and then executing these SQL statements:

CREATE DATABASE `recodex`;
CREATE USER 'recodex'@'localhost' IDENTIFIED BY 'someSecretPasswordYouNeedToSetYourself';
GRANT ALL PRIVILEGES ON `recodex`.* TO 'recodex'@'localhost';

Configuring the PHP repository:

# dnf install dnf-utils http://rpms.remirepo.net/enterprise/remi-release-8.rpm
# dnf module enable php:remi-7.3

Installing Node.js:

# dnf -y install curl
# curl -sL https://rpm.nodesource.com/setup_14.x | bash -
# dnf -y install nodejs

You may check the right version is installed by typing node -v, which should report version of Node.js (starting with 14).
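If you provision servers with a script, the version check can be automated. The following is only a minimal sketch: the helper names node_major and check_node are made up for this example, and the minimum version (12) is taken from the prerequisites above.

```shell
# node_major: print the major version from a string such as "v14.17.0"
# (the format printed by `node -v`). Hypothetical helper, not part of ReCodEx.
node_major() {
  v=${1#v}          # strip the leading "v"
  echo "${v%%.*}"   # keep everything before the first dot
}

# check_node: succeed only if the given version string is at least 12.x.
check_node() {
  [ "$(node_major "$1")" -ge 12 ]
}

# Usage on a real system (requires Node.js on PATH):
#   check_node "$(node -v)" && echo "Node.js version is OK"
```

The parsing relies only on shell parameter expansion, so it needs no extra tools.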

Install ReCodEx RPM Packages

Although individual components may run on different servers, typical deployments would install everything on a single server or split only the worker(s) to a separate server (e.g., when ReCodEx is deployed in VM and workers need to run on bare metal so they can provide more accurate measurements).

The main part of ReCodEx is installed as follows:

# dnf copr enable semai/ReCodEx
# dnf install recodex-core recodex-web recodex-broker recodex-monitor

The worker (and its utility cleaner) is installed thusly:

# dnf install recodex-worker recodex-cleaner

If successful, the following systemd services are now available:

  • recodex-web
  • recodex-broker
  • recodex-monitor

The core API runs under web server (does not need a custom service) and workers will be covered separately later.

None of these services should be running right after installation. They need to be configured first and then you can start and enable them:

systemctl start <service-name>
systemctl enable <service-name>
systemctl status <service-name>

The last command should show status of the service, which should be running.
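Once the configuration described in the following sections is done, the three services can be handled in one loop. This is only a sketch: the run helper and the DRY_RUN switch are our own convention for this example (by default the commands are only printed, set DRY_RUN=0 to actually execute them as root).

```shell
# Dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Enable, start and show the status of the services listed above.
start_recodex_services() {
  for svc in recodex-web recodex-broker recodex-monitor; do
    # `enable --now` enables the unit and starts it in a single step.
    run systemctl enable --now "$svc"
    run systemctl --no-pager status "$svc"
  done
}

# Usage:
#   start_recodex_services              # only prints the commands
#   DRY_RUN=0 start_recodex_services    # really runs them
```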

Configure The Installation

Broker

Broker configuration is in /etc/recodex/broker/config.yml. The most important thing you need to set up is the communication route to API (see notifier structure). The address must point to API URL (as configured on HTTP server) suffixed with /v1/broker-reports. The username and password must match credentials in broker > auth structure in core module configuration.

By default, broker listens on all interfaces (represented by * in address fields) both for clients and for workers. Based on your deployment, it might be advisable to restrict the listening to specific addresses, namely to 127.0.0.1 if the core/workers run on the same machine. If you choose to change the ports, do not forget to change them in core and workers configurations as well.

Monitor

Monitor configuration is in /etc/recodex/monitor/config.yml. Make sure the zeromq_uri (address and port) matches the configuration of the broker (monitor structure).

The websocket_uri sets listening address and port where the web socket server runs. Make sure the web server redirects all WebSocket requests here. For Apache with proxy module, add the following lines to the site configuration (assuming you did not change the configuration of the monitor):

ProxyPass /ws/ ws://127.0.0.1:4567/
ProxyPass /ws ws://127.0.0.1:4567/

Core (API)

Before getting started, make sure the core directory is duly configured in your web server and the server is capable of executing PHP in this directory.

The core module has configuration stored in /etc/recodex/core-api/config.local.neon. This is perhaps the most important config, so let us go through it step by step. Furthermore, it is necessary to invoke /opt/recodex-core/cleaner script after every modification of the config file. The cleaner will purge internal Nette caches with pre-generated PHP files (so they can be regenerated automatically again).

You may refer to already set parameters by using references. E.g., %webapp.address% used in a string will actually insert the webapp > address parameter value (which you need to setup in the first step of the following list).

  1. Set URL of web application and the API (webapp > address and api > address). The API URL must correspond to outside address as perceived by clients (i.e., the web application), especially when you use proxies or mod_rewrite in your setup.

  2. Set up the access manager. The issuer and audience should match your web app domain. Furthermore, the verificationKey should be set to a long (possibly random) secret string. This string is used to sign and verify security tokens. Btw. if you need to invalidate all ReCodEx security tokens at once, just modify this string (that will effectively sign everybody out, so everyone will need to go through the login process again).

  3. Configure fileStorage paths. The storages are basically directories managed by the core component. The file storage is divided into two parts: a hash storage that uses hashes as file names (for data deduplication) and a local storage for regular files. Note that both directories must be created manually and you need to make them both readable and writeable by the apache user and the recodex user (use fs ACLs). The recommended place to store the data is /var/recodex-filestorage, where you can put the hash and local subdirs.

  4. In the broker structure, auth must hold credentials that match those set in broker configuration (notifier structure) and the address must provide URL with TCP protocol pointing to clients interface of the broker.

  5. Monitor address must be set to the external address where the monitor is listening for web sockets. If you used proxy pass as suggested in Monitor configuration, the address should be something like wss://your.recodex.domain:443/ws. The 443 port makes sure the initial handshake is done in HTTPS manner by Apache.

  6. Let workers access the API to exchange files. The workerFiles module needs to be enabled and you need to set the secret auth credentials which are also set in worker configuration.

  7. Setup generated URLs in notification emails. The footerUrl should be the base URL of the web application. The from parameter configures the From: field set in all notification mails. The defaultAdminTo should be a string or an array of strings with email addresses where the error notifications will be sent. Error emails may contain sensitive information so it is highly recommended to send them to actual administrators of ReCodEx only. On the other hand, it is a good idea to have more than one administrator to reduce the chance of overlooking these failures.

  8. Set your SMTP configuration in the mail structure. SMTP is necessary so the API can send notification emails. You may temporarily use ReCodEx without emails (setting emails > debugMode to true), but emails are required for key features like resetting a forgotten password.

  9. Although this is the last step, it is perhaps the most important one. Fill in your database credentials of the recodex user (which you were supposed to create at the very beginning) into doctrine configuration (Doctrine framework is responsible for database interface in the core module).
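Steps 2 and 3 of the list above involve some manual shell work, which could look roughly like this. Treat it as a sketch under assumptions: the helper names, the DRY_RUN convention and the choice of 32 random bytes for the key are ours; the directory layout and the apache and recodex users come from the steps above.

```shell
# Dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# gen_secret: print 64 hexadecimal characters (32 random bytes) that can be
# pasted into the config as the verificationKey value.
gen_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# setup_filestorage [ROOT]: create the hash and local sub-directories and
# grant access to both the apache and recodex users via ACLs.
setup_filestorage() {
  root=${1:-/var/recodex-filestorage}
  run mkdir -p "$root/hash" "$root/local"
  # Access ACL for what exists now; the capital X adds the execute bit only
  # to directories (and to files that are already executable).
  run setfacl -R -m u:apache:rwX,u:recodex:rwX "$root"
  # Default ACL, so files created later inherit the same permissions.
  run setfacl -R -d -m u:apache:rwX,u:recodex:rwX "$root"
}

# Usage (as root):
#   gen_secret
#   DRY_RUN=0 setup_filestorage /var/recodex-filestorage
```

You can verify the result with getfacl on the storage directory.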

There are many more configuration parameters. For inspiration, you may take a look in config.neon file, but always remember to edit the config.local.neon which works as an override of config.neon. The config.neon file may be updated in the future releases. However, the list

Finally, you need to set up a database. Switch to /opt/recodex-core directory (that is important) and execute

# su -c './bin/console migrations:migrate' recodex

You may see warnings that some migrations did not execute any SQL statements, which is all right since there are no data in the DB yet. However, the whole migration process must not end with an error.

Note that the migration should be also executed after every recodex-core package upgrade!
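To make the upgrade routine harder to forget, it can be wrapped in a short script. This is a sketch only: the function name and the DRY_RUN convention are ours, and running the cache cleaner after an upgrade is our assumption (the text above requires it only after config changes), added as a precaution.

```shell
# Dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# upgrade_core: upgrade the package and bring the database schema up to date.
upgrade_core() {
  run dnf -y upgrade recodex-core
  # The migration has to be started from the core directory as the recodex user.
  run sh -c "cd /opt/recodex-core && su -c './bin/console migrations:migrate' recodex"
  # Assumption: purging the Nette caches after an upgrade does no harm.
  run /opt/recodex-core/cleaner
}

# Usage (as root):
#   DRY_RUN=0 upgrade_core
```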

Web application frontend

Web application has configuration in /etc/recodex/web-app/env.json. The most important thing to configure is API_BASE, the external URL of the API (as configured in Apache and in previous step). Also the PERSISTENT_TOKENS_KEY_PREFIX might be important, if you are running multiple installations of ReCodEx on single domain (this prefix is used for local storage and cookies, so the data of different instances are prefixed with different keys).

If you want to enable public registration to ReCodEx (use with caution), set ALLOW_NORMAL_REGISTRATION to true and do not forget to enable this feature in API (localRegistration > enabled).

The web application runs locally on Node.js server. The port is also configured in env.json. If you use Apache as your http frontend, you may need to set up a proxy for your web application:

ProxyPass / http://127.0.0.1:8080/
ProxyPassReverse / http://127.0.0.1:8080/
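If the monitor and the web application are proxied in the same site configuration, the order of the rules matters: Apache checks ProxyPass rules in configuration order and the first match wins, so the catch-all rule for / should come after the WebSocket rules. The sketch below writes such a fragment into a file of your choice (the function name is hypothetical; the ports are the ones used in this tutorial). If the API is served from the same virtual host, it would have to be excluded from the catch-all rule, which depends on your layout and is not shown here.

```shell
# write_proxy_conf FILE: write the proxy rules from this tutorial in an order
# that keeps the WebSocket rules ahead of the catch-all rule.
write_proxy_conf() {
  cat > "$1" <<'EOF'
# WebSocket rules for the monitor go first.
ProxyPass /ws/ ws://127.0.0.1:4567/
ProxyPass /ws ws://127.0.0.1:4567/
# The catch-all rule for the web application goes last.
ProxyPass / http://127.0.0.1:8080/
ProxyPassReverse / http://127.0.0.1:8080/
EOF
}

# Usage: write the fragment somewhere and include (or paste) it into the
# site configuration, then check the syntax with `apachectl configtest`.
#   write_proxy_conf /tmp/recodex-proxy.conf
```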

Finalization

When all components are working together, consider switching the logger > level from debug to info for broker and monitor (and do not forget to restart the services).

It is also recommended that you fill in the initial data into the database:

# su -c './bin/console db:fill init' recodex

After executing the fill command the database will contain:

  • Instance with administrator registered as local account with credentials user name: admin@admin.com, password: admin
  • Runtime environments which ReCodEx can handle
  • Default single hardware group which might be used for workers
  • Pipelines for runtime environments which can be used when building exercises

To modify the data further, you might want to set up some database administration tool. We are shipping Adminer along with the core module, so it should be directly available under your-api-url/adminer. If you want to disable it, configure your HTTP server to deny access to the www/adminer folder. You may use phpMyAdmin as an alternative.

Finally, there are several commands that should be executed periodically. All commands are executed similarly as db commands we used earlier:

/opt/recodex-core/bin/console command:name

The important commands are

  • notifications:assignment-deadlines will send emails to students (who actually allowed this in their configurations) about approaching deadlines. This is the most important command as it is directly related to ReCodEx operations. It is recommended to run this command every night.
  • fs:cleanup:worker will remove old files that are exchanged between core api and worker backend
  • db:cleanup:uploads will remove old uploaded files
  • db:cleanup:localized-texts will remove old texts of groups and exercises
  • db:cleanup:exercise-configs will remove old exercise configs
  • db:cleanup:exercise-files will remove unused files (attachments and test files) of deleted exercises/assignments and pipelines
  • db:cleanup:pipeline-configs will remove old pipeline configs
  • users:remove-inactive will soft-delete and anonymize users who are deemed inactive (i.e., they have not verified their credentials for a period of time that is set in core module config)
  • notifications:general-stats will send an email to all administrators with brief statistics about ReCodEx usage. It is recommended to run this command once a week (e.g., at night during the weekend).

The frequency of cleanup commands depends on your system utilization. In intensively used instances, it might be prudent to call cleanups as often as once a week. In case of mostly idle instances, a cleanup per month or even per year may be sufficient. The fs:cleanup:worker command should be called more frequently (once a day is recommended).

One option is to create a script (or multiple scripts) and schedule their execution in crontab. Do not forget to run these commands as recodex user. For example, adding the following line in /etc/crontab will execute your cleanup script every day at 3 AM.

0 3 * * *	recodex /path/to/your/cleanup/script.sh
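The script referenced in the crontab line could look like the following sketch. The split into a nightly and a weekly group is our reading of the recommendations above, not something ReCodEx prescribes, and users:remove-inactive is left out on purpose because its schedule depends on your policy. The function names and the DRY_RUN convention are ours.

```shell
# Path of the console as used earlier in this tutorial.
CONSOLE=${CONSOLE:-/opt/recodex-core/bin/console}

# Dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# commands_for GROUP: print the console commands that belong to a group.
commands_for() {
  case $1 in
    nightly) echo "notifications:assignment-deadlines fs:cleanup:worker" ;;
    weekly)  echo "db:cleanup:uploads db:cleanup:localized-texts" \
                  "db:cleanup:exercise-configs db:cleanup:exercise-files" \
                  "db:cleanup:pipeline-configs notifications:general-stats" ;;
    *) echo "usage: commands_for nightly|weekly" >&2; return 1 ;;
  esac
}

# run_group GROUP: run every command of the group, reporting failures
# on stderr without stopping at the first one.
run_group() {
  cmds=$(commands_for "$1") || return 1
  for cmd in $cmds; do
    run "$CONSOLE" "$cmd" || echo "failed: $cmd" >&2
  done
}

# A real script would end with something like:
#   DRY_RUN=0 run_group "$1"
```

With that last line in place, two crontab entries (one calling the script with nightly every day, one with weekly once a week) would cover both groups.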

Setup Workers and Environments

Before you get started: If you want to use disk quotas (recommended), you will need a FS that supports quotas. You will probably need to enable them in fstab adding usrquota,grpquota to mount parameters (do not forget to remount/reboot). After that, you should install quota tool and activate quotas (something like this):

# yum install quotatool
# mount -o remount /
# quotacheck -mavug
# quotaon --all

Worker configuration

Worker is ready to be executed in multiple instances. Each instance has config file /etc/recodex/worker/config-%i.yml, where %i is the numeric ID of the worker (first one has ID = 1). Make sure that you have a config file for each worker you want to start; however, it might be good idea to configure one worker (make sure it is running properly) and then use the first config as template for others. If you are managing many workers, some macro-preprocessing tool may be useful to manage their configurations. Each instance needs to be enabled and started (after config file is ready) as (replace 1 with other IDs for other workers):

# systemctl enable recodex-worker@1
# systemctl start recodex-worker@1
# systemctl status recodex-worker@1
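With several workers on one machine, the same commands can be issued in a loop that skips IDs without a config file. A minimal sketch, assuming the config file naming described above; the function name, the CONF_DIR override and the DRY_RUN convention are ours.

```shell
# Dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# start_workers COUNT: enable and start workers 1..COUNT, but only those
# that already have their config-<ID>.yml in place.
start_workers() {
  count=$1
  i=1
  while [ "$i" -le "$count" ]; do
    cfg="${CONF_DIR:-/etc/recodex/worker}/config-$i.yml"
    if [ -f "$cfg" ]; then
      run systemctl enable --now "recodex-worker@$i"
    else
      echo "skipping worker $i: $cfg not found" >&2
    fi
    i=$((i + 1))
  done
}

# Usage (as root):
#   DRY_RUN=0 start_workers 4
```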

Before starting the worker service, edit the config file first. The worker-id (and optionally worker-description) distinguishes individual workers in case you run multiple workers on the same machine. It is highly recommended that these IDs match the IDs of systemd services (which are also embedded in config file names).

The worker needs broker and file server to operate. Update broker-uri so it matches your broker location and port designated to workers. The file-managers structure configures the file server access provided by core API module (hostname has to be set to https URL pointing to API and HTTP auth credentials must match credentials set in workerFiles section of core configuration).

Create worker(s) working directory (e.g., /var/recodex-worker-wd) and cache directory (e.g., /var/recodex-worker-cache) and set their paths to working-directory and file-cache > cache-dir properties respectively. Both directories must be owned (and writeable) by the user, under which the worker runs (typically recodex). If you run multiple workers on one machine, these directories may be shared (recommended). On the other hand, multiple workers should not share log, so update logger > file so it holds unique file for each worker.
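If you manage many workers, one simple form of the macro-preprocessing mentioned earlier is a template with a placeholder for the worker ID. The sketch below assumes a template file in which you have put the marker @ID@ wherever the ID should appear (e.g., in worker-id and in the log file name); both the marker and the function names are hypothetical.

```shell
# render_worker_config TEMPLATE ID: print the template with every @ID@
# replaced by the numeric worker ID.
render_worker_config() {
  case $2 in
    ''|*[!0-9]*) echo "worker ID must be numeric: $2" >&2; return 1 ;;
  esac
  sed "s/@ID@/$2/g" "$1"
}

# generate_worker_configs TEMPLATE OUT_DIR COUNT: write config-1.yml up to
# config-COUNT.yml into OUT_DIR.
generate_worker_configs() {
  tpl=$1
  out_dir=$2
  count=$3
  i=1
  while [ "$i" -le "$count" ]; do
    render_worker_config "$tpl" "$i" > "$out_dir/config-$i.yml" || return 1
    i=$((i + 1))
  done
}

# Usage (template path is an example):
#   generate_worker_configs /root/worker-config.tpl /etc/recodex/worker 4
```

Because the log file name can carry the same marker, each generated config ends up with its own log file while the working and cache directories stay shared.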

The hwgroup holds an ID of hardware group -- a group of workers with the same capabilities (i.e., running on the same hardware with the same system configuration). This is by default set to the only hwgroup present in init fixtures used to initialize the database for the first time. Optionally, you might want to restrict runtime environments the worker supports (headers > env). The IDs correspond to IDs in the database and the default config file holds all specified runtimes in the initial database fill.

Based on supported runtime environments, it might be necessary to update the configuration of sandbox (limits structure) -- namely the pre-set environmental variables and mapped directories. For instance, when using java runtime, a mapping for JDK directory needs to be added to bound-directories (src refers to the directory on the real file system, dst specifies where the directory will be mounted in the sandbox), and JAVA_HOME variable holding the path (dst) needs to be added to environ-variable list. Be careful when editing PATH or LD_LIBRARY_PATH environmental variables as they may apply to multiple runtimes.

After you set all runtime environments successfully (will be explained below), it is recommended to set cleanup-submission to true, so that worker removes old data from its working directory when each evaluation concludes.

Isolate configuration

The worker uses sandbox isolate. The sandbox is installed automatically, if you installed worker as RPM package (otherwise you need to compile it manually). It has configuration in /etc/isolate/default.cf. If you are running multiple workers (or other services) on the hardware, where the testing will take place, it might be a good idea to configure sandbox CPU affinity here, so that individual workers will not share CPU cores. For example:

box1.cpus = 0
box2.cpus = 1
box3.cpus = 2-7

This configures box1 and box2 (which correspond to workers with IDs 1 and 2) as single-core boxes bound to the first and the second CPU, and box3 as a multicore box occupying the following six cores.

For greater precision, it is better not to utilize the entire CPU (all CPUs). Furthermore, we recommend turning off hyperthreading or multithreading feature. The best option is when a sandbox occupies one socket alone, but that might be a waste if you are using CPU dies with many cores (consider that when planning the purchase of your hardware).

Cleaner configuration

The cleaner module removes old records from worker cache. It is a separate module since there may be multiple workers using the same cache.

After installing, edit the /etc/recodex/cleaner/config.yml. The most important parameter is cache-dir which holds the path to worker cache directory (i.e., /var/recodex-worker-cache if you followed the previous instructions). Additionally, you may set the expiration time and logging properties.

Although recodex-cleaner is also registered as systemd service, you need to enable and start the recodex-cleaner.timer service, which runs the cleaner periodically (once a day). An alternative is to add cleaner to cron, but the systemd timer is the recommended way.

Runtime environments

Some runtimes require access to /etc directory since their compilers or interpreters have their configuration there (e.g., freepascal or PHP). It could be a security risk to map entire /etc into sandbox (although some environments have /etc mapped anyway, but only for the compilation step). Furthermore, it might be helpful to have separate configuration files for the system and for the sandbox. One possible solution is to prepare a separate config directory (e.g., /usr/etc) and place configs for the sandbox there (and add it to bound-directories list).

Runtimes bash and data-linux should work out of the box. Other runtimes will require additional installations and configurations:

C and C++

Simply install the GCC compiler.

dnf -y install gcc gcc-c++

C# and .NET Core

We are currently migrating configuration from Mono to .NET Core. This configuration applies to cs-dotnet-core environment, mono is considered deprecated.

Install .NET core SDK:

# dnf -y install dotnet-sdk-3.1

Add the following variables to environ-variable list in the worker configuration:

DOTNET_ROOT: /usr/lib64/dotnet
DOTNET_BUNDLE_EXTRACT_BASE_DIR: /box/.dotnet-bundle_extract

Download the following files:

https://raw.githubusercontent.com/ReCodEx/utils/master/runners/cs/Reader.cs
https://raw.githubusercontent.com/ReCodEx/utils/master/runners/cs/Wrapper.cs
https://raw.githubusercontent.com/ReCodEx/utils/master/runners/cs/recodex.csproj

Open your running ReCodEx instance in web browser and log in as administrator. Open Pipeline page, find the C# .NET Core Compilation pipeline, and click on the Edit button on the right. Scroll down to Supplementary files box and upload all three downloaded files here.

Free Pascal

First you need to install Free Pascal compiler:

# dnf -y install fpc

Copy /etc/fpc.cfg to /usr/etc/fpc.cfg (or your sandbox-only etc directory). Set PPC_CONFIG_PATH environmental variable (environ-variable list) to /usr/etc and make sure /usr/etc is mapped to sandbox as read-only (as explained at the beginning of runtime environments section).

Go

Simply install the Go language package.

# dnf -y install golang

Groovy

This environment requires Java runtime to be installed as well (please do before venturing forth).

Download the latest stable binary of Apache Groovy from https://groovy.apache.org/download.html (e.g., https://dl.bintray.com/groovy/maven/apache-groovy-binary-3.0.5.zip) and unzip the content into /opt/groovy.

Add environmental variable in worker configuration (environ-variable list) GROOVY_HOME with value /opt/groovy and enable access to this directory by adding it into bound-directories.

Create a symlink /opt/groovy/groovy.jar referring to /opt/groovy/libs/groovy-3.0.5.jar (where 3.0.5 is the actual version of downloaded Groovy).

Create the following symlinks in /usr/bin, each referring to the binary of the same name in /opt/groovy/bin: groovy, groovyc, groovyConsole, groovydoc, groovysh.
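The symlinks described above can be created in one go. A sketch, with a hypothetical function name; the jar path is passed as an argument so that you can point it at the file that actually exists in your Groovy download.

```shell
# link_groovy GROOVY_HOME BIN_DIR JAR: create the groovy.jar symlink and the
# symlinks for the command line tools. Runs in a subshell, so the variables
# do not leak into the calling shell.
link_groovy() (
  groovy_home=$1
  bin_dir=$2
  jar=$3
  # -f replaces a stale link left over from a previous Groovy version.
  ln -sf "$jar" "$groovy_home/groovy.jar"
  for tool in groovy groovyc groovyConsole groovydoc groovysh; do
    ln -sf "$groovy_home/bin/$tool" "$bin_dir/$tool"
  done
)

# Usage (as root), with the paths used in this section:
#   link_groovy /opt/groovy /usr/bin /opt/groovy/libs/groovy-3.0.5.jar
```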

Haskell

Simply install the Haskell compiler.

# dnf -y install ghc

Java

Install latest OpenJDK including java compiler (packages are available in EPEL):

# dnf -y install java-latest-openjdk java-latest-openjdk-devel

Download the runner source file

https://raw.githubusercontent.com/ReCodEx/utils/master/runners/java/javarun.java

Compile the source file into class file (javac ./javarun.java). Open your running ReCodEx instance in web browser and log in as administrator. Open Pipeline page and do the following with both Java execution pipelines (Java execution & evaluation [outfile] and Java execution & evaluation [stdout]): Click on Edit button on the right, scroll down to Supplementary files box and upload the compiled javarun.class.

Kotlin

This environment requires Java runtime to be installed as well (please do before venturing forth).

Download latest Kotlin compiler release from GitHub and unzip it to /opt/kotlin.

Create a symlink /usr/bin/kotlinc referring to /opt/kotlin/bin/kotlinc and make sure kotlinc has the executable flag set.

Make the /opt/kotlin directory accessible by the sandbox (add it to bound-directories list).

Node.js (JavaScript)

Install Node.js (if you are running the worker on the same machine as frontend, you have already done this).

# dnf -y install curl
# curl -sL https://rpm.nodesource.com/setup_14.x | bash -
# dnf -y install nodejs

You may check the right version is installed by typing node -v, which should report version of Node.js (starting with 14).

PHP

Install PHP (if you are running the worker on the same machine as frontend, you have already done this).

# dnf install dnf-utils http://rpms.remirepo.net/enterprise/remi-release-8.rpm
# dnf module enable php:remi-7.3
# dnf install php-cli

Copy php.ini to /usr/etc and update it (especially make sure that all needed modules are loaded explicitly). Also make sure the /usr/etc is mapped to sandbox as suggested at the beginning of this section.

Optionally, you might want to consider installing the User Operations for Zend (uopz) PECL package for PHP. This package can help you create hooks or mock existing functions, which is helpful when testing PHP assignments. If you enable this module, it is important to re-allow the source code control over the exit opcode by adding

uopz.exit = 1

into /usr/etc/php.ini.

If you share PHP installation between core API and worker, make sure the uopz extension is either disabled or allows the exit code override for the API module!!!

Prolog

Unfortunately, SWI Prolog has to be compiled manually. Do not forget to have EPEL 8 and PowerTools repositories enabled.

Install dependencies:

# dnf -y install \
  gcc \
  gcc-c++ \
  cmake3 \
  ninja-build \
  libunwind \
  freetype-devel \
  gmp-devel \
  java-1.8.0-openjdk-devel \
  jpackage-utils \
  libICE-devel \
  libjpeg-turbo-devel \
  libSM-devel \
  libX11-devel \
  libXaw-devel \
  libXext-devel \
  libXft-devel \
  libXinerama-devel \
  libXmu-devel \
  libXpm-devel \
  libXrender-devel \
  libXt-devel \
  ncurses-devel \
  openssl-devel \
  pkgconfig \
  readline-devel \
  libedit-devel \
  unixODBC-devel \
  zlib-devel \
  uuid-devel \
  libarchive-devel \
  libyaml-devel

Download SWI-Prolog 8.0.1 and compile it (it is recommended to do this in a dedicated directory as a regular user):

$ wget http://www.swi-prolog.org/download/stable/src/swipl-8.0.1.tar.gz
$ tar xvf ./swipl-8.0.1.tar.gz
$ cd ./swipl-8.0.1
$ mkdir build
$ cd ./build
$ cmake3 ..
$ make
$ sudo make install

For technical reasons, the ReCodEx requires that the swipl is present in /usr/bin. The simplest way is to create symlink /usr/bin/swipl that refers to /usr/local/lib/swipl/bin/x86_64-linux/swipl.

Download recodex-init.pl, recodex-swipl-wrapper.sh, and recodex-wrapper.pl from

https://github.com/ReCodEx/utils/tree/master/runners/prolog-compilation

Open your running ReCodEx instance in web browser and log in as administrator. Open Pipeline page, find the Prolog Compilation pipeline, and click on the Edit button on the right. Scroll down to Supplementary files box and upload all three downloaded files here.

Python

Python is executed in a wrapper script that handles exceptions and translates them in exit codes (which is necessary for error reporting). Download

https://raw.githubusercontent.com/ReCodEx/utils/master/runners/py/runner.py

Open your running ReCodEx instance in web browser and log in as administrator. Open Pipeline page and do the following with both Python pipelines (Python execution & evaluation [outfile] and Python execution & evaluation [stdout]): Click on Edit button on the right, scroll down to Supplementary files box and upload the runner.py file.

Then install Python on the worker machine and create a virtual environment for it:

# dnf install python38
# python3.8 -m venv /var/recodex-worker-python-venv

Add the following variables to environ-variable list:

PYTHONHASHSEED: 0
PYTHONIOENCODING: utf-8
VIRTUAL_ENV: /var/recodex-worker-python-venv

and update the PATH variable by prepending /var/recodex-worker-python-venv/bin:.

Finally, register directory mapping in bound-directories list:

- src: "/var/recodex-worker-python-venv"

Rust

Simply install the Rust compiler:

# dnf -y install rust

Scala

Simply install Scala runtime and compiler:

# dnf -y install scala

Please note that Scala requires a Java runtime to work. In CentOS 8, Scala currently installs Java 8 as a dependency, but the Java runtime environment (see above) requires at least Java 11 to work properly. However, if you install both Java versions, you can easily configure the system to use the latest Java as default:

# alternatives --config java

Are we done?

Not quite yet. There are at least two things you should consider.

Backup

The data are stored at two places -- in the database and in the file storage. The database can be easily dumped thusly:

mysqldump --default-character-set=utf8mb4 -uroot -p recodex > /path/to/backup/file.sql

Where recodex is the database name.

Furthermore, you need to back up the file storage hash and local directories (which you may have located at /var/recodex-filestorage as suggested), preferably using tools like rsync.

It might be a good idea to perform a backup every night and to keep several last copies. For instance, in our pilot setup, we keep last 7 daily backups, all backups made on the 1st of every month for the last year, and all backups made on January 1st of every year.
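The daily part of such a retention scheme can be sketched as follows. The file naming (recodex-YYYY-MM-DD.sql) and the function name are assumptions made for this example; the monthly and yearly copies would need additional rules that are not shown.

```shell
# prune_daily DIR KEEP: delete all but the KEEP newest dumps in DIR.
# The file names embed ISO dates, so lexical order equals chronological order.
prune_daily() (
  dir=$1
  keep=$2
  ls -1 "$dir"/recodex-????-??-??.sql 2>/dev/null |
    sort -r |                        # newest first
    tail -n +"$((keep + 1))" |       # everything after the KEEP newest
    while IFS= read -r f; do
      rm -f -- "$f"
    done
)

# Usage after the nightly dump, e.g.:
#   mysqldump ... recodex > /path/to/backup/recodex-$(date +%F).sql
#   prune_daily /path/to/backup 7
```

Files that do not match the naming pattern are never touched, so monthly or yearly copies can live in the same directory under a different name.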

Monitoring

For mission-critical systems, some form of monitoring is a must. Depending on your needs, the monitoring could be as simple as setting up a script that uses ping or wget to verify that ReCodEx is running, or you can use more sophisticated tools.
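The simple variant could be a script like the one below, run from cron on another machine. It is a sketch with hypothetical function names: it uses curl when available and falls back to wget, and it only tells you whether the URL answered successfully within 10 seconds.

```shell
# fetch URL: succeed if the URL can be retrieved; nothing is stored.
fetch() {
  if command -v curl >/dev/null 2>&1; then
    curl -fsS -o /dev/null --max-time 10 "$1" 2>/dev/null
  else
    wget -q --spider --tries=1 --timeout=10 "$1" 2>/dev/null
  fi
}

# check_url URL: print "UP URL" or "DOWN URL"; the exit status mirrors it.
check_url() {
  if fetch "$1"; then
    echo "UP $1"
  else
    echo "DOWN $1"
    return 1
  fi
}

# Usage (the domain is a placeholder):
#   check_url https://your.recodex.domain/ || echo "ReCodEx needs attention"
```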

For our main instance, we use Prometheus with node_exporter and mysqld_exporter to gather performance statistics and Grafana to visualize them.

Are we done now?!

Yes, we are. Enjoy and handle with care.