# Fileserver
The fileserver is a simple frontend to a disk storage space that holds auxiliary files for assignments, archives with job configuration, files submitted by users, and evaluation results. These are the only files the backend needs in order to run, so a dedicated fileserver makes it possible to test the backend separately. In addition, one fileserver instance can be shared among multiple API instances (with the same broker), so common files do not need to be duplicated in each API instance.
One exception is that important files with the character of a database entry (not stored in the database because of their size) are stored directly in the filesystem of the API server. This does not diminish the benefit of a separate fileserver. From a security point of view, the fileserver should be completely isolated from the public internet to keep the data safe, whereas the API server must be public by its nature.
For a description of the communication protocol used by the frontend
and workers, see the [Communication](#communication) chapter.
## Description
The storage is implemented in Python, using the Flask web framework. This
particular implementation evolved from a simple mock fileserver we used in the early
stages of development. It proved to be very reliable, so we decided to keep the fileserver
as a separate component instead of integrating this functionality into the main API.
### Internal storage structure
The fileserver stores its data in a configurable filesystem folder. This folder has
the following subfolders:
- `./submissions/<id>` -- folders that contain files submitted by users
(students' solutions to assignments). `<id>` is an identifier received from
the ReCodEx API.
- `./submission_archives/<id>.zip` -- ZIP archives of all submissions. These are
created automatically when a submission is uploaded. `<id>` is an identifier
of the corresponding submission.
- `./tasks/<subkey>/<key>` -- supplementary task files (e.g. test inputs and
outputs). `<key>` is a hash of the file content (SHA-1 is used) and `<subkey>`
is its first character (this is an attempt to avoid one large, flat directory).
## Installation
To install and use the fileserver, you need Python 3 with the `pip` package manager, which is used to install the dependencies. From the cloned repository, run the following command:
```
$ pip install -r requirements.txt
```
That is all. The fileserver does not need any special installation. It is possible to build and install an _rpm_ package, or to install it without packaging in the same way as the monitor, but this is optional. Such an installation provides the `recodex-fileserver` script in your `PATH`. No systemd unit files are provided, because the configuration and usage of the fileserver differ considerably from our other Python components.
## Configuration and usage
There are several ways of running the ReCodEx fileserver. We will cover two
typical use cases.
### Running in development mode
For simple development usage, it is possible to run the fileserver from the command
line. The allowed options are described below.
```
usage: fileserver.py [--directory WORKING_DIRECTORY]
{runserver,shell} ...
```
- **runserver** argument starts the Flask development server (i.e. `app.run()`). A port number can be given as an additional argument.
- **shell** argument instructs Flask to run a Python shell inside the application context.
A simple development server on port 9999 can be started with
```
$ python3 fileserver.py runserver 9999
```
When run like this, the fileserver creates a temporary directory in which it stores all files; the directory is deleted when the server exits.
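
The temporary-directory behaviour can be illustrated with the standard library alone. The following is a minimal sketch, not the actual `fileserver.py` logic: the function name `run_with_temporary_storage` and the `serve` callback are hypothetical stand-ins for whatever starts the server, and creating the three subfolders up front is an assumption.

```python
import tempfile
from pathlib import Path


def run_with_temporary_storage(serve):
    """Call serve(working_directory) using a directory that is removed afterwards."""
    with tempfile.TemporaryDirectory(prefix="fileserver-") as tmp:
        working_directory = Path(tmp)
        # Subfolder names follow the storage structure described earlier.
        for name in ("submissions", "submission_archives", "tasks"):
            (working_directory / name).mkdir()
        # The directory and everything in it is deleted when this block exits.
        return serve(working_directory)
```

Anything uploaded during such a session is lost on exit, which is convenient for testing but unsuitable for real deployments.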
### Running as WSGI script in a web server
If you need features such as HTTP authentication (recommended) or efficient serving of static
files, run the app in a full-fledged web server (such as
Apache or Nginx) using WSGI. The Apache configuration can be generated by the `mkconfig.py` script from the repository.
```
usage: mkconfig.py apache [-h] [--port PORT] --working-directory
WORKING_DIRECTORY [--htpasswd HTPASSWD]
[--user USER]
```
- **port** -- port where the fileserver should listen
- **working-directory** -- directory where the files should be stored
- **htpasswd** -- path to user file for HTTP Basic Authentication
- **user** -- user under which the server should be run
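
For readers unfamiliar with WSGI: the web server imports a Python module and calls a single callable for every request. A Flask application object is itself such a callable. The sketch below uses only the standard library and is not the fileserver's code; `application` and `call_once` are illustrative names, and the response body is a placeholder.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable; a Flask app object exposes the same interface."""
    body = b"fileserver placeholder\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_once(app, path="/"):
    """Invoke a WSGI app the way a server would and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a minimal, valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Calling `call_once(application)` exercises the callable without any web server, which is a quick way to check an app before wiring it into Apache or Nginx.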
### Running using uWSGI
Another option is to run the fileserver as a standalone app via the uWSGI service. The setup is also quite simple, and the configuration file can likewise be generated by the `mkconfig.py` script.
1. (Optional) Create a user for running the fileserver
2. Make sure that your user can access your clone of the repository
3. Run `mkconfig.py` script.
```
usage: mkconfig.py uwsgi [-h] [--user USER] [--port PORT]
[--socket SOCKET]
--working-directory WORKING_DIRECTORY
```
- **user** -- user under which the server should be run
- **port** -- port where the fileserver should listen
- **socket** -- path to UNIX socket where the fileserver should listen
- **working-directory** -- directory where the files should be stored
4. Save the configuration file generated by the script and run it with uWSGI,
either directly or using systemd. This depends heavily on your distribution.
5. To integrate this with another web server, see the [uWSGI
documentation](http://uwsgi-docs.readthedocs.io/en/latest/WebServers.html)
Note that the way distributions package uWSGI can vary widely. In Debian 8, it is
necessary to convert the configuration file to XML and to make sure that the
python3 plugin is loaded instead of python. This plugin also uses Python 3.4,
even though the rest of the system uses Python 3.5, so make sure to install the
dependencies for the correct version.
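
If the configuration has to be converted to XML, a small script can do it. The sketch below assumes the generated file is an ini file with a single `[uwsgi]` section and no repeated option names (`configparser` rejects duplicates). The function name `uwsgi_ini_to_xml` and the option values in `SAMPLE` are made up for illustration and are not the real output of `mkconfig.py`.

```python
import configparser
import xml.etree.ElementTree as ET

# Hypothetical ini content, for illustration only.
SAMPLE = """\
[uwsgi]
plugins = python3
http-socket = :9999
uid = fileserver
"""


def uwsgi_ini_to_xml(ini_text: str) -> str:
    """Convert the [uwsgi] section of an ini file to uWSGI's XML format."""
    # Disable interpolation so '%' characters in values are kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)

    root = ET.Element("uwsgi")
    for option, value in parser["uwsgi"].items():
        # Each option becomes one child element of <uwsgi>.
        ET.SubElement(root, option).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(uwsgi_ini_to_xml(SAMPLE))
```

Check the result against the uWSGI documentation for your distribution before using it, since packaged versions may expect different options.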