From bcf835cfa1ac874f7586a3172ef83dc3b018a23a Mon Sep 17 00:00:00 2001
From: Martin Polanka
Date: Mon, 13 Mar 2017 11:27:35 +0100
Subject: [PATCH] new home

---
 Home.md           |   11 +
 Rewritten-docs.md | 4063 ---------------------------------------------
 2 files changed, 11 insertions(+), 4063 deletions(-)
 delete mode 100644 Rewritten-docs.md

diff --git a/Home.md b/Home.md
index 1e4076d..5863c90 100644
--- a/Home.md
+++ b/Home.md
@@ -3,6 +3,17 @@
 ## Contents
 
 * [[Home]]
+* [[Introduction]]
+* [[Analysis]]
+* [[User Documentation]]
+* [[Implementation]]
+* [[Conclusion]]
+
+### Appendices
+* [[Installation]]
+* [[System Configuration]]
+* [[Job Configuration]]
+* [[Database]]
 
 ## Separated pages
 * [[Logo]]

diff --git a/Rewritten-docs.md b/Rewritten-docs.md
deleted file mode 100644
index 7d0eeae..0000000
--- a/Rewritten-docs.md
+++ /dev/null
@@ -1,4063 +0,0 @@

# Introduction

In general, there are many different ways and opinions on how to teach people something new. However, most people agree that hands-on experience is one of the best ways to make the human brain remember a new skill. Learning must be entertaining and interactive, with fast and frequent feedback. Some areas are more suitable for this practical style of learning than others, and fortunately, programming is one of them.

University education is one of the areas where this knowledge can be applied. In computer programming, there are several requirements a program should satisfy, such as the code being syntactically correct, efficient, and easy to read, maintain and extend.

Checking programs written by students by hand takes time and requires a lot of repetitive work -- reviewing source codes, compiling them and running them through test scenarios. It is therefore desirable to automate as much of this process as possible.

The first idea of an automatic evaluation system comes from Stanford University professors in 1965. They implemented a system which evaluated code in Algol submitted on punch cards. In the following years, many similar products were written.

Nowadays, properties like correctness and efficiency can be tested automatically to a large extent. This fact should be exploited to help teachers save time for tasks such as examining bad design, bad coding habits, or logical mistakes, which are difficult to check automatically.

There are two basic ways of automatically evaluating code:

- **statically** -- by checking the source code without running it. This is safe, but not very practical.
- **dynamically** -- by running the code on test inputs and checking the correctness of its outputs. This provides a good real-world experience, but requires extensive security measures.

This project focuses on the machine-controlled part of source code evaluation. First, we observed the general concepts of grading systems and discussed the problems of the software previously used at Charles University in Prague. Then the new requirements were specified and projects with similar functionality were examined. With the knowledge acquired from these projects, we set up goals for the new evaluation system, designed the architecture and implemented a fully operational solution based on dynamic evaluation. The system is now ready for production testing at the university.

## Assignment

The major goal of this project is to create a grading application which will be used for programming classes at the Faculty of Mathematics and Physics of Charles University in Prague.
However, the application should be designed in a modular fashion so that it can be easily extended, or even modified to support other ways of usage.

The system should be capable of performing a dynamic analysis of the submitted source codes. This consists of the following basic steps:

1. compile the code and check for compilation errors
2. run the compiled program in a sandbox with predefined inputs
3. check the constraints on the amount of used memory and time
4. compare the outputs of the program with the defined expected outputs
5. award the solution with a numeric score

The whole system is intended to help both the teachers (supervisors) and the students. To achieve this, it is crucial for us to keep in mind the typical usage scenarios of the system and to try to make these tasks as simple as possible. The project has a great starting point here -- there is an old grading system currently used at the university (CodEx), so its flaws and weaknesses can be addressed. Furthermore, many teachers desire to use and test the new system, and they are willing to consult their ideas or problems with us during the development.

## Current System

The grading solution currently used at the Faculty of Mathematics and Physics of Charles University in Prague was implemented in 2006 by a group of students. It is called [CodEx -- The Code Examiner](http://codex.ms.mff.cuni.cz/project/) and it has been used, with some improvements, ever since. The original plan was to use the system only for the basic programming courses, but there was a demand to adapt it for several different courses.

CodEx is based on dynamic analysis. It features a web-based interface, where supervisors can assign exercises to their students and the students have a time window to submit their solutions. Each solution is compiled and run in a sandbox (MO-Eval). The checked metrics are the correctness of the output and compliance with the time and memory limits. The system supports programs written in C, C++, C#, Java, Pascal, Python and Haskell.

The system has a database of users. Each user is assigned a role which corresponds to their privileges. There are user groups reflecting the structure of the lectured courses.

A database of exercises (algorithmic problems) is another part of the project. Each exercise consists of a text describing the problem, a configuration of the evaluation (machine-readable instructions on how to evaluate solutions to the exercise), time and memory limits for all supported runtimes (e.g., programming languages), a configuration for calculating the final score, and a set of inputs and reference outputs. Exercises are created by instructed privileged users. Assigning an exercise to a group means choosing one of the available exercises and specifying additional properties: a deadline (optionally a second deadline), a maximum amount of points, a maximum number of submissions, and a list of supported runtime environments.
The typical use cases for the user roles are the following:

- **student**
    - create a new user account via a registration form
    - join groups (e.g., the courses they attend)
    - get assignments in the groups
    - submit a solution to an assignment -- upload one source file and start the evaluation process
    - view the results of the solution -- which parts succeeded and failed, the total number of acquired points, bonus points
- **supervisor** (similar to CodEx *operator*)
    - create a new exercise -- create the description text and the evaluation configuration (for each programming environment), upload testing inputs and outputs
    - assign an exercise to a group -- choose an exercise and set the deadlines, the number of allowed submissions, the weights of all test cases and the amount of points for correct solutions
    - modify an assignment
    - view all of the results of the students in a group
    - review the automatic solution evaluation -- view the submitted source files and optionally set bonus points (including negative points)
- **administrator**
    - create groups
    - alter user privileges -- make supervisor accounts
    - check system logs

### Exercise Evaluation Chain

The most important part of the system is the evaluation of the solutions submitted by the students. The process from the source code to the final results (score) is described in more detail below to give readers a solid overview of what happens during the evaluation.

The first thing students have to do is to submit their solutions through the web user interface. The system checks the assignment invariants (e.g., deadlines, number of submissions) and stores the submitted code. The runtime environment is automatically detected based on the extension of the input file, and a suitable evaluation configuration is chosen (one exercise can have multiple variants, for example when both C and Java are allowed). This exercise configuration is then used for the evaluation process.

There is a pool of uniform worker machines dedicated to evaluation jobs. Incoming jobs are kept in a queue until a free worker picks them up. Workers are capable of sequential evaluation of jobs, one at a time.

The worker obtains the solution and its evaluation configuration, parses it and starts executing the contained instructions. Each job should have several test cases which examine invalid inputs, corner cases and data of different sizes to estimate the program complexity. It is crucial to keep the computer running the worker secure and stable, so a sandboxed environment is used for dealing with unknown source code. When the execution is finished, the results are saved and the student is notified.

The output of the worker contains data about the evaluation, such as the time and memory spent on running the program for each test input and whether its output is correct. The system then calculates a numeric score from this data, which is presented to the student. If the solution is incorrect (e.g., it produces incorrect output or exceeds the memory or time limits), error messages are also displayed to the student.

### Possible Improvements

The current system is old, but robust. There were no major security incidents in the course of its usage. However, from the present day perspective there are several major drawbacks:

- **web interface** -- The web interface is simple and fully functional. However, the recent rapid development in web technologies provides us with new possibilities of making web interfaces.
- **public API** -- CodEx offers a very limited public XML API based on outdated technologies that is not sufficient for users who would like to create their own custom interfaces, such as a command line tool or a mobile application.
- **sandboxing** -- the MO-Eval sandbox is based on the principle of monitoring system calls and blocking the forbidden ones. This can be sufficient with single-threaded programs, but proves to be difficult with multi-threaded ones. Nowadays, parallelism is a very important area of computing, so it is required that multi-threaded programs can be securely tested as well.
- **instances** -- Different ways of using CodEx require separate installations (e.g., Programming I and II, Java, C#). This configuration is not user friendly, as students have to register in each installation separately, and it burdens administrators with unnecessary work. The CodEx architecture does not allow sharing workers between installations, which results in an inefficient use of the evaluation hardware.
- **task extensibility** -- There is a need to test and evaluate complicated programs for courses such as *Parallel programming* or *Compiler principles*, which have a more difficult evaluation chain than the simple *compilation/execution/evaluation* provided by CodEx.

## Requirements

There are many different formal requirements for the system. Some of them are necessary for any system for source code evaluation, some of them are specific to university deployment, and some of them arose during the ten-year lifetime of the old system. There are not many ways to improve the CodEx experience from the perspective of a student, but a lot of feature requests come from administrators and supervisors. The ideas were gathered mostly from our personal experience with the system and from meetings with the faculty staff who use the current system.

In general, CodEx features should be preserved, so only the differences are presented here. For clarity, all the requirements and wishes are grouped by user category.

### Requirements of The Users

- _group hierarchy_ -- creating an arbitrarily nested tree structure should be supported to keep related groups together, such as in the example below. CodEx supported only a flat group structure. A group hierarchy also allows archiving data from past courses.

```
 Summer term 2016
 |-- Language C# and .NET platform
 |   |-- Labs Monday 10:30
 |   `-- Labs Thursday 9:00
 |-- Programming I
 |   |-- Labs Monday 14:00
 ...
```

- _a database of exercises_ -- teachers should be able to filter the displayed exercises according to several criteria, for example by the supported runtime environments or by the author. It should also be possible to link exercises to a group, so that group supervisors do not have to browse hundreds of exercises when their group only uses a few of them
- _advanced exercises_ -- the system should support a more advanced evaluation pipeline than the basic *compilation/execution/evaluation* offered by CodEx
- _customizable grading system_ -- teachers need to specify the way the final score is calculated from the correctness and quality of the submissions
- _marking a solution as accepted_ -- a supervisor should be able to choose one of the submitted solutions of a student as accepted.
The score of this particular solution will then be used as the score the student receives for the given assignment, instead of the one with the highest score.
- _solution resubmission_ -- teachers should be able to edit the solutions of a student and privately resubmit them, optionally saving all results (including temporary ones); this feature can be used to quickly fix obvious errors in the solution and see if it is otherwise correct
- _localization_ -- all texts (the UI and the assignments of the exercises) should be translatable into several languages
- _formatted texts of assignments_ -- Markdown or another lightweight markup language should be supported for formatting the texts of the exercises
- _comments_ -- adding both private and public comments to exercises, tests and solutions should be supported
- _plagiarism detection_

### Administrative Requirements

- _independent user interface_ -- the system should allow the use of an alternative user interface, such as a command line client; the implementation of such clients should be as straightforward as possible
- _privilege separation_ -- there should be at least two roles -- _student_ and _supervisor_. The cases when a student of one course is also a teacher of another course must be handled correctly.
- _alternative authentication methods_ -- logging in through a university authentication system (e.g., LDAP) and potentially other services, such as GitHub or some other OAuth service, should be supported
- _querying SIS_ -- loading user data from the university information system (SIS) should be supported
- _sandboxing_ -- there should be a more advanced sandbox which supports the execution of parallel programs and an easy integration of different programming environments and tools; the sandboxed environment should have the minimum possible impact on the measured results (most importantly on the measured duration of execution)
- _heterogeneous worker pool_ -- there must be support for submission evaluation in multiple programming environments in a single installation to avoid an unacceptable workload for the administrator (i.e., maintaining a separate installation for every course) and high hardware requirements
- advanced low-level evaluation flow configuration with a high-level abstraction layer for ordinary configuration cases; the configuration should be able to express more complicated flows than just compiling a source code and running the program against the test inputs -- for example, some exercises need to build the source code with a tool, run some tests, then run the program through another tool and perform additional tests
- use of modern technologies with state-of-the-art compilers

### Non-functional Requirements

- _no installation_ -- the primary user interface of the system must be accessible on the computers of the users without the need to install any additional software except for a web browser, which is installed on almost every personal computer
- _performance_ -- the system must be ready for at least hundreds of students and tens of supervisors using it at the same time
- _automated deployment_ -- all of the components of the system must be easy to deploy in an automated fashion
- _open source licensing_ -- the source code should be released under a permissive licence allowing further development; this also applies to the libraries and frameworks used
- _multi-platform worker_ -- worker machines running Linux,
Windows, and potentially other operating systems must be supported

### Conclusion

The survey shows that there are a lot of different requirements and wishes for the new system. When the system is ready, it is likely that new ideas of how to use it will emerge, and thus the system must be designed to be easily extendable, so that these ideas can be implemented either by us or by community members. This also means that widely used programming languages and techniques should be used, so that programmers can quickly understand the code and make changes easily.

## Related work

To find out the current state in the field of automatic grading systems, we conducted a short survey of such systems at universities, programming contests, and other places where similar tools are used.

This is not a complete list of available evaluators, but only a few projects which are used these days and can serve as an inspiration for our project. Each project on the list is provided with a brief description and some of its key features.

### Progtest

[Progtest](https://progtest.fit.cvut.cz/) is a private project of [FIT ČVUT](https://fit.cvut.cz) in Prague. As far as we know, it is used for C/C++ and Bash programming and for knowledge-based quizzes. Each submitted solution can receive several bonus points or penalties, and a few hints about what is incorrect in the solution can be attached. It is very strict about source code quality; for example, the `-pedantic` option of GCC is used, Valgrind is used for the detection of memory leaks, and array boundaries are checked via the `mudflap` library.

### Codility

[Codility](https://codility.com/) is a web based solution primarily targeted at company recruiters. It is a commercial product available as SaaS and it supports 16 programming languages. The [UI](http://1.bp.blogspot.com/-_isqWtuEvvY/U8_SbkUMP-I/AAAAAAAAAL0/Hup_amNYU2s/s1600/cui.png) of Codility is [opensource](https://github.com/Codility/cui); the rest of the source code is not available. One interesting feature is the 'task timeline' -- the captured progress of writing the code for each user.

### CMS

[CMS](http://cms-dev.github.io/index.html) is an opensource distributed system for running and organizing programming contests. It is written in Python and contains several modules. CMS supports the C/C++, Pascal, Python, PHP, and Java programming languages. PostgreSQL is a single point of failure; all modules heavily depend on the database connection. Task evaluation can only be a three-step pipeline -- compilation, execution, evaluation. Execution is performed in [Isolate](https://github.com/ioi/isolate), a sandbox written by the consultant of our project, Mgr. Martin Mareš, Ph.D.

### MOE

[MOE](http://www.ucw.cz/moe/) is a grading system written in shell scripts, C and Python. It does not provide a default GUI interface; all actions have to be performed from the command line. The system does not evaluate submissions in real time; results are computed in a batch mode after the exercise deadline, using Isolate for sandboxing. Parts of MOE are used in other systems like CodEx or CMS, but the system as a whole is obsolete.

### Kattis

[Kattis](http://www.kattis.com/) is another SaaS solution. It provides a clean and functional web UI, but the rest of the application is too simple.
A nice feature is the usage of a [standardized format](http://www.problemarchive.org/wiki/index.php/Problem_Format) for exercises. Kattis is primarily used by programming contest organizers, company recruiters and also by some universities.

# Analysis

None of the existing projects we came across meets all the requirements for the new system. There is no grading system which supports an arbitrary-length evaluation pipeline, so we have to implement this feature ourselves. No existing solution is extensible enough to be used as a base for the new system. After considering all these facts, a new system has to be written from scratch. This implies that only a subset of all the features will be implemented in the first version, and more of them will come in the following releases.

The requested features are categorized based on their priority for the whole system. The highest priority is the functionality present in the current CodEx; it is the baseline for being useful in a production environment. The design of the new solution should allow the system to be extended easily. The ideas from the faculty staff have lower priority, but most of them will be implemented as part of the project. The most complicated tasks from this category are an advanced low-level evaluation configuration format, the use of modern tools, a connection to the university information system, and combining the currently separate instances into one installation of the system.

Other tasks are scheduled for the releases after the first version of the project is completed. Namely, these are a high-level exercise evaluation configuration with a user-friendly UI, SIS integration (when a public API becomes available for that system), and a command-line submission tool. Plagiarism detection is not likely to be a part of any release in the near future unless someone else implements a sufficiently capable and extendable solution -- this problem is too complex to be solved as a part of this project.

We named the new project **ReCodEx -- ReCodEx Code Examiner**. The name should point to the old CodEx; the **Re** prefix stands for redesigned, rewritten, renewed, or restarted.

At this point there is a clear idea of how the new system will be used and what the major enhancements for the future releases are. With this in mind, it is possible to sketch the overall architecture. To sum this up, here is a list of the key features of the new system. They come from the previous research of the drawbacks of the current system, reasonable wishes of the university users, and our major design choices:

- modern HTML5 web frontend written in JavaScript using a suitable framework
- REST API communicating with a persistent database, the evaluation backend, and a file server
- evaluation backend implemented as a distributed system on top of a messaging framework with a master-worker architecture
- multi-platform worker supporting Linux and Windows environments (the latter without a sandbox, as no suitable general purpose tool is available yet)
- evaluation procedure configured in a human readable text file, consisting of small tasks forming an arbitrary directed acyclic dependency graph

## Basic Concepts

The requirements specify that the user interface must be accessible to students without the need to install additional software. This immediately implies that users have to be connected to the Internet. Nowadays, there are two main ways of designing graphical user interfaces -- as a native application or as a web page.
Creating a user-friendly, multi-platform native application with a graphical UI is almost impossible because of the large number of different operating systems. Such applications typically require installation, or at least downloading their files (source codes or binaries). On the other hand, distributing a web application is easier, because every personal computer has an internet browser installed. Browsers support a (mostly) unified and standardized environment of HTML5 and JavaScript. CodEx is also a web application and everybody seems to be satisfied with this fact. There are other communication channels most programmers use, such as e-mail or git, but they are inappropriate for designing user interfaces on top of them.

It is clear from the assignment of the project that the system has to keep personalized data about its users. User data cannot be publicly available, which implies the necessity of user authentication. The application also has to support multiple ways of authentication (e.g., university authentication systems, a company LDAP server, an OAuth server) and permit adding more security measures in the future, such as two-factor authentication.

Each user has a specific role in the system. The assignment requires at least two such roles, _student_ and _supervisor_. However, it is advisable to add an _administrator_ level for users who take care of the system as a whole and are responsible for its setup, monitoring, or updates. The student role has the minimum access rights; basically, a student can only view assignments and submit solutions. Supervisors have more authority, so they can create exercises and assignments and view the results of their students. Based on the organization of the university, one more level could be introduced, _course guarantor_. However, from real experience, all duties related to teaching labs are already associated with supervisors, so this role does not seem useful. In addition, no one requested more than a three-level privilege scheme.

School labs are lessons for groups of students led by supervisors. All students in a lab get the same homework and the supervisors evaluate their solutions. This arrangement has to be transferred into the new system. The groups in the system correspond to the real-life labs. This concept was already discussed in the previous chapter, including the need for a hierarchical structure of the groups.

To allow the restriction of group members in ReCodEx, there are two types of groups -- _public_ and _private_. Public groups are open for all registered users, but to become a member of a private group, one of its supervisors has to add the user to the group. This could be done automatically at the beginning of a term with data from the information system, but unfortunately there is no API for this yet. However, creating this API is now being considered by the university staff.

Supervisors using CodEx in their labs usually set a minimum amount of points required to get the credit. These points can be acquired by solving assigned exercises. To show users whether they already have enough points, ReCodEx also supports setting this limit for groups. There are two equivalent ways of setting the limit -- an absolute number of points, or a percentage of the total possible number of points. We decided to implement the latter and we call it the threshold.

Our university has a few partners among grammar schools.
There was an idea that they could use CodEx for teaching their IT classes. To simplify the setup for them, all the software and hardware would be provided by the university as SaaS. However, CodEx is not prepared for this kind of usage and no one has the time to manage another separate instance. With ReCodEx it is possible to offer a hosted environment as a service to other institutions.

The system is divided into multiple separate units called _instances_. Each instance has its own set of users and groups. Exercises can be optionally shared. The rest of the system (the API server and the evaluation backend) is shared between the instances. To keep track of the active instances and to allow access to the infrastructure to other, paying, customers, each instance must have a valid _licence_ that allows its users to submit their solutions. Each licence is granted for a specific period of time and can be revoked in advance if the institution does not comply with the approved terms and conditions.

The problems the students solve are broken down into two parts in the system:

- the problem itself (an _exercise_),
- and its _assignment_.

Exercises only describe the problem and provide testing data together with the description of how to evaluate it. In fact, they are templates for the assignments. A particular assignment then contains the data from the exercise and some additional metadata, which can be different for every assignment of the same exercise (e.g., the deadline, the maximum number of points).

### Evaluation Unit Executed by ReCodEx

One of the bigger requests for the new system is to support a complex configuration of the execution pipeline. The idea comes from the lecturers of the Compiler Principles class, who want to migrate their semi-manual evaluation process to CodEx. Unfortunately, CodEx is not capable of such a complicated exercise setup, and none of the evaluation systems we found can handle such a task, so a design from scratch is needed.

There are two main approaches to designing a complex execution configuration. It can be composed of a small number of relatively big components, or of many more small tasks. Big components are easy to write and help keep the configuration reasonably small. However, these components are designed for current problems and they might not hold up well against future requirements. This can be solved by introducing a small set of single-purpose tasks which can be composed together. The whole configuration becomes bigger, but more flexible for new conditions. Moreover, such tasks will not require as much programming effort as bigger evaluation units. For a better user experience, configuration generators for some common cases can be introduced.

A goal of ReCodEx is to be continuously developed and used for many years. Therefore, we chose to use smaller tasks, because this approach is better for future extensibility. Observation of the CodEx system shows that only a few task types are needed. In an extreme case, only one task type is enough -- execute a binary. However, for better portability of configurations between different systems, it is better to implement a reasonable subset of operations ourselves, without directly calling binaries provided by the system. These operations are copying a file, creating a new directory, extracting an archive and so on; together they are called internal tasks. Another benefit of a custom implementation of these tasks is guaranteed safety, so no sandbox needs to be used, as in the case of external tasks.
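To illustrate this approach, the following fragment sketches how a single test could decompose into small tasks, mixing safe internal tasks with sandboxed external ones (an illustrative sketch only -- the key names and the concrete configuration syntax are settled later in this chapter):

```yaml
# hypothetical decomposition of one test into small single-purpose tasks
- task-id: fetch_input          # internal task -- safe, needs no sandbox
  cmd: fetch
  args: [test01.in]
- task-id: run_solution         # external task -- executed in a sandbox
  dependencies: [fetch_input]
  cmd: ./solution
- task-id: judge_output         # external task comparing the outputs
  dependencies: [run_solution]
  cmd: judge
  args: [test01.out, solution.out]
```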
For a job evaluation, the tasks need to be executed sequentially in a specified order. Running independent tasks in parallel is possible, but there are complications -- exact time measurement requires a controlled environment with as few interruptions as possible from other processes. It would be possible to run tasks that do not need exact time measurement in parallel, but in this case a synchronization mechanism would have to be developed to exclude parallelism for the measured tasks. Usually, there are about four times more unmeasured tasks than tasks with time measurement, but the measured tasks tend to be much longer. With [Amdahl's law](https://en.wikipedia.org/wiki/Amdahl's_law) in mind, parallelism does not seem to provide a notable benefit in the overall execution speed, and it brings trouble with synchronization. Moreover, most of the internal tasks are also limited by IO speed (most notably copying and downloading files and reading archives). However, if there are performance issues, this approach could be reconsidered, along with using a RAM disk for storing the supplementary files.

It seems that connecting the tasks into a directed acyclic graph (DAG) can handle all possible problem cases. None of the authors, supervisors and involved faculty staff can think of a problem that cannot be decomposed into tasks connected in a DAG. The goal of the evaluation is to satisfy as many tasks as possible. During the execution there are sometimes multiple choices of the next task. To control this, each task can have a priority, which is used as a secondary ordering criterion. For a better understanding, here is a small example.

![Task serialization](https://github.com/ReCodEx/wiki/raw/master/images/Assignment_overview.png)

The _job root_ task is an imaginary single starting point of each job. When the _CompileA_ task is finished, the _RunAA_ task is started (or _RunAB_, but the choice should be deterministic, given by the position in the configuration file -- tasks stated earlier should be executed earlier). The task priorities guarantee that after the _CompileA_ task, all of its dependent tasks are executed before the _CompileB_ task (they have a higher priority number). To sum up, the connections between tasks represent dependencies, and priorities can be used to order unrelated tasks, which together provides a total ordering of the tasks. For well written jobs the priorities may not be so useful, but they can help control the execution order -- for example to avoid a situation where each test of the job generates a large temporary file and there is only one valid execution order which avoids keeping all the temporary files around at the same time. The better approach is to finish the execution of one test, clean up its big temporary file and proceed with the following test. If there is an ambiguity in the task ordering at this point, the tasks are executed in the order of the input configuration.

The total linear ordering of tasks could be achieved more easily by simply executing them in the order of the input configuration. However, this structure cannot handle cases when a task fails very well; there is no easy way of telling which task should be executed next. This issue can be solved with graph structured dependencies of the tasks. In a graph structure, it is clear that all dependent tasks have to be skipped and the execution must be resumed with an unrelated task. This is the main reason why the tasks are connected in a DAG.

For grading, there are several important kinds of tasks. First, the tasks executing the submitted code need to be checked for time and memory limits.
Second, the outputs of the judging tasks need to be checked for correctness (represented by the return value or by the data on the standard output) and they must not fail. This division can be transparent for the backend; each task is executed the same way. But the frontend must know which tasks of the whole job are important, and of what kind they are. It is reasonable to keep this piece of information alongside the tasks in the job configuration, so each task can carry a label about its purpose. Unlabeled tasks have the internal type _inner_. There are four categories of tasks:

- _initiation_ -- setting up the environment, compiling code, etc.; for the users, a failure means an error in their sources which makes them incompatible with being run on the examination data
- _execution_ -- running the user code with the examination data; it must not exceed the time and memory limits; for the users, a failure means a wrong design, slow data structures, etc.
- _evaluation_ -- comparing the user outputs with the examination outputs; for the users, a failure means that the program does not compute the right results
- _inner_ -- no special meaning for the frontend; technical tasks for fetching and copying files, creating directories, etc.

Each job is composed of multiple tasks of these types, which are semantically grouped into tests. A test can represent one set of examination data for the user code. To mark the grouping, another task label can be used. Each test must have exactly one _evaluation_ task (to show the success or failure to the users) and an arbitrary number of tasks of the other types.

### Evaluation Progress State

Users want to know the state of their submitted solution (whether it is waiting in a queue, compiling, etc.). The very first idea would be to report the state based on "done" messages from the compilation, execution and evaluation, like many evaluation systems already do. However, ReCodEx has a more complicated execution pipeline, where there can be several compilation or execution tasks per test, as well as other internal tasks that control the job execution flow.

The users do not know the technical details of the evaluation, and data about the completion of tasks might confuse them. A solution is to show the users only the percentual completion of the job, without any additional information about task types. This solution works well for all of the jobs and is very user friendly.

It would be possible to expand upon this by adding a special "send progress message" task to the job configuration that would mark the completion of a specific part of the evaluation. However, the benefits of this feature are not worth the effort of implementing it and unnecessarily complicating the job configuration files.

### Results of Evaluation

The evaluation data has to be processed and then presented in a human readable form. This is done through a single numeric value called points. Additionally, the results of the job tests should be available, to know what kind of error the solution contains. For deeper debugging, the outputs of the tasks could optionally be made available to the users.

#### Scoring and Assigning Points

The overall concept of grading solutions was presented earlier. To briefly recap, the backend returns only exact measured values (used time and memory, the return code of the judging task, ...) and one value is computed on top of them. The way of this computation can differ widely between supervisors, so it has to be easily extendable. The best way is to provide an interface which can be implemented, so that any sort of magic can return the final value.

We identified several computational possibilities. There is the basic arithmetic mean, the weighted arithmetic mean, and the geometric and harmonic means of the results of each test (the result being the logical value succeeded/failed, optionally with a weight), some kind of interpolation of the amount of time used for each test, the same with the amount of memory used, and surely many others. To keep the project simple, we decided to design an appropriate interface and implement only the weighted arithmetic mean computation, which is used in about 90% of all assignments. Of course, a different scheme can be chosen for every assignment and it can also be configured -- for example, by specifying the test weights for the implemented weighted arithmetic mean. Advanced ways of computation can be implemented later, when there is a real demand for them.
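To sketch what such an interface might look like, here is a hypothetical score calculator with the weighted arithmetic mean as its first implementation (the names and the host language are illustrative only; the real implementation lives in the frontend and may differ):

```cpp
#include <vector>

// Result of one test: pass/fail plus the weight configured for the test.
struct TestResult {
    bool passed;
    double weight;
};

// Interface implemented by every scoring scheme (hypothetical sketch).
class ScoreCalculator {
public:
    virtual ~ScoreCalculator() = default;
    // Returns a score in the [0, 1] range computed from the test results.
    virtual double compute(const std::vector<TestResult>& results) const = 0;
};

// Weighted arithmetic mean -- the only scheme of the first release.
class WeightedMeanCalculator : public ScoreCalculator {
public:
    double compute(const std::vector<TestResult>& results) const override {
        double total = 0, achieved = 0;
        for (const TestResult& r : results) {
            total += r.weight;
            if (r.passed) achieved += r.weight;
        }
        return total > 0 ? achieved / total : 0;
    }
};
```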
To avoid awarding points to insufficient solutions (for example, a solution only printing "File error", which happens to be the valid answer in two tests), a minimal point threshold can be specified. If a solution is about to get fewer points than this threshold, it gets zero points instead. This functionality could be embedded into the grading computation algorithm itself, but then it would have to be present in each implementation separately, which is not maintainable. Because of this, the threshold feature is separated from the score computation.

Automatic grading cannot reflect all aspects of the submitted code -- for example, the structure of the code or the number and quality of comments. To allow supervisors to bring these manually checked aspects into the grading, there is a concept of bonus points. They can be positive or negative. Generally, the solution with the most assigned points is used for grading the particular assignment. However, if the supervisor is not satisfied with a student solution (really bad code, cheating, ...), he or she assigns the student negative bonus points. To prevent this decision from being overridden by the system choosing another solution with more points, or even by the student submitting the same code again (which evaluates to more points), the supervisor can mark a particular solution as accepted, and it is then used for grading instead of the solution with the most points.

#### Evaluation Outputs

In addition to the exact measured values used for the score calculation described in the previous chapter, there are also textual or binary outputs of the executed tasks. Knowing them helps users identify and solve their potential issues, but on the other hand it opens the possibility of leaking the input data. This may tempt students to hack their solutions to pass just the ReCodEx test cases instead of properly solving the assigned problem. The usual approach is to keep this information private. This was also strongly recommended by Martin Mareš, who has experience with several programming contests.

The only exception to hiding the logs are the compilation outputs, which can help students a lot during troubleshooting, and there is only a small possibility of input data leakage there. The supervisors have access to all of the logs and they can decide whether the students are allowed to see the compilation outputs.

Note that due to the lack of frontend developers, showing compilation logs to the students is not implemented in the very first release of ReCodEx.

### Persistence

The previous parts of the analysis show that the system has to keep some state. This could be user settings, group memberships, evaluated assignments and so on. The data has to be kept across restarts, so persistence is an important decision factor.
There are several ways to save structured data:

- plain files
- a NoSQL database
- a relational database

Another important factor is the amount and size of the stored data. Our guess is about 1,000 users, 100 exercises, 200 assignments per year and 20,000 unique solutions per year. The data is mostly structured and there are a lot of records with the same format. For example, there is a thousand users and each of them has the same attributes -- name, email, age, etc. These data items are relatively small; the name and email are short strings, the age is an integer. Considering this, relational databases or formatted plain files (CSV, for example) fit them best. However, the data often has to support searching, so it has to be sorted and allow random access for resolving cross references. Also, the addition and deletion of entries should take a reasonable time (at most logarithmic time complexity with respect to the number of saved values). This practically excludes plain files, so we decided to use a relational database.

On the other hand, there is also data with basically no structure and of a much larger size. These can be evaluation logs, sample input files for exercises or sources submitted by students. Saving this kind of data into a relational database is not appropriate. It is better to keep it as ordinary files, or to store it in some kind of NoSQL database. Since these items already are files and do not need to be backed up in multiple copies, it is easier to keep them as ordinary files in the filesystem. Also, this solution is more lightweight and does not require additional dependencies on third-party software. Files can be identified using their filesystem paths or by a unique index stored as a value in the relational database. Both approaches are equally good; the final decision depends on the actual implementation.

## Structure of The Project

The ReCodEx project is divided into two logical parts -- the *backend* and the *frontend* -- which interact with each other and together cover the whole area of code examination. Both of these logical parts are independent of each other in the sense that they can be installed on separate machines at different locations, and that either of the parts can be replaced with a different implementation -- as long as the communication protocols are preserved, the system will continue working as expected.

### Backend

The backend is the part which is responsible solely for the process of evaluating a solution of an exercise. Each evaluation of a solution is referred to as a *job*. For each job, the system expects a configuration document of the job, supplementary files for the exercise (e.g., test inputs, expected outputs, predefined header files), and the solution of the exercise (typically source codes created by a student). There might be some specific requirements for the job, such as a specific runtime environment, a specific version of a compiler, or that the job must be evaluated on a processor with a specific number of cores. The backend infrastructure decides whether it will accept or decline a job based on the specified requirements. In case it accepts the job, the job is placed in a queue and processed as soon as possible.

The backend publishes the progress of the processing of the queued jobs, and the results of the evaluations can be queried after the job processing is finished. The backend produces a log of the evaluation, which can be used for further score calculation or debugging.
To make the backend scalable, two components are necessary -- one which executes jobs and another which distributes jobs to the instances of the first one. This ensures scalability in the manner of parallel execution of numerous jobs, which is exactly what is needed. The implementations of these services are called **broker** and **worker**; the first one handles the distribution, the other one the execution.

These components should be enough to fulfill all the tasks mentioned above, but for the sake of simplicity and better communication with the frontend, two other components were added as gateways -- the **fileserver** and the **monitor**. The fileserver is a simple component whose purpose is to store the files which are exchanged between the frontend and the backend. The monitor is also quite a simple service, which is able to forward job progress data from the workers to the web application. These two additional services sit at the border between the frontend and the backend (like gateways), but logically they are more connected with the backend, so they are considered to belong there.

### Frontend

The frontend, on the other hand, is responsible for providing users with convenient access to the backend infrastructure and for interpreting the raw data from the backend evaluations.

There are two main purposes of the frontend -- holding the state of the whole system (the database of users, exercises, solutions, points, etc.) and presenting the state to users through some kind of a user interface (e.g., a web application, a mobile application, or a command-line tool). Following contemporary trends in the development of frontend parts of applications, we decided to split the frontend into two logical parts -- a server side and a client side. The server side is responsible for managing the state, and the client side gives instructions to the server side based on the inputs from the user. This decoupling gives us the ability to create multiple client side tools which may address different needs of the users.

The frontend developed as part of this project is a web application created with the needs of the Faculty of Mathematics and Physics of Charles University in Prague in mind. The users are the students and their teachers, the groups correspond to the different courses, and the teachers are the supervisors of these groups. This model is also applicable to the needs of other universities, schools, and IT companies, which can use the same system for their own purposes. It is also possible to develop a custom frontend with its own user management system and use the possibilities of the backend without any changes.

### Possible Connection

One possible configuration of the ReCodEx system is illustrated in the following picture, where there is one shared backend with three workers and two separate instances of the whole frontend. This configuration may be suitable for MFF UK -- the basic programming course and the KSP competition. But it is perhaps more likely that only the web API and the fileserver will be shared, with custom instances of the client (the web application or a custom implementation). Note that the connections between the components are not fully accurate.

![Overall architecture](https://github.com/ReCodEx/wiki/blob/master/images/Overall_Architecture.png)

In the following parts of the documentation, the backend and frontend parts will be introduced separately and covered in more detail. The communication protocol between these two logical parts will be described as well.
## Implementation Analysis

Some of the most important implementation problems and interesting observations are discussed in this chapter.

### Communication Between the Backend Components

The overall design of the project was discussed above. There is a set of components, each with its own responsibility. An important thing to design is the communication between these components. To choose a suitable protocol, there are some additional requirements that should be met:

- reliability -- if a message is sent between components, the protocol has to ensure that it is received by the target component
- working over the IP protocol
- multi-platform and multi-language usage

The TCP/IP protocol meets these conditions; however, it is quite low level and working with it usually requires working with a platform dependent, non-object API. A common way to address these issues is to use a framework which provides a better abstraction and a more suitable API. We decided to go this way, so the following options were considered:

- CORBA (or some other form of RPC) -- CORBA is a well known framework for remote procedure calls. There are multiple implementations for almost every known programming language. It fits nicely into an object oriented programming environment.
- RabbitMQ -- RabbitMQ is a messaging framework written in Erlang. It features a message broker, to which nodes connect and declare the message queues they work with. It is also capable of routing requests, which could be a useful feature for job load-balancing. Bindings exist for a large number of languages and there is a large community supporting the project.
- ZeroMQ -- ZeroMQ is another messaging framework, which differs from RabbitMQ and others (such as ActiveMQ) in its "brokerless design". This means there is no need to launch a message broker service to which clients have to connect -- ZeroMQ based clients are capable of communicating directly. However, it only provides an interface for passing messages (basically vectors of 255B strings), and any additional features such as load balancing or acknowledgement schemes have to be implemented on top of it. The ZeroMQ library is written in C++ and has a huge number of bindings.

CORBA is a large framework that would satisfy all our needs, but we are aiming towards a more loosely-coupled system, and asynchronous messaging seems better suited for this approach than RPC. Moreover, we rarely need to receive replies to our requests immediately.

RabbitMQ seems well suited for many use cases, but implementing a job routing mechanism between heterogeneous workers would be complicated -- we would probably have to create a separate load balancing service, which cancels the advantage of the message broker already being provided by the framework. It is also written in Erlang, which nobody from our team understands.

ZeroMQ is the best option for us, even with the drawback of having to implement a load balancer ourselves (which could also be seen as a benefit, and there is a notable chance we would have to do the same with RabbitMQ). It also gives us complete control over the transmitted messages and communication patterns. However, any of the three options would have been possible to use.

### File Transfers

There has to be a way to access the files stored on the fileserver (and also to upload them) from both the worker and the frontend server machines. The protocol used for this should handle large files efficiently and be resilient to network failures. Security features are not a primary concern, because all communication with the fileserver will happen on an internal network. However, a basic form of authentication can be useful to ensure a correct configuration (if a development fileserver uses different credentials than the production one, production workers will not be able to use it by accident). Lastly, the protocol must have client libraries for the platforms (languages) used in the backend. Some of the possible options are:

- HTTP(S) -- a de-facto standard for web communication that has far more features than just file transfers. Thanks to being used on the web, a large effort has been put into the development of its servers. It supports authentication and it can handle short-term network failures (thanks to being built on TCP and supporting the resuming of interrupted transfers). We will use HTTP for communication with clients, so there is no added cost in maintaining a server. HTTP requests can be made using libcurl.
- FTP -- an old protocol designed only for transferring files. It has all the required features, but does not offer anything over HTTP. It is also supported by libcurl.
- SFTP -- a file transfer protocol most frequently used as a subsystem of SSH protocol implementations. It doesn't provide authentication, but it supports working with large files and resuming failed transfers. The libcurl library supports SFTP.
- A network-shared file system (such as NFS) -- an obvious advantage of a network-shared file system is that applications can work with remote files the same way they would with local files. However, it brings an overhead for the administrator, who has to configure access to this filesystem for every machine that needs to access the storage.
- A custom protocol over ZeroMQ -- it is possible to design a custom file transfer protocol that uses ZeroMQ for sending data, but it is not a trivial task -- we would have to find a way to transfer large files efficiently, implement an acknowledgement scheme and support the resuming of transfers. Using ZeroMQ as the underlying layer does not help a lot with this. The sole advantage is that the backend components would not need another library for communication.

We chose HTTPS because it is widely used and client libraries exist in all relevant environments. In addition, it is highly probable we will have to run an HTTP server anyway, because ReCodEx is intended to have a web frontend.
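For illustration, fetching one supplementary file from the fileserver over HTTP(S) with libcurl could look roughly like this (a minimal sketch -- the URL, the credentials and the error handling are illustrative):

```cpp
#include <cstdio>
#include <curl/curl.h>

// Downloads a single file from the fileserver; returns true on success.
// Minimal sketch only -- production code needs more robust error handling
// and a single global call to curl_global_init() at startup.
bool fetch_file(const char *url, const char *path) {
    CURL *curl = curl_easy_init();
    if (!curl) return false;

    FILE *out = std::fopen(path, "wb");
    if (!out) { curl_easy_cleanup(curl); return false; }

    curl_easy_setopt(curl, CURLOPT_URL, url);
    // Basic authentication guards against a misconfigured environment.
    curl_easy_setopt(curl, CURLOPT_USERPWD, "worker:secret");  // illustrative
    // libcurl's default write callback writes the body into this FILE*.
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);
    curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1L);

    CURLcode res = curl_easy_perform(curl);
    std::fclose(out);
    curl_easy_cleanup(curl);
    return res == CURLE_OK;
}
```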
### Frontend - Backend Communication

Our choices when considering how clients will communicate with the backend have to stem from the fact that ReCodEx should primarily be a web application. This rules out ZeroMQ -- while it is very useful for asynchronous communication between the backend components, it is practically impossible to use it from a web browser. There are several other options:

- *WebSockets* -- The WebSocket standard is built on top of TCP. It enables a web browser to connect to a server over a TCP socket. WebSockets are implemented in recent versions of all modern web browsers and there are libraries for several programming languages like Python or JavaScript (running in Node.js). Encryption of the communication over a WebSocket is supported as a standard.
- *HTTP protocol* -- The HTTP protocol is a stateless protocol implemented on top of the TCP protocol. The communication between the client and the server consists of requests sent by the client and responses to these requests sent back by the server. The client can send as many requests as needed and it may ignore the responses from the server, but the server must respond only to the requests of the client and it cannot initiate communication on its own. End-to-end encryption can be achieved easily using SSL (HTTPS).

We chose the HTTP(S) protocol because of its simple implementation in all sorts of operating systems and runtime environments on both the client and the server side.

The API of the server should expose basic CRUD (Create, Read, Update, Delete) operations. There are some options on what kind of messages to send over HTTP:

- SOAP -- a protocol for exchanging XML messages. It is very robust and complex.
- REST -- a stateless architecture style, not a protocol or a technology. It relies on HTTP (but not necessarily) and its method verbs (e.g., GET, POST, PUT, DELETE). It can fully implement the CRUD operations.

Even though there are some other technologies, we chose the REST style over the HTTP protocol. It is widely used, there are many tools available for development and testing, and it is well understood by programmers, so it should be easy for a new developer with some experience in client-side applications to get to know the ReCodEx API and to develop a client application.
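As an illustration of what this means in practice, the CRUD operations on exercises could map to the HTTP method verbs roughly as follows (the paths are illustrative, not the exact ReCodEx API):

```
GET    /exercises        list the exercises
POST   /exercises        create a new exercise
GET    /exercises/{id}   read the details of one exercise
PUT    /exercises/{id}   update an exercise
DELETE /exercises/{id}   delete an exercise
```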
A high level view of the chosen communication protocols in ReCodEx can be seen in the following image. Red arrows mark connections through ZeroMQ sockets, blue arrows mark WebSocket communication and green arrows connect nodes that communicate through HTTP(S).

![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)

### Job Configuration File

As discussed previously in 'Evaluation Unit Executed by ReCodEx', an evaluation unit has the form of a job which contains small tasks, each representing one piece of work executed by the worker. This implies that the jobs have to be passed from the frontend to the backend. The best option for this is some kind of configuration file which describes the job details. The configuration file is specified in the frontend, and in the backend (namely the worker) it is parsed and executed.

There are many formats which can be used to represent the configuration. The considered ones are:

- *XML* -- a broadly used general markup language which comes with document type definitions (DTD) that can express and check the structure of an XML file, so it does not have to be checked within the application. But XML with its tags can sometimes be quite 'chatty' and extensive, which is not desirable. And overall, XML with all its features and properties can be a bit heavy-weight.
- *JSON* -- a notation which was developed to represent JavaScript objects. As such, it is quite simple; only key-value structures, arrays and primitive values can be expressed in it. The structure and hierarchy of the data is represented by braces and brackets.
- *INI* -- a very simple configuration format which is only able to represent key-value structures grouped into sections. This is not enough to represent a job and its task hierarchy.
- *YAML* -- a format whose capabilities are very similar to JSON, with a small difference in how the structure and hierarchy of the configuration are expressed -- not with braces, but with indentation. This makes YAML easily readable by both humans and machines.
A high-level view of the chosen communication protocols in ReCodEx can be seen
in the following image. Red arrows mark connections through ZeroMQ sockets,
blue arrows mark WebSocket communication and green arrows connect nodes that
communicate through HTTP(S).

![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)

### Job Configuration File

As discussed previously in 'Evaluation Unit Executed by ReCodEx', an evaluation
unit has the form of a job which contains small tasks, each representing one
piece of work executed by the worker. This implies that jobs have to be passed
from the frontend to the backend. The best option for this is some kind of
configuration file which describes the job. The configuration file is composed
in the frontend and parsed and executed in the backend, namely by the worker.

There are many formats which can be used to represent the configuration. The
considered ones are:

- *XML* -- a broadly used general markup language which can be accompanied by a
  document type definition (DTD) expressing and checking the structure of an
  XML file, so the structure does not have to be checked within the
  application. However, XML with its tags can be quite 'chatty' and verbose,
  which is not desirable. And overall, XML with all its features and properties
  can be a bit heavyweight.
- *JSON* -- a notation which was developed to represent JavaScript objects. As
  such it is quite simple; it can only express key-value structures, arrays and
  primitive values. The structure and hierarchy of the data is expressed with
  braces and brackets.
- *INI* -- a very simple configuration format which is only able to represent
  key-value structures grouped into sections. This is not enough to represent a
  job and its hierarchy of tasks.
- *YAML* -- a format very similar to JSON in its capabilities, with the small
  difference that the structure and hierarchy of the configuration is expressed
  through indentation rather than braces. This makes YAML easily readable by
  both humans and machines.
- *a specific format* -- a newly created format used just for the job
  configuration. An obvious drawback is the non-existence of parsers, which
  would have to be written from scratch.

Given the previous list of different formats we decided to use YAML. Parsers
exist for most programming languages and it is easy enough to learn and
understand. JSON was the other sensible choice, but in the end YAML seemed
better.

The job configuration, including design and implementation notes, is described
in the 'Job configuration' appendix.

#### Task Types

From the low-level point of view there are only two types of tasks in a job.
The first kind are tasks performing some internal operation which should work
the same way on all platforms and operating systems. The second kind are
external tasks which execute an external binary.

Internal tasks should handle at least these operations:

- *fetch* -- fetch a single file from the fileserver
- *copy* -- copy a file between directories
- *remove* -- remove a single file or a folder
- *extract* -- extract files from a downloaded archive

These internal operations are essential, but many more can eventually be
implemented.

External tasks executing an external binary could in principle run outside of a
sandbox, but for security's sake there is no reason to allow that. So all
external tasks are executed within a general, configurable sandbox. The
configuration options for sandboxes are called limits; they can specify, for
example, time or memory limits.

#### Configuration File Content

The content of the configuration file can be divided into two parts; the first
concerns the job in general and its metadata, the second one relates to the
tasks and their specification.

There is not much to express in the general job metadata. There can be an
identification of the job and some general options, like enabling/disabling
logging. The one really necessary item is the address of the fileserver from
which supplementary files are downloaded. This option is crucial because there
can be multiple fileservers and the worker has no other way of figuring out
where the files might be.

A more interesting situation arises with the metadata of tasks. From the
initial analysis of the evaluation unit and its structure, at least these
generally needed items were derived:

- *task identification* -- an identifier used at least for specifying
  dependencies
- *type* -- as described before, one of: 'initiation', 'execution',
  'evaluation' or 'inner'
- *priority* -- priority can additionally control the execution flow in the
  task graph
- *dependencies* -- a necessary item for constructing the hierarchy of tasks
  into a DAG
- *execution command* -- the command which should be executed within this task,
  with possible parameters

The previous list of items is applicable to both internal and external tasks.
Internal tasks do not need any further items, but external ones do. The
additional items are exclusively related to sandboxing and limitation (an
illustrative configuration sketch follows the list):

- *sandbox name* -- there should be a possibility to have multiple sandboxes,
  so an identification of the right one is needed
- *limits* -- hardware and software resource limitations
    - *time limit* -- limits the execution time
    - *memory limit* -- the maximum amount of memory which can be consumed by
      the external program
    - *I/O operations* -- a limitation concerning disk operations
    - *restrict filesystem* -- restricts or enables access to directories
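To give a concrete impression of the format, here is an illustrative sketch of
how the items above could be laid out in YAML. The field names and values are
made up for this example -- the authoritative schema is given in the 'Job
configuration' appendix:

```yaml
submission:                      # general job metadata
    job-id: example_job_128
    file-collector: http://fileserver:9999   # where supplementary files live
tasks:
    - task-id: fetch_input                   # an internal task
      type: inner
      cmd:
          bin: fetch
          args: ["f1e2d3c4", "input.txt"]
    - task-id: run_solution                  # an external, sandboxed task
      type: execution
      priority: 2
      dependencies: [fetch_input]
      cmd:
          bin: ./solution
      sandbox:
          name: isolate
          limits:
              - time: 2.0       # seconds
                memory: 65536   # memory limit (units per the real schema)
```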
#### Supplementary Files

An interesting problem arises with supplementary files (e.g., inputs, sample
outputs). There are two main approaches. Supplementary files can be downloaded
either at the start of the execution or on the fly during the execution.

If the files are downloaded at the beginning, the execution has not really
started at that point, so if there are network problems, the worker will notice
right away and can abort the execution without running a single task. A slight
problem arises if some of the files need to have a specific name (e.g., the
solution assumes that the input is named `input.txt`). In this scenario the
downloaded files cannot be renamed at the beginning, only during the execution,
which is impractical and not easily observed by the authors of job
configurations.

The second solution, where files are downloaded on the fly, has quite the
opposite problem. If there are network problems, the worker may discover them
only during the execution, for instance when almost the whole execution is
done. This is also not an ideal solution if we care about wasted hardware
resources. On the other hand, with this approach users have fine-grained
control over the execution flow and know exactly which files are available
during the execution, which is probably more appealing from the user
perspective than the first solution. Based on that, downloading supplementary
files with 'fetch' tasks during the execution was chosen and implemented.

#### Job Variables

Considering the fact that jobs can be executed by workers on different machines
with specific settings, it can be handy to have some kind of mechanism in the
job configuration which hides these particular worker details, most notably the
specific directory structure. For this purpose, marks or signs can be used,
taking the form of broadly used variables.

Variables in general can be used everywhere configuration values (not keys) are
expected. This implies that substitution should be done after parsing the job
configuration, not before. The only usage of variables considered so far is for
directories within the worker, but in the future this might be subject to
change.

The final form of variables is `${...}` where the triple dot is a textual
description. This format was chosen because of the special dollar sign
character, which is unlikely to appear in ordinary filesystem paths. The braces
are there only to delimit the textual description of the variable.
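A minimal sketch of such a substitution, written in Python for illustration
(the worker itself is not implemented in Python), might look as follows; the
variable names and values are made up:

```python
import re

WORKER_VARS = {
    "WORKER_ID": "1",
    "JOB_ID": "job_128",
    "SOURCE_DIR": "/var/recodex/eval/1/job_128",  # illustrative values
}

def substitute(value, variables=WORKER_VARS):
    """Replace every ${NAME} occurrence in a configuration *value*.

    Substitution runs after the YAML is parsed, and only on values --
    keys are left untouched, as discussed above. Unknown variables are
    kept as-is so that a typo is easy to spot in the logs.
    """
    return re.sub(r"\$\{([^}]+)\}",
                  lambda m: variables.get(m.group(1), m.group(0)),
                  value)

print(substitute("${SOURCE_DIR}/input.txt"))
# prints: /var/recodex/eval/1/job_128/input.txt
```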
### Broker

The broker is responsible for keeping track of available workers and
distributing jobs that it receives from the frontend between them.

#### Worker Management

It is intended for the broker to be a fixed part of the backend infrastructure
to which workers connect at will. Thanks to this design, workers can be added
and removed when necessary (and possibly in an automated fashion), without
changing the configuration of the broker. An alternative solution would be
configuring a list of workers before startup, thus making them passive in the
communication (in the sense that they just wait for incoming jobs instead of
connecting to the broker). However, this approach comes with a notable
administration overhead -- in addition to starting a worker, the administrator
would have to update the worker list.

Worker management must also take into account the possibility of worker
disconnection, either because of a network or software failure (or
termination). A common way to detect such events in distributed systems is to
periodically send short messages to other nodes and expect a response. When
these messages stop arriving, we presume that the other node encountered a
failure. Both the broker and workers can be made responsible for initiating
these exchanges, and there seem to be no differences stemming from this choice.
We decided that the workers will be the active party that initiates the
exchange.

#### Scheduling

Jobs should be scheduled in a way that ensures that they will be processed
without unnecessary waiting. This depends on the fairness of the scheduling
algorithm (no worker machine should be overloaded).

The design of such a scheduling algorithm is complicated by the requirements on
the diversity of workers -- they can differ in operating systems, available
software, computing power and many other aspects.

We decided to keep the details of connected workers hidden from the frontend,
which should lead to a better separation of responsibilities and flexibility.
Therefore, the frontend needs a way of communicating its requirements on the
machine that processes a job without knowing anything about the available
workers. A key-value structure is suitable for representing such requirements.

With respect to these constraints, and because the analysis and design of a
more sophisticated solution was declared out of scope of our project
assignment, a rather simple scheduling algorithm was chosen. The broker shall
maintain a queue of available workers. When assigning a job, it traverses this
queue and chooses the first machine that matches the requirements of the job.
This machine is then moved to the end of the queue.

The presented algorithm results in a simple round-robin load balancing
strategy, which should be sufficient for small-scale deployments (such as a
single university). However, with a large number of jobs, some workers could
easily become overloaded. The implementation must allow for a simple
replacement of the load balancing strategy so that this problem can be solved
in the near future.
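The following sketch illustrates the described first-fit round-robin strategy.
It is written in Python for illustration only (the actual broker is not
implemented in Python) and all names are made up:

```python
from collections import deque

class WorkerQueue:
    """First-fit, round-robin assignment of jobs to workers.

    A worker advertises its properties as a key-value dictionary (headers);
    a job carries a dictionary of requirements that must all be satisfied.
    """
    def __init__(self):
        self.workers = deque()  # (identity, headers) pairs

    def add_worker(self, identity, headers):
        self.workers.append((identity, headers))

    def assign(self, requirements):
        for i, (identity, headers) in enumerate(self.workers):
            if all(headers.get(k) == v for k, v in requirements.items()):
                # First match wins and is moved to the end of the queue,
                # which yields the round-robin behaviour described above.
                del self.workers[i]
                self.workers.append((identity, headers))
                return identity
        return None  # no suitable worker; the job must wait

queue = WorkerQueue()
queue.add_worker("worker1", {"hwgroup": "group1", "env": "c"})
print(queue.assign({"hwgroup": "group1"}))  # prints: worker1
```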
#### Forwarding Jobs

Information about a job can be divided into two disjoint parts -- what the
worker needs to know to process it and what the broker needs to forward it to
the correct worker. It remains to be decided how this information will be
transferred to its destination.

It is technically possible to transfer all the data required by the worker at
once through the broker. This package could contain submitted files, test data,
requirements on the worker, etc. A drawback of this solution is that both
submitted files and test data can be rather large. Furthermore, it is likely
that test data would be transferred many times.

Because of these facts, we decided to store the data required by the worker in
a shared storage space and only send a link to this data through the broker.
This approach leads to a more efficient network and resource utilization (the
broker does not have to process data that it does not need), but also makes the
job submission flow more complicated.

#### Further Requirements

The broker can be viewed as a central point of the backend. While it has only
two primary, closely related responsibilities, other requirements have arisen
(forwarding messages about job evaluation progress back to the frontend) and
will arise in the future. To facilitate such requirements, its architecture
should allow new communication flows to be added simply. It should also be as
asynchronous as possible to enable efficient communication with external
services, for example via HTTP.

### Worker

The worker is the component which executes jobs incoming from the broker. As
such, the worker should run on and support a wide range of different
infrastructures and possibly platforms/operating systems. Support of at least
the two main operating systems is desirable and should be implemented.

The worker as a service does not have to be very complicated, but a bit of
complex behaviour is needed. The mentioned complexity is almost exclusively
concerned with robust communication with the broker, which has to be checked
regularly. A ping mechanism is usually used for this in all kinds of projects.
This means that the worker should be able to send ping messages even during
execution. Therefore, the worker has to be divided into two separate parts, one
which handles the communication with the broker and another which executes
jobs.

The easiest solution is to have these parts in separate threads which
communicate tightly with each other. Numerous technologies can be used for this
kind of communication, from shared memory to condition variables or some kind
of in-process messages. The ZeroMQ library which we already use provides
in-process messages that work on the same principles as network communication,
which is convenient and solves problems with thread synchronization.
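The following Python sketch illustrates the in-process messaging pattern using
the `pyzmq` binding -- the real worker uses the C++ ZeroMQ API, so this is only
a demonstration of the principle, with made-up message contents:

```python
import threading
import zmq

context = zmq.Context()  # inproc sockets must share one context

def execution_thread():
    pipe = context.socket(zmq.PAIR)
    pipe.connect("inproc://jobs")
    job = pipe.recv_json()     # blocks until the listening part hands over a job
    # ... download, extract and evaluate the job here ...
    pipe.send_json({"job_id": job["job_id"], "status": "OK"})

listener = context.socket(zmq.PAIR)
listener.bind("inproc://jobs")
threading.Thread(target=execution_thread).start()

listener.send_json({"job_id": "job_128"})  # forward an incoming job
print(listener.recv_json())                # collect the result when it is done
```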
#### Execution of Jobs

At this point we have a worker with two internal parts, a listening one and an
executing one. The implementation of the first one is quite straightforward and
clear, so let us discuss what should happen in the execution subsystem.

After a job successfully arrives from the broker to the listening thread, it is
immediately redirected to the execution thread. There, the worker has to
prepare a new execution environment, and the solution archive has to be
downloaded from the fileserver and extracted. The job configuration is located
within these files, loaded into internal structures and executed. After that,
the results are uploaded back to the fileserver. These basic steps are really
necessary for the whole execution and have to be performed in this precise
order.

The evaluation unit executed by ReCodEx and the job configuration were already
discussed above. The conclusion was that jobs consisting of small tasks will be
used. The particular format of the actual job configuration can be found in the
'Job configuration' appendix. The implementation of parsing and storing these
data in the worker is then quite straightforward.

The worker has internal structures into which it loads and stores the metadata
given in the configuration. The whole job is mapped to a job metadata structure
and tasks are mapped to either external or internal ones (internal commands
have to be defined within the worker); the two kinds differ in whether they are
executed in a sandbox or as internal worker commands.

#### Task Execution Failure

Another division of tasks is by the task-type field in the configuration. This
field can have four values: initiation, execution, evaluation and inner. All of
them were discussed and described above in the evaluation unit analysis. What
is important to the worker is how to behave if the execution of a task with
some particular type fails.

There are two possible situations: the execution fails due to a bad user
solution or due to some internal error. If the execution fails on an internal
error, the solution cannot be simply declared as failed. The user should not be
punished for a bad configuration or some network error. This is where the task
types are useful.

Initiation, execution and evaluation tasks usually execute code which was given
by the user who submitted the solution of an exercise. If these kinds of tasks
fail, it is probably connected with a bad user solution and the submission can
be evaluated as such.

But if some inner task fails, the solution should be re-executed, in the best
case scenario on a different worker. That is why when an inner task fails, the
job is sent back to the broker, which will reassign it to another worker. More
on this subject can be found in the section on the broker's assigning
algorithms.

#### Job Working Directories

There is also the question of the working directory or directories of a job --
which directories should be used and what for. A simple answer is that every
job will have only one specified directory which will contain every file the
worker works with in the scope of the whole job execution. This solution is
easy, but fails for logical and security reasons.

The very least which must be done is to use two folders, one for internal
temporary files and a second one for the evaluation. The directory for
temporary files is enough to cover all kinds of internal work with the
filesystem, but only one directory for the whole evaluation is somehow not
enough.

The solution which was chosen in the end is to have separate folders for the
downloaded archive, the decompressed solution, the evaluation directory in
which the user solution is executed, and then folders for temporary files and
for the results and generally the files which should be uploaded back to the
fileserver with the solution results.

There also has to be a hierarchy which separates folders of different workers
on the same machine. That is why paths to the directories have the format:
`${DEFAULT}/${FOLDER}/${WORKER_ID}/${JOB_ID}` where default means the default
working directory of the whole worker and folder is a particular directory for
some purpose (archives, evaluation, ...).

The described division of job directories proved to be flexible and detailed
enough; everything is in logical units and where it is supposed to be, which
means that searching through this system should be easy. In addition, if the
solutions of users have access only to the evaluation directory, then they do
not have access to unnecessary files, which is better for the overall security
of ReCodEx.
### Sandboxing

There are numerous ways to approach sandboxing on different platforms, and
describing all possible approaches is out of the scope of this document.
Instead, let us have a look at some of the features which are certainly needed
for ReCodEx and propose some particular sandbox implementations on Linux and
Windows.

The general purpose of a sandbox is to safely execute software in any form,
from scripts to binaries. Various sandboxes differ in how safe they are and in
what limiting features they have. The ideal situation is a sandbox with
numerous options and corresponding features which allow administrators to set
up the environment as they like and which do not allow user programs to damage
the executing machine in any way possible.

For ReCodEx and its evaluation there is a need for at least these features:
execution time and memory limits, a disk operations limit, disk accessibility
restrictions and network restrictions. All these features, if combined and
implemented well, give a reasonably safe sandbox which can be used for all
kinds of user solutions and should be able to restrict and stop any standard
way of attack or error.

#### Linux

Linux systems have quite extensive support for sandboxing in the kernel; kernel
namespaces and cgroups were introduced and implemented, which combined can
limit hardware resources (CPU, memory) and separate the executing program into
its own namespace (PID, network). These two features fulfil the sandbox
requirements of ReCodEx, so there were two options: either find an existing
solution or implement a new one. Luckily, an existing solution was found and
its name is **isolate**. Isolate does not use all the available kernel
features, but only a subset which is still sufficient for ReCodEx.

#### Windows

The opposite situation exists in the Windows world; there is limited support in
its kernel, which makes sandboxing a bit trickier. The Windows kernel only
provides ways to restrict the privileges of a process through restriction of
its internal access tokens. Monitoring of hardware resources is not possible
directly, but the resources used by a process can be obtained through newly
created job objects.

There are numerous sandboxes for Windows, but they all focus on different
things; in a lot of cases they serve as safe environments for running malicious
programs, viruses in particular, or they are designed as a separate filesystem
namespace for installing many temporarily used programs. Of all of these we can
mention: Sandboxie, Comodo Internet Security, Cuckoo sandbox and many others.
None of them is a fitting sandbox solution for ReCodEx. With this being said,
we can safely state that designing and implementing a new general sandbox for
Windows is out of the scope of this project.

However, designing a sandbox only for a specific environment is possible,
namely for C# and .NET. The CLR as a virtual machine and runtime environment
has pretty good security support for restriction and separation, which is also
carried over to C#. This makes it quite easy to implement a simple sandbox
within C#, but there are no well-known general purpose implementations.

As mentioned in the previous paragraphs, implementing our own solution is out
of the scope of this project. But a C# sandbox is quite a good topic for
another project, for example a term project for a C# course, so it might be
written and integrated in the future.
### Fileserver

The fileserver provides access over HTTP to a shared storage space that
contains files submitted by students, supplementary files such as test inputs
and outputs, and results of evaluation. In other words, it acts as an
intermediate storage node for data passed between the frontend and the backend.
This functionality can be easily separated from the rest of the backend
features, which led to designing the fileserver as a standalone component. Such
a design helps encapsulate the details of how the files are stored (e.g. on a
file system, in a database or using a cloud storage service), while also making
it possible to share the storage between multiple ReCodEx frontends.

For early releases of the system, we chose to store all files on the file
system -- it is the least complicated solution (in terms of implementation
complexity) and the storage backend can be rather easily migrated to a
different technology.

One of the facts we learned from CodEx is that many exercises share test input
and output files, and also that these files can be rather large (hundreds of
megabytes). A direct consequence of this is that we cannot add these files to
submission archives that are to be downloaded by workers -- the combined size
of the archives would quickly exceed gigabytes, which is impractical. Another
conclusion we made is that a way to deal with duplicate files must be
introduced.

A simple solution to this problem is storing supplementary files under the
hashes of their content. This ensures that every file is stored only once. On
the other hand, it makes it more difficult to understand what the content of a
file is at a glance, which might prove problematic for the administrator.
However, human-readable identification is not as important as removing
duplicates -- administrators rarely need to inspect stored files (and when they
do, they should know their hashes), but duplicate files occupied a large part
of the disk space used by CodEx.

A notable part of the work of the fileserver is done by a web server (e.g.
listening to HTTP requests and caching recently accessed files in memory for
faster access). What remains to be implemented is handling requests that upload
files -- student submissions should be stored in archives to facilitate simple
downloading and supplementary exercise files need to be stored under their
hashes.

We decided to use Python and the Flask web framework. This combination makes it
possible to express the logic in ~100 SLOC and also provides means to run the
fileserver as a standalone service (without a web server), which is useful for
development.
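A sketch of the hash-based storage of supplementary files could look like the
following Python fragment. The storage path is made up, and a production
version would hash the file in chunks instead of reading it into memory at
once:

```python
import hashlib
import os
import shutil

STORAGE_ROOT = "/var/recodex/fileserver"  # illustrative path

def store_supplementary_file(uploaded, storage_root=STORAGE_ROOT):
    """Store an uploaded file-like object under the hash of its content.

    Identical files map to the same name, so each one is kept exactly once.
    Returns the hash, which is then used to reference the file.
    """
    digest = hashlib.sha1(uploaded.read()).hexdigest()
    path = os.path.join(storage_root, digest)
    if not os.path.exists(path):  # already stored? then this is a duplicate
        uploaded.seek(0)
        with open(path, "wb") as target:
            shutil.copyfileobj(uploaded, target)
    return digest
```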
### Cleaner

The worker can use a caching mechanism for files from the fileserver under one
condition: the provided files have to have unique names. This means there has
to be a system which can download a file, store it in a cache and delete it
after some time of inactivity. Because there can be multiple worker instances
on a particular server, it is not efficient for every worker to implement this
system on its own, so it is feasible to share this functionality among all
workers on the same machine.

One solution would be yet another service connected to the workers through the
network which would provide such functionality, but this would mean a component
with another communication channel for a purpose where it is not really needed.
Mainly, it would be a single point of failure -- if it stopped working, it
would be quite a problem.

So another solution was chosen, which assumes that the worker has access to a
specified cache folder. The worker can download supplementary files into this
folder and copy them from it. This means every worker has the ability to
maintain downloads to the cache, but what the worker cannot do properly is the
deletion of unused files after some time.

#### Architecture

For that functionality a single-purpose component called 'cleaner' is
introduced. It is a simple script executed within cron which deletes files that
have been unused for some time. Together with the fetching feature of the
worker, the cleaner completes the server-specific caching system.

The cleaner, as mentioned, is a simple script which is executed regularly as a
cron job. Given the caching system introduced in the paragraphs above, there
are only a few ways the cleaner can be implemented.

On various filesystems there is usually support for two particular timestamps,
`last access time` and `last modification time`. Files in the cache are
downloaded once and then just copied, which means that the last modification
time is set only once on the creation of the file, while the last access time
should be updated on every copy. From this we can conclude that the last access
time is what is needed here.

But unlike the last modification time, the last access time is usually not
enabled on conventional filesystems (more on this subject can be found
[here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime)).
So if we choose to use the last access time, it has to be enabled on the
filesystem used for the cache folder. The last access time approach was chosen
for the implementation in ReCodEx, but this might change in future releases.

However, there is another way: the last modification time, which is broadly
supported, can be used instead. But this solution is not automatic and the
worker would have to 'touch' the cache files whenever they are accessed. This
solution might even be a bit better than the one with the last access time and
might be implemented in future releases.

#### Caching Flow

With the cleaner as a separate component and the caching itself handled in the
worker, the mechanism is kind of blurry and it is not immediately obvious that
it works without problems. The goal is to have a system which can recover from
every kind of error.

A description of one possible implementation follows. The whole mechanism
relies on the ability of the worker to recover from an internal failure of a
fetch task. In case of an error, the job will be reassigned to another worker
where the problem hopefully will not arise.

Let us start with the worker implementation:

- the worker discovers a fetch task which should download a supplementary file
- the worker takes the name of the file and tries to copy it from the cache
  folder to its working folder
    - if this is successful, the last access time is updated (by the filesystem
      itself) and the whole operation is done
    - if it is not successful, the file has to be downloaded
        - the file is downloaded from the fileserver to the working folder and
          then copied to the cache

The previous steps take place only within the worker; the cleaner can intervene
at any time and delete files. The implementation of the cleaner follows:

- on startup, the cleaner stores the current timestamp, which will be used as a
  reference for comparison, and loads the configuration values of the cache
  folder and the maximal file age
- a loop goes through all files and even directories in the specified cache
  folder
    - if the difference between the reference timestamp and the last access
      time is greater than the specified maximal file age, then the file or
      folder is deleted

The previous description implies that there is a gap between the detection of
the last access time and the deletion of the file by the cleaner. In this gap a
worker may access a file which is deleted anyway; this is fine as long as the
worker has already copied it. If the worker has not copied the whole file, or
has not even started to copy it, and the file is deleted, the copy operation
will fail. This causes an internal task failure which is handled by reassigning
the job to another worker.
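A minimal sketch of such a cleaner script follows. The configuration values are
illustrative and would normally be loaded from a configuration file, and the
cache filesystem must have access times enabled for this to work:

```python
#!/usr/bin/env python3
"""A minimal cleaner, meant to be run periodically as a cron job."""
import os
import shutil
import time

CACHE_DIR = "/var/recodex/cache"   # illustrative values; normally these
MAX_AGE = 24 * 60 * 60             # come from a configuration file (seconds)

def clean():
    reference = time.time()        # one reference timestamp for the whole run
    for name in os.listdir(CACHE_DIR):
        path = os.path.join(CACHE_DIR, name)
        # delete anything that has not been accessed for too long
        if reference - os.stat(path).st_atime > MAX_AGE:
            if os.path.isdir(path):
                shutil.rmtree(path)
            else:
                os.remove(path)

if __name__ == "__main__":
    clean()
```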
Another problem can occur when two workers download the same file at once, but
this is also not an issue; the file is first downloaded to the working folder
and only after that copied to the cache.

And even if something else unexpectedly fails and a fetch task therefore fails
during the execution, even that should be fine, as mentioned previously.
Reassigning the job should be the last salvation in case everything else goes
wrong.

### Monitor

Users want to watch the real-time evaluation progress of their solutions. This
can be easily done with an established double-sided connection stream, but it
is hard to achieve with plain HTTP. HTTP itself works on a separate request
basis with no long-term connection. The HTML5 specification contains
Server-Sent Events -- a means of sending text messages unidirectionally from an
HTTP server to a subscribed website. Sadly, it is not supported in Internet
Explorer and Edge.

However, there is another widely used technology that can solve this problem --
the WebSocket protocol. It is more general than necessary (it enables
bidirectional communication) and requires additional web server configuration,
but it is supported in recent versions of all major web browsers.

Working with the WebSocket protocol from the backend is possible, but not ideal
from the design point of view. The backend should be hidden from the public
internet to minimize the surface for possible attacks. With this in mind, there
are two possible options:

- send progress messages through the API
- make a separate component that forwards progress messages to clients

Both of the two possibilities have their benefits and drawbacks. The first one
requires no additional component and the API is already publicly visible. On
the other hand, working with WebSockets from PHP is complicated (although
possible with the help of third-party libraries) and embedding this
functionality into the API is not extensible. The second approach is better
suited for future changes of the protocol or for implementing extensions like
caching of messages. Also, the progress feature is considered optional, because
there may be clients for which this feature is useless. The major drawback of a
separate component is that it is another part which needs to be publicly
exposed.

We decided to make a separate component, mainly because it is a small component
with only one role, it is better maintainable, and the demand for the progress
callback is only optional.

There are several possibilities how to write the component. Notably, the
considered options were the languages already used in the project -- C++, PHP,
JavaScript and Python. In the end, the Python language was chosen for its
simplicity, its great support for all the used technologies, and the free
capacity of the Python developers in our team.
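To illustrate the forwarding role of the monitor, here is a rough Python sketch
using the `websockets` and `pyzmq` packages. The addresses, the message format
and the handler signature (which differs between versions of the `websockets`
package) are assumptions for the sake of the example, not the actual
implementation:

```python
import asyncio
import json
import websockets  # third-party package; recent versions assumed
import zmq
import zmq.asyncio

context = zmq.asyncio.Context()

async def handler(websocket):
    """Forward progress messages of one evaluation to one subscribed client.

    The client opens a WebSocket and sends the id of the channel it wants
    to watch; messages received over ZeroMQ are then relayed to it until
    the evaluation finishes (termination message format is made up here).
    """
    channel = await websocket.recv()
    socket = context.socket(zmq.SUB)
    socket.connect("tcp://broker:5555")           # illustrative address
    socket.setsockopt_string(zmq.SUBSCRIBE, channel)
    while True:
        topic, payload = await socket.recv_multipart()
        await websocket.send(payload.decode())
        if json.loads(payload).get("command") == "FINISHED":
            break

async def main():
    async with websockets.serve(handler, "0.0.0.0", 4567):
        await asyncio.Future()  # run forever

asyncio.run(main())
```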
### API Server

The API server must handle HTTP requests and manage the state of the
application in some kind of a database. The API server will be a RESTful
service and will return data encoded as JSON documents. It must also be able to
communicate with the backend over ZeroMQ.

We considered several technologies which could be used:

- PHP + Apache -- one of the most widely used technologies for creating web
  servers. It is a suitable technology for this kind of a project. It has all
  the features we need when some additional extensions are installed (to
  support LDAP or ZeroMQ).
- Ruby on Rails, Python (Django), etc. -- popular web technologies that
  appeared in the last decade. Both support ZeroMQ and LDAP via extensions and
  have large developer communities.
- ASP.NET (C#), JSP (Java) -- these technologies are very robust and are used
  to create server applications in many big enterprises. Both can run on
  Windows and Linux servers (ASP.NET using the .NET Core).
- JavaScript (Node.js) -- it is a quite new technology and it has lately been
  used to create REST APIs. Applications running on Node.js are quite
  performant and the number of open-source libraries available on the Internet
  is huge.

We chose PHP and Apache mainly because we were familiar with these technologies
and we were able to develop all the features we needed without learning to use
a new technology, which mattered since the number of features was quite high
and we needed to meet a strict deadline. This does not mean that we would find
all the other technologies superior to PHP in all other aspects -- PHP 7 is a
mature language with a huge community and a wide range of tools, libraries, and
frameworks.

We decided to use an ORM framework to manage the database, namely the widely
used PHP ORM Doctrine 2. Using an ORM tool means we do not have to write SQL
queries by hand. Instead, we work with persistent objects, which provides a
higher level of abstraction. Doctrine also has a robust database abstraction
layer, so the actual database engine is not very important and it can be
changed without any need for changing the code. MariaDB was chosen as the
storage backend.

To speed up the development process of the PHP server application we decided to
use a web framework. After evaluating and trying several frameworks, such as
Lumen, Laravel, and Symfony, we ended up using Nette.

- **Lumen** and **Laravel** seemed promising but the default ORM framework
  Eloquent is an implementation of ActiveRecord, which we wanted to avoid. It
  was also surprisingly complicated to implement custom middleware for the
  validation of access tokens in the headers of incoming HTTP requests.
- **Symfony** is a very good framework and has Doctrine "built-in". The reason
  why we did not use Symfony in the end was our lack of experience with this
  framework.
- The **Nette framework** is very popular in the Czech Republic -- its lead
  developer is the well-known Czech programmer David Grudl. We were already
  familiar with the patterns used in this framework, such as dependency
  injection, authentication, and routing. These concepts are useful even when
  developing a REST application, which might be a surprise considering that
  Nette focuses on "traditional" web applications. Nette is inspired by Symfony
  and many of the Symfony bundles are available as components or extensions for
  Nette. There is, for example, a Nette extension which makes the integration
  of Doctrine 2 very straightforward.

#### Architecture of The System

The Nette framework is an MVP (Model, View, Presenter) framework. It has many
tools for creating complex websites, but we need only a subset of them, or we
use different libraries which suit our purposes better:

- **Model** -- the model layer is implemented using the Doctrine 2 ORM instead
  of Nette Database
- **View** -- the whole view layer of the Nette framework (e.g., the Latte
  engine used for HTML template rendering) is unnecessary since we will return
  all the responses encoded in JSON. JSON is a common format used in APIs and
  we decided to prefer it to XML or a custom format.
- **Presenter** -- the whole request processing lifecycle of the Nette
  framework is used. The Presenters are used to group the logic of the
  individual API endpoints. The routing mechanism is modified to distinguish
  the actions by both the URL and the HTTP method of the request.
#### Authentication

To make certain data and actions accessible only to some specific users, there
must be a way these users can prove their identity. We decided to avoid PHP
sessions to make the server stateless (a session ID would be stored in the
cookies of the HTTP requests and responses). Instead, the server issues a
specific token for the user after his/her identity is verified (i.e., by
providing an email and password) and sends it to the client in the body of the
HTTP response. The client must remember this token and attach it to every
following request in the *Authorization* header.

The token must be valid only for a certain time period ("log out" the user
after a few hours of inactivity) and it must be protected against abuse (e.g.,
an attacker must not be able to issue a token which will be considered valid by
the system and using which the attacker could pretend to be a different user).
We decided to use the JWT standard (specifically JWS).

The JWT is a base64-encoded string which contains three JSON documents -- a
header, a payload, and a signature. The interesting parts are the payload and
the signature: the payload can contain any data which can identify the user and
metadata of the token (i.e., the time when the token was issued, the time of
expiration). The last part is a digital signature of the header and payload
which ensures that nobody can issue their own token and steal the identity of
someone else. Both of these characteristics give us the opportunity to validate
the token without storing all of the issued tokens in the database.

To implement JWT in Nette, we have to implement some of its security-related
interfaces, such as IAuthenticator and IUserStorage, which is rather easy
thanks to the simple authentication flow. Replacing these services in a Nette
application is also straightforward, thanks to its dependency injection
container implementation. The encoding and decoding of the tokens itself,
including generating and verifying the signature, is done through a widely used
third-party library, which lowers the risk of having a bug in the
implementation of this critical security feature.
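The principle of issuing and validating the tokens can be illustrated with the
following Python sketch using the PyJWT package (the API server itself uses a
PHP library, so this only demonstrates the JWT mechanism; the secret and the
payload fields are made up):

```python
import time
import jwt  # the PyJWT package

SECRET = "verySecretKey"  # known only to the API server

def issue_token(user_id, validity=3600):
    payload = {
        "sub": user_id,                       # whom the token identifies
        "iat": int(time.time()),              # when it was issued
        "exp": int(time.time()) + validity,   # when it expires
    }
    return jwt.encode(payload, SECRET, algorithm="HS256")

def verify_token(token):
    # Raises an exception when the signature is forged or the token has
    # expired, so no issued token needs to be stored in the database.
    return jwt.decode(token, SECRET, algorithms=["HS256"])
```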
##### Backend Monitoring

The next thing related to the communication with the backend is monitoring its
current state. This concerns namely which workers are available for processing
jobs of different hardware groups and which languages can therefore be used in
exercises.

Another step would be the overall backend state, like how many jobs were
processed by some particular worker, the workload of the broker and the
workers, etc. The easiest solution is to manage this information by hand; every
instance of the API server would have to have an administrator who would fill
it in. This is feasible only for the currently available workers and runtime
environments, which do not change very often. The real-time statistics of the
backend cannot reasonably be made accessible this way.

A better solution is to update this information automatically. This can be done
in two ways:

- it can be provided by the backend on demand when the API needs it
- the backend can send the information to the API periodically

Things like the currently available workers or runtime environments should be
really up to date, so they could be provided on demand if needed. Backend
statistics are not that necessary and could be updated periodically.

However, due to the lack of time, automatic monitoring of the backend state
will not be implemented in the early versions of this project, but it might be
implemented in one of the next releases.
### Web Application

The web application ("WebApp") is one of the possible client applications of
the ReCodEx system. Creating a web application as the first client application
has several advantages:

- no installation or setup is required on the device of the user
- it works on all platforms including mobile devices
- when a new version is released, all the clients will use this version without
  any need for manual installation of the update

One of the downsides is the large number of different web browsers (including
the older versions of a specific browser) and their different interpretations
of the code (HTML, CSS, JS). Some features of the latest HTML5 specifications
are implemented only in some browsers, which are used by only a subset of the
Internet users. This has to be taken into account when choosing appropriate
tools for the implementation of a website.

There are several basic ways to create a website these days:

- **server-side approach** -- the actions of the user are processed on the
  server and the HTML code with the results of the action is generated on the
  server and sent back to the web browser of the user. The client does not
  handle any logic (apart from the rendering of the user interface and some
  basic user interaction) and is therefore very simple. The server can use the
  API server for processing the actions, so the business logic of the server
  can be very simple as well. A disadvantage of this approach is that a lot of
  redundant data is transferred across the requests, although some parts of the
  content can be cached (e.g., CSS files). This results in longer loading times
  of the website.
- **server-side rendering with asynchronous updates (AJAX)** -- a slightly
  different approach is to render the page on the server as in the previous
  case but then execute the actions of the user asynchronously using the
  `XMLHttpRequest` JavaScript functionality, which creates an HTTP request and
  transfers only the part of the website which will be updated.
- **client-side approach** -- the opposite approach is to move the
  communication with the API server and the rendering of the HTML completely
  from the server directly to the client. The client runs the code (usually
  JavaScript) in his/her web browser and the content of the website is
  generated based on the data received from the API server. The script file is
  usually quite large, but it can be cached and does not have to be downloaded
  from the server again (until the cached file expires). Only the data from the
  API server needs to be transferred over the Internet, which reduces the
  volume of the payload on each request and leads to a much more responsive
  user experience, especially on slower networks. In addition, the client-side
  code has full control over the UI, so more sophisticated user interactions
  with the UI can be achieved.

All of these approaches are used in production by web developers, all of them
are well documented, and there are mature tools for creating websites using any
of them.

We decided to use the third approach -- to create a fully client-side
application which would be familiar and intuitive for a user who is used to
modern web applications.

#### Used Technologies

We examined several frameworks which are commonly used to speed up the
development of a web application. There are several open source options
available with a large number of tools, tutorials, and libraries. From the many
options (Backbone, Ember, Vue, Cycle.js, ...) there are two main frameworks
worth considering:

- **Angular 2** -- it is a new framework which was developed by Google. This
  framework is very complex and provides the developer with many tools which
  make creating a website very straightforward. The code can be written in pure
  JavaScript (ES5) or using the TypeScript language which is then transpiled
  into JavaScript. Creating a web application in Angular 2 is based on creating
  and composing components. The previous version of Angular is not compatible
  with this new version.
- **React and Redux** -- [React](https://facebook.github.io/react) is a fast
  library for rendering of the user interface developed by Facebook. It is
  based on the composition of components as well. A React application is
  usually written in EcmaScript 6 with the JSX syntax for defining the
  component tree. This code is usually transpiled to JavaScript (ES5) using
  some kind of a transpiler like Babel. [Redux](http://redux.js.org/) is a
  library for managing the state of the application and it implements a
  modification of the so-called Flux architecture introduced by Facebook. React
  and Redux have been in use for a longer time than Angular 2 and both are
  still actively developed. There are many open-source components and addons
  available for both React and Redux.

We decided to use React and Redux over Angular 2 for several reasons:

- There is a large community around these libraries and there is a large number
  of tutorials, libraries, and other resources available online.
- Many of the web frontend developers are familiar with React and Redux and
  contributing to the project should be easy for them.
- A stable version of Angular 2 was still not released at the time we started
  developing the web application.
- We had previous experience with React and Redux, and Angular 2 did not bring
  any significant improvements and features over React, so it would not be
  worth learning the paradigms of a new framework.
- It is easy to debug the React component tree and the Redux state transitions
  using extensions for Google Chrome and Firefox.

##### Internationalization And Globalization

The user interface must be accessible in multiple languages and should be
easily translatable into more languages in the future. The most promising
library which enables React applications to translate all of the messages of
the UI is [react-intl](https://github.com/yahoo/react-intl).

A good JavaScript library for the manipulation with dates and times is
[Moment.js](http://momentjs.com). It is used by many open-source React
components like date and time pickers.

#### User Interface Design

There is no artist on the team, so we had to come up with an idea of how to
create a visually appealing application with this handicap. User interfaces
created by programmers are notoriously ugly and unintuitive. Luckily we found
the [AdminLTE](https://almsaeedstudio.com/) theme by Abdullah Almsaeed, which
is built on top of the [Bootstrap framework](http://getbootstrap.com/) by
Twitter.

This is a great combination because there is an open-source implementation of
the Bootstrap components for React, and with the stylesheets from AdminLTE the
application looks good and is, with very little work, distinguishable from the
many websites using the plain Bootstrap framework.
# User Documentation

Users interact with ReCodEx through the web application. It is required to use
a modern web browser with good HTML5 and CSS3 support. Among others, cookies
and local storage are used. A decent JavaScript runtime must also be provided
by the browser.

Supported and tested browsers are: Firefox 50+, Chrome 55+, Opera 42+ and Edge
13+. Mobile devices often have problems with internationalization and may lack
support for some common features of desktop browsers. At this stage of
development it is not possible for us to fine-tune the interface for the major
mobile browsers on all mobile platforms. However, the application is confirmed
to work with the latest Google Chrome and the Gello browser on Android 7.1+.
Issues have been reported with Firefox and will be fixed in the future. The
application is also confirmed to work with the Safari browser on iOS 10.

Usage of the web application is divided into sections concerning the particular
user roles. Under these sections all possible use cases can be found. The
sections are inclusive, so more privileged users need to read the instructions
for all less privileged users as well. The described roles are:

- Student
- Group supervisor
- Group administrator
- Instance administrator
- Superadministrator

## Terminology

**Instance** -- Represents a university, company or some other organization
  unit. Multiple instances can exist in a single ReCodEx installation.

**Group** -- A group of students to which exercises are assigned by a
  supervisor. It should typically correspond to a real world lab group.

**User** -- A person that interacts with the system using the web interface (or
  an alternative client).

**Student** -- A user with the least privileges who is subscribed to some
  groups and submits solutions to exercise assignments.

**Supervisor** -- A person responsible for assigning exercises to a group and
  reviewing submissions.

**Admin** -- A person responsible for the maintenance of the system and fixing
  problems supervisors cannot solve.

**Exercise** -- An algorithmic problem that can be assigned to a group.
  Exercises can be shared by the teachers using the exercise database in
  ReCodEx.

**Assignment** -- An exercise assigned to a group, possibly with modifications.

**Runtime environment** -- A unique combination of a platform (OS) and a
  programming language runtime/compiler in a specific version. Runtime
  environments are managed by the administrators to reflect the abilities of
  the whole system.

**Hardware group** -- A set of workers with similar hardware. Its purpose is to
  group workers that are likely to run a program using the same amount of
  resources. Hardware groups are managed by the system administrators, who have
  to keep them up to date.

## General Basics

A description of the general basics, which are the same for all users of the
ReCodEx web application, follows.

### First Steps in ReCodEx

You can create an account by clicking the "Create account" menu item in the
left sidebar. You can choose between two types of registration methods --
creating a local account with a specific password, or pairing your new account
with an existing CAS UK account.
If you decide to create a new local account using the "Create ReCodEx account"
form, you will have to provide your details and choose a password for your
account. Although ReCodEx allows using quite weak passwords, it is wise to use
a bit stronger one. The actual strength is shown in a progress bar near the
password field during registration. You will later sign in using your email
address as your username and the password you select.

If you decide to use the CAS UK service, then ReCodEx will verify your CAS
credentials and create a new account based on the information stored there
(name and email address). You can change your personal information later on the
"Settings" page.

Regardless of the desired account type, an instance it will belong to must be
selected. The instance will most likely be your university or another
organization you are a member of.

To log in, go to the homepage of ReCodEx and in the left sidebar choose the
menu item "Sign in". Then you must enter your credentials into one of the two
forms -- if you selected a password during registration, then you should sign
in with your email and password in the first form called "Sign into ReCodEx".
If you registered using the Charles University Authentication Service (CAS),
you should put your student number and your CAS password into the second form
called "Sign into ReCodEx using CAS UK".

There are several options you can edit in your user account:

- changing your personal information (i.e., name)
- changing your credentials (email and password)
- updating your preferences (source code viewer/editor settings, default
  language)

You can access the settings page through the "Settings" button right under
your name in the left sidebar.

If you are not active in ReCodEx for a whole day, you will be logged out
automatically. However, we recommend you sign out of the application after you
finish your interaction with it. The logout button is placed in the top section
of the left sidebar right under your name. You may need to expand the sidebar
with a button next to the "ReCodEx" title (informally known as the _hamburger
button_), depending on your screen size.

### Forgotten Password

If you cannot remember your password and you do not use CAS UK authentication,
then you can reset your password. You will find a link saying "Cannot remember
what your password was? Reset your password." under the sign in form. After you
click this link, you will be asked to submit your registration email address. A
message with a link containing a special token will be sent to you by e-mail --
this makes sure that the person who requested the password reset is really you.
When you visit the link, you will be able to enter a new password for your
account. The token is valid only for a couple of minutes, so do not forget to
reset the password as soon as possible, or you will have to request a new link
with a valid token.

If you sign in through CAS UK, then please follow the instructions provided by
the administrators of the service described on their website.

### Dashboard

When you log into the system you should be redirected to your "Dashboard". On
this page you can see some brief information about the groups you are a member
of. The information presented there varies with your role in the system -- a
further description of the dashboard will be provided later on together with
the corresponding roles.

## Student

Student is the default role for every newly registered user.
This role has quite limited capabilities in ReCodEx. Generally, a student can
only submit solutions of exercises in some particular groups. These groups
should correspond to the courses he/she attends.

On the "Dashboard" page there is a "Groups you are student of" section where
you can find the list of your student groups. In the first column of every row
there is a brief panel describing the group. It contains the name of the group
and the percentage of points gained in the course. If you have enough points to
successfully complete the course, this panel has a green background with a tick
sign. In the second column there is a list of assigned exercises with their
deadlines. If you want to quickly get to the page of a group, you can use the
provided "Show group's detail" button.

### Join Group

To be able to submit solutions you have to be a member of the right group. Each
instance has its own group hierarchy, so you can choose only groups within your
instance. That is why a list of groups is available under the instance link
located in the sidebar. This link brings you to the instance detail page.

There you can see a description of the instance and, most importantly, the
"Groups hierarchy" box with a hierarchical list of all public groups in the
instance. Please note that groups with a plus sign are collapsed and can be
further expanded. When you find a group you would like to join, continue by
clicking on the "See group's page" link, followed by the "Join group" link.

**Note:** Some groups can be marked as private; these groups are not visible
  in the hierarchy and membership cannot be established by the students
  themselves. The management of students in this type of group is in the hands
  of the supervisors.

On the group detail page there are multiple interesting things for you. The
first one is a brief overview containing the information describing the group,
a list of supervisors and the hierarchy of the subgroups. The most important
section is the "Student's dashboard" section. This section contains the list of
assignments and the list of fellow students. If the supervisors of the group
allowed students to see the statistics of their fellow students, then the
number of points each of the students has gained so far will also be visible.

### Start Solving Assignments

In the "Assignments" box on the group detail page there is a list of assigned
exercises which the students are supposed to solve. The assignments are
displayed with their names and deadlines. There are possibly two deadlines; the
first one means that until this date and time the student will receive the full
amount of points for a successful solution. The second deadline does not have
to be set, but if it is, the maximum number of points for a successful solution
submitted between the two deadlines can be different.

An assignment link will lead you to the assignment detail page, where all known
details about the assignment are presented. There are, of course, both
deadlines, the limit on the number of submissions you can make, and also the
full description of the assignment, which can be localized. The localization
can be switched on demand between all language variants in a tabbed box.

Further on the page you can find the "Submitted solutions" box with a list of
submissions and links to the details of their results. Most importantly, there
is a "Submit new solution" button on the assignment page which provides an
interface for submitting a solution of the assignment.

After clicking on the submit button, a dialog window will show up.
In there you can upload the files representing your solution, and you can even
add some notes to mark the solution. Your supervisor can also access this note.
After you successfully upload all the files necessary for your solution, click
the "Submit your solution" button and let ReCodEx evaluate the solution.

During the execution, the ReCodEx backend might send evaluation progress
updates to your browser, which will be displayed in another dialog window. When
the whole execution is finished, a "See the results" button will appear and you
can look at the results of your solution.

### View Results of Submission

On the results detail page there is a lot of information. Apart from the
assignment description, which is not connected to your results, there is the
name of the submitter of the solution (a supervisor can submit a solution on
your behalf), then there are the files which were uploaded on submission and,
most importantly, the "Evaluation details" and "Test results" boxes.

The evaluation details contain the overall results of your solution. There is
information such as whether the solution was submitted before the deadline,
whether the evaluation process successfully finished and whether the
compilation succeeded. After that you can find a lot of values; the most
important one is the last, "Total score", consisting of your score, a slash and
the maximum number of points for this assignment. Interestingly, your score can
be higher than the maximum, which is caused by the "Bonus points" item above.
If your solution is nice and the supervisor notices it, he/she can assign you
additional points for the effort. On the other hand, points can also be
subtracted for bad coding habits or even cheating.

In the test results box there is a table of the results of all exercise tests.
The columns represent the following information:

- the overall result of the test case, a yes/no symbol
- the name of the test case
- the percentage of correctness of this particular test
- the evaluation status, i.e. whether the test was successfully executed or
  failed
- the memory limit; if the supervisor allowed it, the percentage of memory used
  is displayed
- the time limit; if the supervisor allowed it, the percentage of time used is
  displayed

A new feature of the web application is the "Comments and notes" box where you
can communicate with your supervisors or just write random private notes
related to your submission. Adding a note is quite simple; you just write it
into the text field at the bottom of the box and click on the "Send" button.
The button with the lock image underneath can switch the visibility of newly
created comments.

In case you think the ReCodEx evaluation of your solution is wrong, please use
the comments system described above, or even better notify your supervisor by
another channel (email). Unfortunately there is currently no notification
mechanism for new comment messages.

## Group Supervisor

A group supervisor is typically the lecturer of a course. A user in this role
can modify the group description and properties, assign exercises and manage
the list of students. Further permissions, like managing subgroups or
supervisors, are available only to group administrators.

On the "Dashboard" page you can find the "Groups you supervise" section. There
are boxes representing your groups with the list of students attending the
course and their points. Student names are clickable, with a redirection to the
profile of the user, where further information about his/her assignments and
solutions can be found. To quickly jump to the page of a group, use the "Show
group's detail" button at the bottom of the matching group box.

### Manage Group

Locate the group you supervise and want to manage. All your supervised groups
are available in the sidebar under the "Groups -- supervisor" collapsible
menu. Clicking on one of them takes you to the group detail page. In addition
to the basic group information you can also see the "Supervisor's controls"
section, which contains the lists of current students and assignments.

As a supervisor of a group, you can see the "Edit group settings" button at
the top of the page. Following this link takes you to the group editing page
with a form containing these fields:

- group name which is visible to other users
- external identification which may be used for pairing with entries in an
  information system
- description of the group which will be available to users in the instance
  (in Markdown)
- whether the group is publicly visible (and joinable by students) or private
- whether students should be able to see the statistics of each other
- minimal points threshold which students have to reach to successfully
  complete the course

After filling in all necessary fields, the form can be submitted by clicking
the "Edit group" button, and all changes will be applied.

For student management there are the "Students" and "Add student" boxes. The
first one is a simple list of all students attending the course, with the
option to remove them from the group. That can be done by hitting the "Leave
group" button next to the particular user. The second box serves for adding
students to the group. There is a text field for typing the name of the
student; after clicking on the magnifier image or pressing the enter key, a
list of matched users will appear. Now just click the "Join group" button and
the student will be added to your group.

### Assigning Exercises

Before assigning an exercise, you obviously have to know what exercises are
available. A list of all exercises in the system can be found under the
"Exercises" link in the sidebar. This page contains a table with exercise
names, difficulties and the names of the exercise authors. Further information
about an exercise is available by clicking on its name.

The exercise detail page provides plenty of information about the exercise.
There is a box with all localized descriptions and also a box with additional
information such as the exercise author, its difficulty, version, etc. Under
the "Exercise overview" option there is a description for supervisors written
by the exercise author, where some important information can be found. Most
notably, the "Supported runtime environments" section lists the programming
languages available for this exercise.

If you decide that the exercise is suitable for one of your groups, look for
the "Groups" box at the bottom of the page. It contains a list of all groups
you supervise, each with an "Assign" button which assigns the exercise to the
selected group.

After clicking the "Assign" button you should be redirected to the assignment
editing page. There you can find two forms, one for editing the assignment
meta information and the other for setting the time and memory limits of the
exercise.

In the meta information form you can fill in these options:

- name of the assignment which will be visible in the group
- visibility (if an assignment is under construction, you can mark it as not
  visible and students will not see it)
- subform for localized descriptions (a new localization can be added by
  clicking the "Add language variant" button, the current one can be deleted
  with the "Remove this language" button)
  - language of the description, chosen from a dropdown field (English, Czech,
    German)
  - description in the selected language
- score configuration which will be used for the evaluation of student
  solutions; a very simple one is already prefilled, and score configurations
  are described later in the "Writing Score Configuration" section
- first submission deadline
- maximum points that can be gained before the first deadline; if you want to
  manage all points manually, set it to 0 and then use bonus points, which are
  described in the next subchapter
- second submission deadline, after which students can still submit solutions
  but are given no points (must be after the first deadline)
- maximum points that can be gained between the first and the second deadline
- submission count limit for the solutions of the students -- this limits the
  number of attempts a student has at solving the problem
- visibility of memory and time ratios; if enabled, students can see the
  percentage of used memory and time (with respect to the limit) for each test
- minimum percentage of points which each submission must gain to be
  considered correct (if it gets less, it will gain no points)
- whether the assignment is marked as a bonus one, in which case the points
  for solving it are not included in the group threshold limit (that means
  solving it can earn students additional points over the limit)

The form has to be submitted with the "Edit settings" button, otherwise the
changes will not be saved.

The same editing page also serves for editing an existing assignment, not only
for its creation. That is why a "Delete the assignment" box can be found at
the bottom of the page. The "Delete" button in it can be used to unassign the
exercise from the group.

The last unexplored area is the time and memory limits form. The whole form is
situated in a box with tabs leading to the particular runtime environments. If
you do not wish to use one of them, locate the "Remove" button at the bottom
of the box tab, which will delete this environment from the assignment. Please
note that this action is irreversible.

In general, every tab in the environments box contains some basic information
about the runtime environment and another nested tabbed box. There you can
find all the hardware groups available for the exercise and set limits for all
test cases. The time limits have to be filled in seconds (float), the memory
limits in bytes (int). If you are interested in reference values for a
particular test case, you can take a peek at the collapsible "Reference
solutions' evaluations" items. Once you are satisfied with the changes you
made to the limits, save the form with the "Change limits" button right under
the environments box.

### Management of Student Solutions

One of the most important tasks of a group supervisor is checking student
solutions. As their automatic evaluation cannot catch all problems in the
source code, it is advisable to do a brief manual review of the coding style
of the student and reflect it in the assignment bonus points.

On the "Assignment detail" page there is a "View student results" button near
the top of the page (next to the "Edit assignment settings" button). It
redirects to a page with a list of boxes, one box per student. Each student
box contains a list of submissions for this assignment. The row structure of
the submission list is the same as the structure in the "Submitted solutions"
box. More information about every solution can be shown by clicking the "Show
details" link at the end of the solution row.

This page is the same as the one the students see, with one exception -- there
is an additional collapsed box "Set bonus points". In the unfolded state, it
contains an input field for one number (a positive or negative integer) and a
confirmation button "Set bonus points". After filling in the intended amount
of points and submitting the form, the data in the "Evaluation details" box
are immediately updated. To remove assigned bonus points, simply submit zero.
The bonus points are not additive; a newer value overrides the older one.

It is useful to give feedback about the solution back to the student. For this
you can use the "Comments and notes" box. Make sure that the messages are not
private, so that the student can see them. A more detailed description of this
box is available in the student part of the user documentation.

One of the discussed concepts was marking a solution as accepted. However, due
to the lack of frontend developers, it is not yet available in the user
interface. We hope it will be ready as soon as possible. The button for
accepting a solution will most probably be located on this page as well.

### Creating Exercises

A link to exercise creation can be found on the exercises list page, which is
accessible through the "Exercises" link in the sidebar. At the bottom of the
exercises list page there is an "Add exercise" button which redirects you to
the exercise editing page. At this moment the exercise is already created, so
if you just leave this page, the exercise will stay in the database. This is
also the reason why the exercise creation form is the same as the exercise
editing form.

The exercise editing page is divided into three separate forms. The first one
contains the meta information about the exercise, the second one is used for
uploading and management of supplementary files and the third one manages the
runtime configurations in which the exercise can be executed.

The first form is located in "Edit exercise settings" and generally contains
the meta information needed by the frontend. Here you can define:

- exercise name which will be visible to other supervisors
- difficulty of the exercise (easy, medium, hard)
- description which will be available only for visitors; it may be used for a
  further description of the exercise (for example information about test
  cases and how they could be scored)
- private/public switch; if the exercise is private, only you as the author
  can see it, assign it or modify it
- subform containing the localized descriptions of the exercise; a new one can
  be added with the "Add language variant" button and the current one deleted
  with the "Remove this language" button
  - language of this particular description (Czech, English, German)
  - the actual localized description of the exercise

After all information is properly set, the form has to be submitted with the
"Edit settings" button.

Management of supplementary files can be found in the "Supplementary files"
box. Supplementary files can be referenced in the job configurations which
have to be provided for all runtime configurations. These files are uploaded
directly to the fileserver, from where the worker can download them and use
them during execution according to the job configuration.

Files can be uploaded either by the drag-and-drop mechanism or with the
standard "Add a file" button. In the opened dialog window, choose the file
which should be uploaded. All chosen files are immediately uploaded to the
server, but to save the list of supplementary files you have to hit the "Save
supplementary files" button. All previously uploaded files are visible right
under the drag-and-drop area. Please note that the files are stored on the
fileserver and cannot be deleted after upload.

The last form on the exercise editing page is the runtime configurations form.
An exercise can have multiple runtime configurations, one for each programming
language in which it can be run. Every runtime configuration corresponds to
one programming language, because each of them needs a slightly different job
configuration.

A new runtime configuration can be added with the "Add new runtime
configuration" button, which spawns a new tab in the runtime configurations
box. Here you can fill in the following:

- human readable identifier of the runtime configuration
- runtime environment which corresponds to the programming language
- job configuration in YAML; a detailed description of job configurations can
  be found further in this chapter in the "Writing Job Configuration" section

When you are done with the changes to the runtime configurations, save the
form with the "Change runtime configurations" button. If you want to delete a
particular runtime, just hit the "Remove" button in the respective tab. Please
note that after this operation the runtime configurations form has to be saved
again for the changes to be applied.

All runtime configurations added to an exercise will be visible to supervisors
and all of them can be used in assignments, so please make sure that all of
the languages and job configurations are working.

If you choose to delete the exercise, at the bottom of the exercise editing
page you can find the "Delete the exercise" box with a "Delete" button. By
clicking it, the exercise will be deleted from the exercises list and will no
longer be available.

### Reference Solutions of an Exercise

Each exercise should have a set of reference solutions, which are used to tune
the time and memory limits of assignments. The values of used time and memory
for each solution are displayed in yellow boxes under the forms for setting
assignment limits, as described earlier.

However, there is currently no user interface to upload and evaluate reference
solutions. It is possible to use direct REST API calls, but it is not very
user friendly. If you are interested, please look at the [API
documentation](https://recodex.github.io/api/), notably the sections
_Uploaded-Files_ and _Reference-Exercise-Solutions_. You need to upload the
reference solution files, create a new reference solution and then evaluate
the solution. After that, the measured data will be available in the box on
the assignment editing page (in the limits section).

We are now working on a better user interface, which will be available soon.
Its description will then be added here.


## Group Administrator

A group administrator is a group supervisor with some additional permissions
in the particular group.
Namely, a group administrator is
able to create subgroups in the managed group and to add and remove
supervisors. Each group can have only one administrator.

### Creating Subgroups And Managing Supervisors

There is no special link that would take you to the groups you administer, so
you have to get there through the "Groups -- supervisor" link in the sidebar
and choose the detail page of the right group. There you can see the
"Administrator controls" section, where you can either add a supervisor to the
group or create a new subgroup.

The form for creating a subgroup is present right on the group detail page in
the "Add subgroup" box. A group can be created with the following options:

- name which will be visible in the group hierarchy
- external identification, which can be for instance the ID of the group in
  the school information system
- a brief description of the group
- whether users are allowed to see each other's statistics from assignments

After filling in all the information, the group can be created by clicking the
"Create new group" button. If the creation is successful, the group becomes
visible in the "Groups hierarchy" box at the top of the page. All information
filled in during creation can be modified later.

Adding a supervisor to a group is rather easy. On the group detail page there
is an "Add supervisor" box containing a text field. There you can type the
name or username of any user in the system. After filling in the name, click
on the magnifier image or press the enter key and all suitable users are
searched. If your chosen supervisor is in the resulting list, just click the
"Make supervisor" button and the new supervisor should be successfully set.

An existing supervisor can also be removed from the group. On the group detail
page there is a "Supervisors" box listing all supervisors of the group. If you
are the group administrator, you can see "Remove supervisor" buttons right
next to the supervisor names. After clicking one, the particular supervisor
will no longer be a supervisor of the group.


## Instance Administrator

There can be only one instance administrator per instance. In addition to the
previous roles, this administrator is able to modify the instance details,
manage licences and take care of the top level groups which belong to the
instance.

### Instance Management

A list of all instances in the system can be found under the "Instances" link
in the sidebar. On that page there is a table of instances with their
respective admins. If you are one of them, you can visit the page of the
instance by clicking on its name. On the instance detail page you can find a
description of the instance, the current group hierarchy and a form for
creating a new group.

If you want to change some of the instance settings, follow the "Edit
instance" link on the instance detail page. It takes you to the instance
editing page with the corresponding form, where you can fill in the following
information:

- name of the instance which will be visible to every other user
- brief description of the instance and for whom it is intended
- checkbox whether the instance is open or not, i.e. public or private (hidden
  from potential users)

When you are done with the editing, save the filled-in information by clicking
the "Update instance" button.

Back on the instance detail page, there is a "Create new group" box which
serves for adding a group to the instance.
This form is the same
as the one for creating a subgroup in an existing group, so we can skip the
description of the form fields. After the successful creation of the group it
will appear in the "Groups hierarchy" box at the top of the page.

### Licences

On the instance detail page, there is a "Licences" box. The first line shows
whether the instance currently has a valid licence. Then there are multiple
lines for all licences assigned to the instance. Each line consists of a note,
a validity status (whether the licence is valid or was revoked by a
superadministrator) and the last date of licence validity.

The "Add new licence" box is used for creating new licences. The required
fields are the note and the last day of validity. It is not possible to extend
the lifetime of a licence; a new one should be generated instead. It is
possible to have more than one valid licence at a time. Currently there is no
user interface for revoking licences; this is done manually by the
superadministrator. If an instance is to be disabled, all valid licences have
to be revoked.


## Superadministrator

The superadministrator is the user with the most privileges, and as such the
superadmin should be quite a unique role. Ideally, there should be only one
user of this kind, used with special caution and adequate security. With this
stated, it is obvious that the superadmin can perform any action the API is
capable of.

### Management of Users

There are only a few user roles in ReCodEx. Basically there are just three:
_student_, _supervisor_, and _superadmin_. The base role is student, which is
assigned to every registered user. Roles are stored in the database alongside
other information about the user. A user always has exactly one role at a
time. At the first startup of ReCodEx, the administrator has to change the
role of his/her account manually in the database. After that, manual
intervention in the database should never be needed again.

There is a little catch in the management of groups and instances. Groups can
have admins and supervisors. This setting is valid only for one particular
group and has to be separated from the basic role system. This implies that a
supervisor in one group can be a student in another and simultaneously have
the global supervisor role. Changing the role from student to supervisor and
back is done automatically when the new privileges are granted to the user, so
managing roles by hand in the database is not needed. The same applies to
instances as well, except that instances can only have admins.

Description of the roles:

- Student -- Default role which is used for newly created accounts. A student
  can join or leave public groups and submit solutions of assigned exercises.
- Supervisor -- Inherits all permissions from the student role. Can manage the
  groups to which he/she belongs: view and change group details, manage
  assigned exercises, and view the students in the group and their solutions
  of the assigned exercises. On top of that, a supervisor can create and
  delete groups too, but only as subgroups of the groups he/she belongs to.
- Superadmin -- Inherits all permissions from the supervisor role. The most
  powerful user in ReCodEx, who is able to access any functionality provided
  by the application.


## Writing Score Configuration

An important aspect of an assignment is how points are assigned to particular
solutions. As mentioned previously, the whole job is composed of logical
tests, and each of these tests has to contain one essential "evaluation" task.
The evaluation
task should output one float number which can be further used for scoring the
particular test.

The total score of a student solution is then calculated according to a
supplied score config (described below) using the specified calculator. The
total score is also a float between 0 and 1. This number is multiplied by the
maximum number of points awarded for the assignment by the teacher assigning
the exercise -- not by the exercise author.

For now, there is only one way to write a score configuration, using the
simple score calculator. However, the API implementation is flexible enough to
handle upcoming score calculators which might use more complex scoring
algorithms. This also means that future calculators do not have to use the
YAML format for their configuration. In fact, the configuration can be a
string in any format.

### Simple Score Calculation

The first implemented calculator is the simple score calculator with test
weights. This calculator takes the score of each test and puts them together
according to the test weights specified in the assignment configuration. The
resulting score is calculated as the sum of the products of the score and
weight of each test, divided by the sum of all weights. The algorithm in
Python would look something like this:

```
# weighted average of the test scores (each score is a float from 0 to 1)
weighted_sum = 0.0
weight_sum = 0.0
for t in tests:
    weighted_sum += t.score * t.weight
    weight_sum += t.weight
score = weighted_sum / weight_sum
```

Sample score config in YAML format:

```{.yml}
testWeights:
  a: 300  # test with id 'a' has a weight of 300
  b: 200
  c: 100
  d: 100
```


## Writing Job Configuration

To run and evaluate an exercise, the backend needs to know the steps of how to
do that. These steps differ for each environment (operating system,
programming language, etc.), so each of the environments needs a separate
configuration.

The backend works with a powerful, but quite low level description of simple
connected tasks written in YAML syntax. More about the syntax and a general
task overview can be found on a [separate
page](https://github.com/ReCodEx/wiki/wiki/Assignments). One of the planned
features was a user friendly configuration editor, but due to the tight
deadline and team composition it did not make it into the first release.
However, writing configurations in the basic format will always be available
and allows users to use the full expressive power of the system.

This section walks through the creation of a job configuration for a _hello
world_ exercise. The goal is to compile the file _source.c_ and check whether
it prints `Hello World!` to the standard output. There is a single test case
named **A**.

The problem can be split into several tasks:

- compile _source.c_ into _helloworld_ with `/usr/bin/gcc`
- run _helloworld_ and save its standard output into _out.txt_
- fetch the predefined output (supposing it is already uploaded to the
  fileserver) with hash `a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b` into
  _reference.txt_
- compare _out.txt_ and _reference.txt_ with `/usr/bin/diff`

The absolute paths of the tools can be obtained from the system administrator.
However, the GCC binary is available at `/usr/bin/gcc` almost everywhere, so
the locations of some tools can be (professionally) guessed.

First, write the header of the job into the configuration file.

```{.yml}
submission:
    job-id: hello-world-job
    hw-groups:
        - group_1
```

Basically it means that the job _hello-world-job_ needs to be run on workers
that belong to the `group_1` hardware group.
Unless stated explicitly
otherwise, reference files are downloaded from the default location configured
in the API (such as `http://localhost:9999/exercises`). The job execution log
will not be saved to the result archive.

Next, the tasks have to be constructed under the _tasks_ section. In this demo
job, every task depends only on the previous one. The first task has the input
file _source.c_ (submitted by the user) already available in the working
directory, so GCC can be called directly. The compilation is run in the
sandbox like any other external program, and should have relaxed time and
memory limits. In this scenario, the worker defaults are used. If the
compilation fails, the whole job is immediately terminated (because the
_fatal-failure_ bit is set). Because the _bound-directories_ option in the
sandbox limits section is mostly shared between all tasks, it can be set in
the worker configuration instead of the job configuration (suppose this for
the following tasks). For the configuration of workers please contact your
administrator.

```{.yml}
- task-id: "compilation"
  type: "initiation"
  fatal-failure: true
  cmd:
      bin: "/usr/bin/gcc"
      args:
          - "source.c"
          - "-o"
          - "helloworld"
  sandbox:
      name: "isolate"
      limits:
          - hw-group-id: group_1
            chdir: ${EVAL_DIR}
            bound-directories:
                - src: ${SOURCE_DIR}
                  dst: ${EVAL_DIR}
                  mode: RW
```

The compiled program is executed with the time and memory limits set and with
the standard output redirected to a file. This task depends on the
_compilation_ task, because the program cannot be executed without being
compiled first. It is important to mark this task with the _execution_ type,
so that exceeded limits will be reported in the frontend.

Time and memory limits set directly for a task have a higher priority than the
worker defaults. One important constraint is that these limits cannot exceed
the limits set by the worker. The worker defaults are present as a safety
measure so that a malformed job configuration cannot block the worker forever.
The worker default limits should be reasonably high, like a gigabyte of memory
and several hours of execution time. For the exact numbers please contact your
administrator.

It is important to know that if the output of a program (both standard and
error) is redirected to a file, the sandbox disk quotas apply to that file, as
well as to the files created directly by the program. In case the outputs are
ignored, they are redirected to `/dev/null`, which means there is no limit on
the output length (as long as the printing fits in the time limit).

```{.yml}
- task-id: "execution_1"
  test-id: "A"
  type: "execution"
  dependencies:
      - compilation
  cmd:
      bin: "helloworld"
  sandbox:
      name: "isolate"
      stdout: ${EVAL_DIR}/out.txt
      limits:
          - hw-group-id: group_1
            chdir: ${EVAL_DIR}
            bound-directories:
                - src: ${SOURCE_DIR}
                  dst: ${EVAL_DIR}
                  mode: RW
            time: 0.5
            memory: 8192
```

The next task fetches the reference output from the fileserver. The base URL
of the fileserver is known from the configuration, so only the name of the
required file (its `sha1sum` in our case) is necessary.

```{.yml}
- task-id: "fetch_solution_1"
  test-id: "A"
  dependencies:
      - execution_1
  cmd:
      bin: "fetch"
      args:
          - "a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"
          - "${SOURCE_DIR}/reference.txt"
```

Comparison of the results is quite straightforward. It is important to set the
task type to _evaluation_, so that the return code is set to 0 if the program
is correct and to 1 otherwise. We do not set our own limits, so the default
limits are used.

```{.yml}
- task-id: "judge_1"
  test-id: "A"
  type: "evaluation"
  dependencies:
      - fetch_solution_1
  cmd:
      bin: "/usr/bin/diff"
      args:
          - "out.txt"
          - "reference.txt"
  sandbox:
      name: "isolate"
      limits:
          - hw-group-id: group_1
            chdir: ${EVAL_DIR}
            bound-directories:
                - src: ${SOURCE_DIR}
                  dst: ${EVAL_DIR}
                  mode: RW
```


# Implementation

## Broker

The broker is the central part of the ReCodEx backend that directs most of the
communication. It was designed to sustain a heavy load of messages by
performing only small actions in the main communication thread and executing
other actions asynchronously.

The responsibilities of the broker are:

- allowing workers to register themselves and keeping track of their
  capabilities
- tracking the status of each worker and handling cases when they crash
- accepting assignment evaluation requests from the frontend and forwarding
  them to workers
- receiving job status information from workers and forwarding it to the
  frontend, either via the monitor or the REST API
- notifying the frontend of errors in the backend

### Internal Structure

The main work of the broker is to handle incoming messages. For that purpose a
_reactor_ subcomponent was written, which binds events on sockets to handler
classes. There are currently two handlers -- one that handles the main
functionality and another that sends status reports to the REST API
asynchronously. This prevents the broker from freezing when synchronously
waiting for the responses of HTTP requests, especially when some kind of error
happens on the server.

The main handler takes care of requests from workers and API servers:

- *init* -- initial connection from a worker to the broker
- *done* -- the job currently processed by a worker was executed and is done
- *ping* -- a worker proving that it is still alive
- *progress* -- job progress state from a worker, which is immediately
  forwarded to the monitor
- *eval* -- request from the API server to execute a given job

The second handler is the asynchronous status notifier, which is able to
execute HTTP requests. This notifier is used for error reporting from the
backend to the frontend API.

#### Worker Registry

The `worker_registry` class is used to store information about workers, their
status and the jobs in their queues. It can look up a worker using the headers
received with a request (a worker is considered suitable if and only if it
satisfies all the job headers). The headers are arbitrary key-value pairs,
which are checked for equality by the broker. However, some headers require
special handling, namely `threads`, for which we check whether the value in
the request is less than or equal to the value advertised by the worker, and
`hwgroup`, for which we support requesting one of multiple hardware groups by
listing multiple names separated with a `|` symbol (e.g.
`group_1|group_2|group_3`).

The registry also implements a basic load balancing algorithm -- the workers
are kept in a queue and whenever one of them receives a job, it is moved to
the end of the queue, which makes it less likely to receive another job soon.

When a worker is assigned a job, it will not be assigned another one until a
`done` message is received.

#### Error Reporting

The broker is the only backend component which is able to report errors
directly to the REST API. The other components have to notify the broker
first, and it forwards the messages to the API. The *libcurl* library is used
for the HTTP communication.
To
address security concerns, *HTTP Basic Auth* is configured on the particular
API endpoints and the correct credentials have to be supplied.

The following types of failures are distinguished:

**Job failure** -- there are two ways a job can fail, an internal and an
  external one. An internal failure is the fault of the worker, for example
  when it cannot download a file needed for the evaluation. An external error
  is, for example, a malformed job configuration. Note that a wrong student
  solution is not considered a job failure.

  Jobs that failed internally are reassigned until a limit on the number of
  reassignments (configurable with the `max_request_failures` option) is
  reached. External failures are reported to the frontend immediately.

**Worker failure** -- when a worker crash is detected, an attempt to reassign
  its current job and also all the jobs from its queue is made. Because the
  current job might be the reason for the crash, its reassignment is also
  counted towards the `max_request_failures` limit (the counter is shared). If
  there is no worker available that could process a job (i.e. it cannot be
  reassigned), the job is reported to the frontend as failed via the REST API.

**Broker failure** -- when the broker itself crashes and is restarted, workers
  will reconnect automatically. However, all jobs in their queues are lost. If
  a worker manages to finish a job and notifies the "new" broker, the report
  is forwarded to the frontend. The same goes for external failures. Jobs that
  fail internally cannot be reassigned, because the "new" broker does not know
  their headers -- they are reported as failed immediately.

### Additional Libraries

The broker implementation depends on several open-source C and C++ libraries.

- **libcurl** -- Libcurl is used for notifying the REST API of job finish
  events over the HTTP protocol. Due to the lack of documentation of all the
  C++ bindings, the plain C API is used.
- **cppzmq** -- Cppzmq is a simple C++ wrapper for the core ZeroMQ C API. It
  basically contains only one header file, but its API fits into the object
  architecture of the broker.
- **spdlog** -- Spdlog is a small, fast and modern logging library used for
  system logging. It is highly customizable and configurable from the
  configuration of the broker.
- **yaml-cpp** -- Yaml-cpp is used for parsing the broker configuration file
  in YAML format.
- **boost-filesystem** -- Boost filesystem is used for managing the logging
  directory (creating it if necessary) and for parsing filesystem paths from
  strings written in the configuration of the broker. Filesystem operations
  will be included in future releases of the C++ standard, so this dependency
  may be removed in the future.
- **boost-program_options** -- Boost program options is used for parsing the
  command line arguments. It would be possible to use the POSIX `getopt` C
  function, but we decided to use boost, which provides a nicer API and is
  already used by the worker component.


## Fileserver

The fileserver component provides a shared file storage between the frontend
and the backend. It is written in Python 3 using the Flask web framework. The
fileserver stores files in a configurable filesystem directory, provides file
deduplication and HTTP access. To keep the stored data safe, the fileserver
should not be visible from the public internet. Instead, it should be accessed
indirectly through the REST API.
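
To illustrate the basic idea, the following is a minimal sketch of a
content-addressed upload endpoint written with Flask. It is not the actual
fileserver code -- the storage directory, the route and the two-level
directory layout are assumptions for this example (the real layout is
described in the following sections).

```
import hashlib
import os

from flask import Flask, request

app = Flask(__name__)
STORAGE_DIR = "/var/recodex-fileserver"  # assumed storage location

@app.route("/exercises", methods=["POST"])
def upload_file():
    # Hash the uploaded content -- the hash becomes the file name.
    data = request.files["file"].read()
    name = hashlib.sha1(data).hexdigest()

    # Store the file under its hash; a file with the same name already
    # contains the same content, so duplicates are written only once.
    subdir = os.path.join(STORAGE_DIR, name[0])
    os.makedirs(subdir, exist_ok=True)
    path = os.path.join(subdir, name)
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(data)

    # Return the content hash so the uploader can reference the file.
    return name
```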

### File Deduplication

From our analysis of the requirements, it is certain that we need to implement
some means of dealing with duplicate files.

File deduplication is implemented by storing files under the hashes of their
content. This procedure is done completely inside the fileserver. Plain files
are uploaded to the fileserver, hashed and saved, and the new filename is
returned back to the uploader.

SHA1 is used as the hashing function, because it is fast to compute and
provides reasonable collision safety for non-cryptographic purposes. Files
with the same hash are treated as identical; no additional checks for
collisions are performed, as finding one is extremely unlikely. If SHA1 proves
insufficient, it is possible to change the hash function to something else,
because the naming strategy is fully contained in the fileserver (though
special care must be taken to maintain backward compatibility).

### Storage Structure

The fileserver stores its data in the following structure:

- `./submissions/<id>/` -- folder that contains the files submitted by users
  (the solutions to the assignments of the students). `<id>` is an identifier
  received from the REST API.
- `./submission_archives/<id>.zip` -- ZIP archives of all submissions. These
  are created automatically when a submission is uploaded. `<id>` is the
  identifier of the corresponding submission.
- `./exercises/<prefix>/<hash>` -- supplementary exercise files (e.g. test
  inputs and outputs). `<hash>` is a hash of the file content (`sha1` is used)
  and `<prefix>` is its first letter (this is an attempt to prevent creating a
  flat directory structure).
- `./results/<id>.zip` -- ZIP archives of the results for the submission with
  the `<id>` identifier.


## Worker

The job of the worker is to securely execute a job according to its
configuration and to upload the results back for later processing. After
receiving an evaluation request, the worker has to:

- download the archive containing the submitted source files and the
  configuration file
- download any supplementary files based on the configuration file, such as
  test inputs or helper programs (this is done on demand, using a `fetch`
  command in the assignment configuration)
- evaluate the submission according to the job configuration
- send progress messages back to the broker during the evaluation
- upload the results of the evaluation to the fileserver
- notify the broker that the evaluation has finished

### Internal Structure

The worker is logically divided into two parts:

- **Listener** -- communicates with the broker through ZeroMQ. On startup, it
  introduces itself to the broker. Then it receives new jobs, passes them to
  the evaluator part and sends back results and progress reports.
- **Evaluator** -- gets jobs from the listener part, evaluates them (possibly
  in a sandbox) and notifies the other part when the evaluation ends. The
  evaluator also communicates with the fileserver, downloads supplementary
  files and uploads detailed results.

These parts run in separate threads of the same process and communicate
through ZeroMQ in-process sockets. An alternative approach would be a shared
memory region with exclusive access, but messaging is generally considered
safer. Shared memory has to be used very carefully because of race condition
issues when reading and writing concurrently. Also, the messages inside the
worker are small, so there is no big overhead in copying data between threads.
This multi-threaded design allows the worker to keep sending `ping` messages
even when it is processing a job.
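
The actual worker is written in C++, but the inter-thread messaging pattern is
easy to sketch in Python with the `pyzmq` binding. The socket name and the
messages below are made up for the illustration:

```
import threading
import zmq

context = zmq.Context.instance()

def evaluator():
    # The evaluator end of the in-process socket pair.
    sock = context.socket(zmq.PAIR)
    sock.connect("inproc://jobs")
    job = sock.recv_string()               # wait for a job from the listener
    sock.send_string("result of " + job)   # report back when done

# The listener binds first; for inproc transports the bind must happen
# before the other side connects.
listener = context.socket(zmq.PAIR)
listener.bind("inproc://jobs")

thread = threading.Thread(target=evaluator)
thread.start()

listener.send_string("job-42")             # hand a job to the evaluator
print(listener.recv_string())              # receive its result
thread.join()
```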

### Capability Identification

There may be multiple worker instances in a ReCodEx installation, and each one
can run on different hardware, on a different operating system, or have
different tools installed. To identify the hardware capabilities of a worker,
we use the concept of **hardware groups**. Each worker belongs to exactly one
group that specifies the hardware and operating system on which the submitted
programs will be run. A worker also has a set of additional properties called
**headers**. Together they help the broker decide which worker is suitable for
processing a job evaluation request. This information is sent to the broker on
worker startup.

The hardware group is a string identifier of the hardware configuration, for
example "i7-4560-quad-ssd-linux", configured by the administrator for each
worker instance. If this is done correctly, performance measurements of a
submission should yield the same results on all computers from the same
hardware group. Thanks to this fact, we can use the same resource limits on
every worker in a hardware group.

The headers are a set of key-value pairs that describe the worker
capabilities. For example, they can show which runtime environments are
installed or whether this worker measures time precisely. The headers are also
configured manually by an administrator.

### Running Student Submissions

Student submissions are executed in a sandbox environment to prevent them from
damaging the host system and to restrict the amount of used resources.
Currently, only the Isolate sandbox is supported, but it is possible to add
support for other sandboxes.

Every sandbox, regardless of the concrete implementation, has to be a command
line application taking parameters with arguments, from the standard input or
from a file. Outputs should be written to a file or to the standard output.
There are no other requirements; the design of the worker is very versatile
and can be adapted to different needs.

The sandbox part of the worker is the only one which is not portable, so
conditional compilation is used to include only the supported parts of the
project. Isolate does not work in a Windows environment, so its invocation is
done through native Linux system calls (`fork`, `exec`). To disable the
compilation of this part on Windows, the `#ifndef _WIN32` guard is used around
the affected files.

Isolate itself is executed in a separate Linux process created by the `fork`
and `exec` system calls. Communication between the processes is performed
through an unnamed pipe, with redirection of the standard input and output
descriptors. To prevent an Isolate failure from blocking the worker, there is
another safety guard -- the whole sandbox is killed when it does not finish in
`(time + 300) * 1.2` seconds, where `time` is the original maximum time
allowed for the task. This formula works well both for short and long tasks,
but the timeout should never be reached if Isolate works properly -- it should
always terminate itself in time.

### Directories and Files

During a job execution the worker has to handle several files -- the input
archive with the submitted sources and the job configuration, temporary files
generated during the execution, and the fetched test inputs and outputs. A
separate directory structure is created for each job and removed after the job
finishes.

The files are stored in the local filesystem of the worker machine at a
configurable location.
The job is not restricted to use only the specified
directories (tasks can do anything that is allowed by the system), but it is
advised not to write outside them. In addition, sandboxed tasks are usually
restricted to use only a specific (evaluation) directory.

The following directory structure is used for the execution. The working
directory of the worker (the root of the following paths) is shared between
multiple instances on the same computer.

- `downloads/${WORKER_ID}/${JOB_ID}` -- place to store the downloaded archive
  with the submitted sources and the job configuration
- `submission/${WORKER_ID}/${JOB_ID}` -- place to store the decompressed
  submission archive
- `eval/${WORKER_ID}/${JOB_ID}` -- place where all the execution should happen
- `temp/${WORKER_ID}/${JOB_ID}` -- place for temporary files
- `results/${WORKER_ID}/${JOB_ID}` -- place to store all the files which will
  be uploaded to the fileserver, usually only the YAML result file and
  optionally a log file; other files have to be explicitly copied here if
  requested

Some of the directories are accessible from within the sandbox during job
execution through predefined variables. These are listed in the job
configuration appendix.

### Judges

ReCodEx provides a few initial judge programs. They are mostly adopted from
CodEx and are installed automatically with the worker component. Judging
programs have to meet some requirements. The basic ones are inspired by the
standard `diff` application -- two mandatory positional parameters which are
the files to compare, and an exit code reflecting whether the result is
correct (0) or wrong (1).

This interface lacks support for returning additional data from the judges,
for example the similarity of the two files calculated as the Levenshtein edit
distance. To allow passing such additional values, an extended judge interface
can be implemented:

- Parameters: there are two mandatory positional parameters, which are the
  files to compare
- Results:
  - _comparison OK_
    - exit code: 0
    - stdout: a single line with a double value which should be 1.0
  - _comparison BAD_
    - exit code: 1
    - stdout: a single line with a double value which should be the quality
      percentage of the judged file
  - _error during execution_
    - exit code: 2
    - stderr: a description of the error

The additional double value is saved to the results file and can be used for
score calculation in the frontend. If only the basic judge interface is used,
the value is 1.0 for exit code 0 and 0.0 for exit code 1.

If more values are needed for the score computation, multiple judges can be
used in sequence and their values combined. However, the extended judge
interface should cover most of the possible use cases.

### Additional Libraries

The worker implementation depends on several open-source C and C++ libraries.
All of them are multi-platform, so both Linux and Windows builds are possible.

- **libcurl** -- Libcurl is used for all HTTP communication, that is,
  downloading and uploading files. Due to the lack of documentation of all the
  C++ bindings, the plain C API is used.
- **libarchive** -- Libarchive is used for compressing and extracting
  archives. The actually supported formats depend on the packages installed on
  the target system, but at least ZIP and TAR.GZ should be available.
- **cppzmq** -- Cppzmq is a simple C++ wrapper for the core ZeroMQ C API. It
  basically contains only one header file, but its API fits into the object
  architecture of the worker.
- **spdlog** -- Spdlog is a small, fast and modern logging library. It is used
  for all of the logging, both system and job logs. It is highly customizable
  and configurable from the configuration of the worker.
- **yaml-cpp** -- Yaml-cpp is used for parsing and creating text files in YAML
  format. That includes the configuration of the worker and the configuration
  and results of jobs.
- **boost-filesystem** -- Boost filesystem is used for multi-platform
  manipulation with files and directories. However, these operations will be
  included in future releases of the C++ standard, so this dependency may be
  removed in the future.
- **boost-program_options** -- Boost program options is used for
  multi-platform parsing of command line arguments. It is not strictly
  necessary; similar functionality could be implemented by ourselves, but this
  well known library is effortless to use.

## Monitor

The monitor is an optional part of the ReCodEx solution for reporting the
progress of job evaluations back to users in real time. It is written in
Python; the tested versions are 3.4 and 3.5. The following dependencies are
used:

- **zmq** -- binding to the ZeroMQ messaging framework
- **websockets** -- framework for communication over WebSockets
- **asyncio** -- library for fast asynchronous operations
- **pyyaml** -- parsing of YAML configuration files

Just one monitor instance is required per broker. Also, the monitor has to be
publicly visible (it has to have a public IP address or be behind a public
proxy server) and it needs a connection to the broker. If the web application
is using HTTPS, it is required to use a proxy for the monitor to provide
encryption over the WebSockets. Otherwise, the browsers of the users will
block the unencrypted connection and will not show the progress to the users.

### Message Flow

![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)

The monitor runs in 2 threads. _Thread 1_ is the main thread, which
initializes all components (the logger, for example), starts the other thread
and runs the ZeroMQ part of the application. This thread receives and parses
incoming messages from the broker and forwards them to the sending logic of
_thread 2_.

_Thread 2_ is responsible for managing all the WebSocket connections
asynchronously. The whole thread is one big _asyncio_ event loop through which
all actions are processed. None of the custom data types in Python are
thread-safe, so all events from other threads (actually only the
`send_message` method invocation) must be called within the event loop (via
the `asyncio.loop.call_soon_threadsafe` function). Please note that most of
the Python interpreters use the [Global Interpreter
Lock](https://wiki.python.org/moin/GlobalInterpreterLock), so there is
actually no parallelism from the performance point of view, but proper
synchronization is still required.

### Handling of Incoming Messages

An incoming ZeroMQ progress message is received and parsed into the JSON
format (the same as our WebSocket communication format). The JSON string is
then passed to _thread 2_ for asynchronous sending. Each message carries an
identifier of the channel it should be sent to.

There can be multiple receivers on one channel id. Each one has a separate
_asyncio.Queue_ instance to which new messages are added. In addition to that,
there is one list of all messages per channel. If a client connects a bit
later than the point when the monitor starts to receive messages, it still
receives all the messages from the beginning.
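
A simplified sketch of this fan-out logic follows; the function names and the
surrounding monitor glue are illustrative. Each channel keeps a list of past
messages for late subscribers and one `asyncio.Queue` per connected client:

```
import asyncio
from collections import defaultdict

past_messages = defaultdict(list)  # channel id -> messages received so far
subscribers = defaultdict(list)    # channel id -> queues of connected clients

def publish(channel, message):
    # Called for every progress message from the broker (from inside the
    # event loop, e.g. via call_soon_threadsafe).
    past_messages[channel].append(message)
    for queue in subscribers[channel]:
        queue.put_nowait(message)

def subscribe(channel):
    # A client that connects late first gets everything sent so far.
    queue = asyncio.Queue()
    for message in past_messages[channel]:
        queue.put_nowait(message)
    subscribers[channel].append(queue)
    return queue
```

A WebSocket handler would then simply `await queue.get()` in a loop and
forward each message to its client.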

The cached messages are kept for 5 minutes after the last progress command
(normally FINISHED) is received; then they are permanently deleted. This
caching mechanism was implemented because early testing showed that the first
couple of messages were missed quite often.

Messages from the queue of a client are sent through the corresponding
WebSocket connection via the main event loop as soon as possible. This
approach with a separate queue per connection is easy to implement and
guarantees the reliability and order of the message delivery.

## Cleaner

The cleaner component is tightly bound to the worker. It manages the cache
folder of the worker; mainly, it deletes outdated files. Every cleaner
instance maintains one cache folder, which can be used by multiple workers.
This means that on one server there can be numerous worker instances with the
same cache folder, but there should be only one cleaner instance.

The cleaner is written in Python 3, so it works well on multiple platforms. It
uses only the `pyyaml` library for reading the configuration file and the
`argparse` library for processing command line arguments.

It is a simple script which checks the cache folder, possibly deletes old
files and then exits. This means that the cleaner has to be run repeatedly,
for example using cron, a systemd timer or the Windows task scheduler. For the
proper function of the cleaner, a suitable interval has to be used. A 24 hour
interval is recommended and is sufficient for the intended usage. The value is
set in the configuration file of the cleaner.

## REST API

The REST API is a PHP application run in an HTTP server. Its purpose is to
provide controlled access to the evaluation backend and to store the state of
the application.

### Used Technologies

We chose to use PHP in version 7.0, which was the most recent version at the
time of starting the project. The most notable new feature is the optional
static typing of function parameters and return values. We use it as much as
possible to enable easy static analysis with tools like PHPStan. Using static
analysis leads to less error-prone code that does not need as many tests as
code that uses duck typing and relies on automatic type conversions. We aim to
keep our codebase compatible with new releases of PHP.

To speed up the development and to make it easier to follow best practices, we
decided to use the Nette framework. The framework itself is focused on
creating applications that render HTML output, but a lot of its features can
be used in a REST application, too.

The Doctrine 2 ORM is used to provide a layer of abstraction over storing
objects in a database. This framework also makes it possible to change the
database server. The current implementation uses MariaDB, an open-source fork
of MySQL.

To communicate with the evaluation backend, we need to use ZeroMQ. This
functionality is provided by the `php_zmq` plugin that is shipped with most
PHP distributions.

### Data Model

We decided to use a code-first approach when designing our data model. This
approach is greatly aided by the Doctrine 2 ORM framework, which works with
entities -- PHP classes for which we specify which attributes should be
persisted in a database. The database schema is generated from the entity
classes. This way, the exact details of how our data is stored are a secondary
concern for us and we can focus on the implementation of the business logic
instead.

The rest of this section describes our data model and how it relates to the
real world. All entities are stored in the `App\Model\Entity` namespace. There
are repository classes that are used to work with entities without calling the
Doctrine `EntityManager` directly. These are in the `App\Model\Repository`
namespace.

#### User Account Management

The `User` entity class contains data about the users registered in ReCodEx.
To allow extending the system with additional authentication methods, login
details are stored in separate entities. There is the `Login` entity class,
which contains a user name and password for our internal authentication
system, and the `ExternalLogin` entity class, which contains an identifier for
an external login service such as LDAP. Currently, each user can only have a
single authentication method (account type). The entity with the login
information is created along with the `User` entity when a user signs up. If a
user requests a password reset, a `ForgottenPassword` entity is created for
the request.

A user needs a way to adjust settings such as their preferred language or
theme. This is the purpose of the `UserSettings` entity class. Each possible
option has its own attribute (database column). The currently supported
options are `darkTheme`, `defaultLanguage` and `vimMode`.

Every user has a role in the system. The basic ones are student, supervisor
and administrator, but new roles can be created by adding `Role` entities.
Roles can have permissions associated with them. These associations are
represented by `Permission` entities. Each permission consists of a role, a
resource, an action and an `isAllowed` flag. If the `isAllowed` flag is set to
true, the permission is positive (it lets the role access the resource), and
if it is false, it denies access. The `Resource` entity contains just a string
identifier of a resource (e.g., group, user, exercise). The action is another
string that describes what the permission allows or denies for the role and
resource (e.g., edit, delete, view).

A `Role` entity can be associated with a parent entity. If this is the case,
the role inherits all the permissions of its parent.

All actions done by a user are logged using the `UserAction` entity for
debugging purposes.

#### Instances and Groups

Users of ReCodEx are divided into groups that correspond to school lab groups
for a single course. Each group has a textual name and description. It can
have a parent group, so that it is possible to create tree hierarchies of
groups.

Group membership is realized using the `GroupMembership` entity class. It is a
joining entity for the `Group` and `User` entities, but it also contains
additional information, most importantly `type`, which helps to distinguish
students from group supervisors.

Groups are organized into instances. Every `Instance` entity corresponds to an
organization that uses the ReCodEx installation, for example a university or a
company that organizes programming workshops. Every user and group belongs to
exactly one instance (users choose an instance when they create their
account).

Every instance can be associated with multiple `Licence` entities. Licences
are used to determine whether an instance can be currently used (access to
instances without a valid licence is denied). They can correspond to billing
periods if needed.

#### Exercises

The `Exercise` entity class is used to represent exercises -- programming
tasks that can be assigned to student groups. It contains the data that does
not relate to a concrete assignment, such as the name, version and a private
description.

Some exercise descriptions need to be translated into multiple languages.
Because of this, the `Exercise` entity is associated with the `LocalizedText`
entity, one for each translation of the text.

An exercise can support multiple programming runtime environments. These
environments are represented by `RuntimeEnvironment` entities. Apart from a
name and description, they contain details of the language and operating
system that is being used. There is also a list of file extensions that is
used for detecting which environment should be used for student submissions.

`RuntimeEnvironment` entities are not linked directly to exercises. Instead,
the `Exercise` entity has an M:N relation with the `RuntimeConfig` entity,
which is associated with `RuntimeEnvironment`. It also contains a path to a
job configuration file template that will be used to create a job
configuration file for the worker that processes solutions of the exercise.

Resource limits are stored outside the database, in the job configuration file
template.

#### Reference Solutions

To make setting resource limits objectively possible for a potentially diverse
set of worker machines, there should be multiple reference solutions for every
exercise in all supported languages, which can be used to measure the resource
usage of different approaches to the problem on various hardware and
platforms.

Reference solutions are contained in `ReferenceSolution` entities. These
entities can have multiple `ReferenceSolutionEvaluation` entities associated
with them that link to the evaluation results (the `SolutionEvaluation`
entity). Details of this structure will be described in the section about
student solutions.

The source code of a reference solution can be accessed using the `Solution`
entity associated with `ReferenceSolution`. This entity is also used for
student submissions.

#### Assignments

The `Assignment` entity is created from an `Exercise` entity when an exercise
is assigned to a group. Most details of the exercise can be overridden (see
the reference documentation for a detailed overview). Additional information
such as deadlines or point values for individual tests is also configured for
the assignment, not for the exercise.

Assignments can also have their own `LocalizedText` entities. If the
assignment texts are not changed, they are shared between the exercise and its
assignment.

Runtime configurations can also be changed for the assignment. This way, a
supervisor can for example alter the resource limits for the tests. They could
also alter the way submissions are evaluated, which is discouraged.

#### Student Solutions

Solutions submitted by students are represented by the `Submission` entity. It
contains data such as when and by whom the solution was submitted. There is
also a timestamp, a note for the supervisor and a URL of the location where
the evaluation results should be stored.

However, the most important part of a submission is its source files. These
are stored using the `SolutionFile` entity and they can be accessed through
the `Solution` entity, which is associated with `Submission`.

When the evaluation is finished, the results are stored using the
`SolutionEvaluation` entity.
This entity can have multiple `TestResult` entities
associated with it, which describe the result of a test and also contain
additional information for failing tests (such as which limits were exceeded).
Every `TestResult` can contain multiple `TaskResult` entities that provide
details about the results of individual tasks. This reflects the fact that
"tests" are just logical groups of tasks.

#### Comment Threads

The `Comment` entity contains the author of the comment, a date and the text
of the comment. In addition to this, there is a `CommentThread` entity
associated with it that groups comments on a single entity (such as a student
submission). This makes it easy to add support for comments to various
entities -- it is enough to add an association with the `CommentThread`
entity. An even simpler way is to just use the identifier of the commented
entity as the identifier of the comment thread, which is how submission
comments are implemented.

#### Uploaded Files

Uploaded files are stored directly on the filesystem instead of in the
database. The `UploadedFile` entity is used to store their metadata. This
entity is extended by `SolutionFile` and `ExerciseFile` using the Single Table
Inheritance pattern provided by Doctrine. Thanks to this, we can access all
files uploaded to the API through the same repository, while data related to,
e.g., supplementary exercise files is present only in the corresponding
subclasses.

### Request Handling

A typical scenario for handling an API request is matching the HTTP request
with a corresponding handler routine which creates a response object that is
then encoded as JSON and sent back to the client. The `Nette\Application`
package can be used to achieve this with Nette, although it is meant to be
used mainly in MVP applications.

Matching HTTP requests with handlers can be done using standard Nette URL
routing -- we create a Nette route for each API endpoint. Using the routing
mechanism from Nette logically leads to implementing handler routines as Nette
presenter actions. Each presenter should serve logically related endpoints.

The last step is encoding the response as JSON. In `Nette\Application`, HTTP
responses are returned using the `Presenter::sendResponse()` method. We
decided to write a method that calls `sendResponse` internally and takes care
of the encoding. This method has to be called in every presenter action. An
alternative approach would be using the internal payload object of the
presenter, which is more convenient, but provides us with less control.

### Authentication

Instead of relying on PHP sessions, we decided to use an authentication flow
based on JWT tokens (RFC 7519). On successful login, the user is issued an
access token that they have to send with subsequent requests using the HTTP
Authorization header (`Authorization: Bearer <token>`). The token has a
limited validity period and has to be renewed periodically using a dedicated
API endpoint.

To implement this behavior in the Nette framework, a new `IUserStorage`
implementation was created (`App\Security\UserStorage`), along with an
`IIdentity` and authenticators for both our internal login service and CAS.
The authenticators are not registered in the DI container; they are invoked
directly instead. On successful authentication, the returned
`App\Security\Identity` object is stored using the
`Nette\Security\User::login()` method. The user storage service works with the
HTTP request to extract the access token if possible.
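
From the client's point of view, the token flow might look as follows. This is
a minimal sketch: the request format and the response shape
(`payload.accessToken`) are simplified assumptions here, and the generated API
documentation remains authoritative.

```js
// Log in and obtain a JWT access token (the response shape is assumed).
async function login(apiUrl, username, password) {
  const response = await fetch(`${apiUrl}/login`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ username, password }),
  });
  const body = await response.json();
  return body.payload.accessToken; // assumed location of the token
}

// Every subsequent request carries the token in the Authorization header.
async function fetchAuthorized(apiUrl, endpoint, accessToken) {
  const response = await fetch(`${apiUrl}${endpoint}`, {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  return response.json();
}
```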

The logic of issuing tokens is contained in the `App\Security\AccessManager`
class. Internally, it uses the Firebase JWT library.

The authentication flow is contained in the `LoginPresenter` class, which
serves the `/login` endpoint group.

An advantage of this approach is being able to control the authentication
process completely instead of just receiving session data through a global
variable.

### Accessing Endpoints

The REST API has a [generated documentation](https://recodex.github.io/api/)
describing the detailed format of input values as well as response structures,
including samples.

Knowing the exact format of the endpoints makes it possible to interact
directly with the API using any available REST client, for example `curl` or
the Postman Chrome extension. There is also a generated [REST
client](https://recodex.github.io/api/ui.html) tailored to the ReCodEx API
structure, built using the Swagger UI tool. For each endpoint there is a form
with boxes for all the input parameters, including a description and data
type. The responses are shown as highlighted JSON. The authorization can be
set for the whole session at once using the "Authorize" button at the top of
the page.

### Permissions

Every system that stores user data has to implement some kind of permission
checking. Each user has a role, which corresponds to his/her privileges. Our
research showed that three roles are sufficient -- student, supervisor and
administrator. The user role has to be checked with every request.
Conveniently, the roles match the granularity of the API endpoints, so the
permission checking can be done at the beginning of each request. This is
implemented using PHP annotations, which allow specifying the permitted user
roles for each endpoint with very little code, while keeping all of this logic
together in one place.

However, roles cannot cover all cases. For example, a supervisor should only
act as a supervisor in the groups he/she actually supervises, but role
checking alone would allow him/her to act as a supervisor in all groups in the
system. Unfortunately, this cannot be easily fixed using annotations, because
the problem occurs in many different forms. Instead, additional checks are
performed at the beginning of request processing. Usually it is only one or
two simple conditions.

With these two concepts combined, it is possible to cover all cases of
permission checking with quite a small amount of code.

### Uploading Files

There are two cases when users need to upload files using the API --
submitting solutions to an assignment and creating a new exercise. In both of
these cases, the final destination of the files is the fileserver. However,
the fileserver is not publicly accessible, so the files have to be uploaded
through the API.

Each file is uploaded separately and is given a unique ID. The uploaded file
can then be attached to an exercise or a submitted solution of an exercise.
Storing and removing files on the server is done through the
`App\Helpers\UploadedFileStorage` class, which maps the files to their records
in the database using the `App\Model\Entity\UploadedFile` entity.

### Forgotten Password

When a user finds out that he/she does not remember a password, he/she
requests a password reset and fills in his/her unique email address. A
temporary access token is generated for the user corresponding to the given
email address and sent to this address, encoded in a URL leading to a client
application.
The user then goes
to the URL and can choose a new password.

The temporary token is generated and emailed by the
`App\Helpers\ForgottenPasswordHelper` class, which is registered as a service
and can be injected into any presenter.

This solution is reasonably safe and the user can handle it on his/her own, so
the administrator does not have to worry about it.

### Job Configuration Parsing and Modifying

The API can also load job configuration files into the corresponding internal
structures. This is necessary because particular job details, such as the job
identifier or the fileserver address, have to be modified during submission.

All the code concerning job configurations is located in the
`App\Helpers\JobConfig` namespace. A job configuration is represented by the
`JobConfig` class, which directly contains structures like `SubmissionHeader`
or `Tasks\Task` and indirectly `SandboxConfig`, `JobId` and more. All these
classes have a parameterless constructor which sets all values to their
defaults or constructs the appropriate classes.

Values in the configuration classes can be modified through *fluent
interfaces* and *setters*. Values can also be retrieved -- all setters should
have *get* counterparts. A job configuration is serialized through
`__toString()` methods.

For loading job configurations, there is a separate `Storage` class which can
be used for loading, saving or archiving them. For parsing, the storage uses
the `Loader` class, which performs all the checks and loads the data from
given strings into the appropriate structures. In case of a parser error, an
`App\Exceptions\JobConfigLoadingException` is thrown.

Also worth mentioning is the `App\Helpers\UploadedJobConfigStorage` class,
which takes care of where the uploaded job configuration files should be saved
on the API filesystem. It can also be used for copying all job configurations
when an exercise is assigned.

### Solution Loading

When a solution evaluation is finished by the backend, the results are saved
to the fileserver and the API is notified by the broker. The results are
parsed and stored in the database.

For the evaluations of reference solutions and for asynchronously evaluated
student solutions (e.g., those resubmitted by an administrator), the result is
processed right after the notification from the backend is received, and the
author of the solution is notified by an email after the results are
processed.

When a student submits his/her solution directly through the client
application, we do not parse the results right away but postpone this until
the student (or a supervisor) wants to display the results for the first time.
This may save some resources when the solution results are not important
(e.g., the student finds a bug in his/her solution before the submission has
been evaluated).

#### Parsing of the Results

The results are stored in a YAML file. We map the contents of the file to the
classes of the `App\Helpers\EvaluationResults` namespace. This process
validates the file and gives us access to all of the information through a
class interface instead of plain associative arrays. This is very similar to
how the job configuration files are processed.


## Web Application

The whole project is written using the next generation of JavaScript referred
to as *ECMAScript 6* (also known as *ES6*, *ES.next*, or *Harmony*).
Since not
all of the features introduced in this standard (such as classes and the
spread operator) are implemented in today's web browsers, and hardly any are
implemented in the older browser versions which are currently still in use,
the source code is transpiled into the older *ES5* standard using the
[Babel.js](https://babeljs.io/) transpiler and bundled into a single script
file using the [webpack](https://webpack.github.io/) module bundler. The need
for a transpiler also arises from the usage of the *JSX* syntax for declaring
React components. To read more about these tools and their usage, please refer
to the [installation documentation](#Installation). The whole bundling process
takes place at deployment and is not repeated afterwards when running in
production.

### State Management

The web application is a SPA (Single Page Application). When the user accesses
the page, the source code is downloaded and interpreted by the web browser.
The communication between the browser and the server then runs in the
background without reloading the page.

The application keeps an internal state which can be altered by the actions of
the user (e.g., clicking on links and buttons, filling input fields of forms)
and by the outcomes of HTTP requests to the API server. This internal state is
kept in the memory of the web browser and is not persisted in any way -- when
the page is refreshed, the internal state is deleted and a new one is created
from scratch (i.e., all of the data is fetched from the API server again).

The only part of the state which is persisted is the token of the logged-in
user. This token is kept in cookies and in the local storage. Keeping the
token in the cookies is necessary for server-side rendering.

#### Redux

The in-memory state is handled by the *redux* library. This library is
strongly inspired by the [Flux](https://facebook.github.io/flux/) architecture
but it has some specifics. The whole state is stored in a single serializable
tree structure called the *store*. This store can be modified only by
dispatching *actions*, which are Plain Old JavaScript Objects (POJOs)
processed by *reducers*. A reducer is a pure function which takes the state
object and the action object and creates a new state. This process is very
easy to reason about and is also very easy to test using unit tests. Please
read the [redux documentation](http://redux.js.org/) for detailed information
about the library.

![Redux state handling schema](https://github.com/ReCodEx/wiki/raw/master/images/redux.png)

The main difference between *Flux* and *redux* is the fact that there is only
one store with one reducer in redux. The single reducer might be composed from
several simple reducers, which might themselves be composed from other simple
reducers, therefore the single reducer of the store is often referred to as
the root reducer. Each of the simple reducers receives all the dispatched
actions and decides which actions it will process and which it will ignore,
based on the *type* of the action. The simple reducers can change only a
specific subtree of the whole state tree, and these subtrees do not overlap.

##### Redux Middleware

A middleware in redux is a function which can process actions before they are
passed to the reducers to update the state.

The middleware used by the ReCodEx store is defined in the
`src/redux/store.js` script.
Several open source libraries are used:

- [redux-promise-middleware](https://github.com/pburtchaell/redux-promise-middleware)
- [redux-thunk](https://github.com/gaearon/redux-thunk)
- [react-router-redux](https://github.com/reactjs/react-router-redux)

We created two additional custom middleware functions for our needs:

- **API middleware** -- This middleware filters out all actions with the
  *type* set to `recodex-api/CALL` and sends a real HTTP request according to
  the information in the action.
- **Access Token Middleware** -- This middleware persists the access token
  into the local storage and the cookies each time the user signs into the
  application. The token is removed when the user decides to sign out. The
  middleware also attaches the token to each `recodex-api/CALL` action which
  does not have an access token set explicitly.

##### Accessing the Store Using Selectors

The components of the application are connected to the redux store using the
higher-order function `connect` from the *react-redux* binding library. This
connection ensures that the React components will re-render every time some of
the specified subtrees of the main state change.

The specific subtrees of interest are defined for every connection. These
definitions are called *selectors*, and they are simple pure functions which
take the state and return its subtree. To avoid unnecessary re-renders and
selections, a small library called
[reselect](https://github.com/reactjs/reselect) is used. This library allows
us to compose the selectors in a similar way to how the reducers are composed,
and therefore simply reflect the structure of the whole state tree. The
selectors for each reducer are stored in a separate file in the
`src/redux/selectors` directory.

#### Routing

The page should not be reloaded after the initial render, but the current
location of the user in the system must be reflected in the URL. This is
achieved through the
[react-router](https://github.com/ReactTraining/react-router) and
[react-router-redux](https://github.com/reactjs/react-router-redux) libraries.
These libraries use the `pushState` method of the `history` object, a living
standard supported by all modern browsers. The mapping of URLs to components
is defined in the `src/pages/routes.js` file. To create links between pages,
use either the `Link` component from the `react-router` library or dispatch an
action created using the `push` action creator from the `react-router-redux`
library. All navigation is mapped to redux actions and can be handled by any
reducer.

Having up-to-date URLs gives users the possibility to reload the page if some
error occurs and land at the same page they would expect. Users can also send
links pointing to the exact page they want to share.

### Creating HTTP Requests

All of the HTTP requests are made by dispatching a specific action which will
be processed by our custom *API middleware*. The action must have the *type*
property set to `recodex-api/CALL`. The middleware catches the action and
sends a real HTTP request created according to the information in the
`request` property of the action:

- **type** -- Type prefix of the actions which will be dispatched
  automatically during the lifecycle of the request (pending, fulfilled,
  failed).
- **endpoint** -- The URI to which the request should be sent. All endpoints
  will be prefixed with the base URL of the API server.
- **method** (*optional*) -- A string containing the name of the HTTP method
  which should be used. The default method is `GET`.
- **query** (*optional*) -- An object containing key-value pairs which will be
  put into the query part of the request URL.
- **headers** (*optional*) -- An object containing key-value pairs which will
  be appended to the headers of the HTTP request.
- **accessToken** (*optional*) -- Explicitly sets the access token for the
  request. The token will be put in the *Authorization* header.
- **body** (*optional*) -- An object or an array which will be recursively
  flattened into the `FormData` structure with correct usage of square
  brackets for nested (associative) arrays. It is worth mentioning that the
  keys must not contain a colon.
- **doNotProcess** (*optional*) -- A boolean value which can disable the
  default processing of the response, which includes showing a notification to
  the user in case the request fails. By default, all requests are processed
  in the way described above.

The HTTP requests are sent using the `fetch` API, which returns a *Promise* of
the request. This promise is put into a new action containing the promise and
the type specified in the `request` descriptor. This action is then caught by
the promise middleware, which dispatches actions whenever the state of the
promise changes during its lifecycle. The new actions have specific types:

- `{$TYPE}_PENDING` -- Dispatched immediately after the action is processed by
  the promise middleware. The `payload` property of the action contains the
  body of the request.
- `{$TYPE}_FAILED` -- Dispatched if the promise of the request is rejected.
- `{$TYPE}_FULFILLED` -- Dispatched when the response to the request is
  received and the promise is resolved. The `payload` property of the action
  contains the body of the HTTP response parsed as JSON.

### Routine CRUD Operations

For routine CRUD (Create, Read, Update, Delete) operations which are common to
most of the resources used in ReCodEx (e.g., groups, users, assignments,
solutions, solution evaluations, source code files), a set of functions called
the *resource manager* was implemented. It contains a factory which creates
basic actions (e.g., `fetchResource`, `addResource`, `updateResource`,
`removeResource`, `fetchMany`) and handlers for all of the lifecycle actions
created by both the API middleware and the promise middleware, which can be
used to create a basic reducer.

The *resource manager* is spread over several files in the
`src/redux/helpers/resourceManager` directory and is covered with unit tests
in scripts located at `test/redux/helpers/resourceManager`.

### Server-side Rendering

To speed up the initial rendering of the web application, a technique called
server-side rendering (SSR) is used. The same code which is executed in the
web browser of the client can run on the server using
[Node.js](https://nodejs.org). React can serialize its HTML output into a
string which can be sent to the client and displayed before the (potentially
large) JavaScript source code starts being executed by the browser. The redux
store is in fact just a large JSON tree which can easily be serialized as
well.
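
The following is a very rough sketch of this idea, not the actual contents of
`src/server.js` -- the reducer and component paths are illustrative, and the
real implementation also handles routing and asynchronous data fetching:

```js
import express from 'express';
import React from 'react';
import { renderToString } from 'react-dom/server';
import { Provider } from 'react-redux';
import { createStore } from 'redux';
import rootReducer from './redux/reducer'; // illustrative path
import App from './containers/App'; // illustrative path

const app = express();
app.get('*', (req, res) => {
  const store = createStore(rootReducer);
  // Render the React tree to a plain HTML string on the server...
  const html = renderToString(
    <Provider store={store}>
      <App />
    </Provider>
  );
  // ...and serialize the store so the client can restore the same state.
  const state = JSON.stringify(store.getState());
  res.send(
    `<div id="root">${html}</div>` +
      `<script>window.__INITIAL_STATE__ = ${state};</script>` +
      '<script src="/bundle.js"></script>'
  );
});
app.listen(8080);
```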

If the user is logged in, the access token should be present in the cookies of
the web browser and it will be attached to the HTTP request when the user
navigates to the ReCodEx web page. This token is then put into the redux
store, and so the user is logged in on the server as well.

The whole logic of the SSR is in a single file called `src/server.js`. It
contains only a definition of a simple HTTP server (using the
[express](http://expressjs.com/) framework) and some necessary boilerplate of
the routing library.

All the components associated with the matched route can have a class property
`loadAsync` which should contain a function returning a *Promise*. The SSR
calls all these functions and delays the response of the HTTP server until all
of the promises are resolved (or some of them fail).

### Localization and Globalization

The whole application is prepared for localization and globalization. All of
the translatable texts can be extracted from the user interface and translated
into several languages. Numbers, dates, and time values are also formatted
with respect to the selected language. The
[react-intl](https://github.com/yahoo/react-intl) and
[Moment.js](http://momentjs.com/) libraries are used to achieve this. All the
strings can be extracted from the application using the command:

```
$ npm run exportStrings
```

This will create JSON files with the exported strings for the 'en' and 'cs'
locales. If you want to export strings for more languages, you must edit the
`/manageTranslations.js` script. The exported strings are placed in the
`/src/locales` directory.

## Communication Protocol

Detailed communication inside the ReCodEx system is captured in the following
image and described in the sections below. Red connections are made through
ZeroMQ sockets, blue ones through WebSockets and green ones through HTTP(S).
All ZeroMQ messages are sent as multipart with one string (command, option)
per part, with no empty frames (unless explicitly specified otherwise).

![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)


### Broker - Worker Communication

The broker acts as a server when communicating with workers. The listening IP
address and port are configurable; the protocol family is TCP. The worker
socket is of the DEALER type, the broker one of the ROUTER type. Because of
that, the very first frame of every (multipart) message from the broker to a
worker must be the socket identity of the worker (which is saved when its
**init** command arrives).

#### Commands from Broker to Worker:

- **eval** -- evaluate a job. Requires 3 message frames:
  - `job_id` -- identifier of the job (in ASCII representation -- we avoid
    endianness issues and also support alphabetic ids)
  - `job_url` -- URL of the archive with the job configuration and submitted
    source code
  - `result_url` -- URL where the results should be stored after evaluation
- **intro** -- ask the worker to introduce itself to the broker (with an
  **init** command) -- this is required when the broker loses track of the
  worker who sent a command. Possible reasons for such an event are e.g. that
  one of the communicating sides shut down and restarted without the other
  side noticing.
- **pong** -- reply to the **ping** command, no arguments

#### Commands from Worker to Broker:

- **init** -- the worker introduces itself to the broker. Useful on startup or
  after reestablishing a lost connection.
  Requires at least 2 arguments:
  - `hwgroup` -- hardware group of this worker
  - `header` -- an additional header describing worker capabilities. The
    format must be `header_name=value`; every header shall be in a separate
    message frame. There is no limit on the number of headers. There is also
    an optional third argument -- additional information. If present, it
    should be separated from the headers with an empty frame. The format is
    the same as for headers. Supported keys for additional information are:
    - `description` -- a human readable description of the worker for
      administrators (it will show up in broker logs)
    - `current_job` -- an identifier of the job the worker is now processing.
      This is useful when a connection to the broker is reestablished and the
      broker needs to know that the worker will not accept a new job.
- **done** -- notification of a finished job. Contains the following message
  frames:
  - `job_id` -- identifier of the finished job
  - `result` -- response result; possible values are:
    - OK -- evaluation finished successfully
    - FAILED -- the job failed and cannot be reassigned to another worker
      (e.g. due to an error in the configuration)
    - INTERNAL_ERROR -- the job failed due to an internal worker error, but
      another worker might be able to process it (e.g. downloading a file
      failed)
  - `message` -- a human readable error message
- **progress** -- a notice about the current evaluation progress. Contains the
  following message frames:
  - `job_id` -- identifier of the current job
  - `command` -- what is happening now:
    - DOWNLOADED -- submission successfully fetched from the fileserver
    - FAILED -- something bad happened and the job was not executed at all
    - UPLOADED -- results are uploaded to the fileserver
    - STARTED -- evaluation of tasks started
    - ENDED -- evaluation of tasks is finished
    - ABORTED -- evaluation of the job encountered an internal error; the job
      will be rescheduled to another worker
    - FINISHED -- the whole execution is finished and the worker is ready for
      another job
    - TASK -- a task state changed -- see below
  - `task_id` -- only present for the "TASK" state -- identifier of the task
    in the current job
  - `task_state` -- only present for the "TASK" state -- result of the task
    evaluation. One of:
    - COMPLETED -- the task was successfully executed without any error; the
      subsequent task will be executed
    - FAILED -- the task ended with an error; the subsequent task will be
      skipped
    - SKIPPED -- some of the previous dependencies failed to execute, so this
      task will not be executed at all
- **ping** -- tell the broker the worker is alive, no arguments


#### Heartbeating

It is important for the broker and workers to know if the other side is still
working (and connected). This is achieved with a simple heartbeating protocol.

The protocol requires the workers to send a **ping** command regularly (the
interval is configurable on both sides -- future releases might let the worker
send its ping interval with the **init** command). Upon receiving a **ping**
command, the broker responds with **pong**.

Whenever a heartbeating message does not arrive, a counter called _liveness_
is decreased. When this counter drops to zero, the other side is considered
disconnected. When a message arrives, the liveness counter is set back to its
maximum value, which is configurable for both sides.

When the broker decides a worker has disconnected, it tries to reschedule its
jobs to other workers.

If a worker thinks the broker crashed, it tries to reconnect periodically,
with a bounded, exponentially increasing delay.
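
A schematic sketch of the worker-side part of this logic is shown below (the
real worker is written in C++; the constants and the stand-in functions are
illustrative only):

```js
const PING_INTERVAL_MS = 1000; // configurable on both sides
const MAX_LIVENESS = 4; // configurable maximum of the liveness counter
let liveness = MAX_LIVENESS;
let reconnectDelayMs = 100;

function sendPing() {
  // stand-in for sending a multipart ["ping"] message over ZeroMQ
  console.log('ping');
}

function onBrokerMessage() {
  // any incoming message (e.g. "pong") restores the liveness counter
  liveness = MAX_LIVENESS;
  reconnectDelayMs = 100; // and resets the reconnection backoff
}

setInterval(() => {
  sendPing();
  liveness -= 1; // decreased when no message arrived in the meantime
  if (liveness <= 0) {
    // the broker is considered disconnected -- reconnect with a bounded,
    // exponentially increasing delay
    reconnectDelayMs = Math.min(reconnectDelayMs * 2, 32000);
    console.log(`broker lost, reconnecting in ${reconnectDelayMs} ms`);
    liveness = MAX_LIVENESS;
  }
}, PING_INTERVAL_MS);
```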

This protocol proved to be very robust in real-world testing. The whole
backend is thus reliable and can outlive short-term connection issues without
problems. Also, the increasing delay between reconnection attempts does not
flood the network when there are problems. We have experienced no issues since
we started using this protocol.

### Worker - Fileserver Communication

The worker communicates with the fileserver only from its _execution thread_.
The supported protocol is HTTP, optionally with SSL encryption
(**recommended**). If supported by the server and the used version of libcurl,
the HTTP/2 standard is also available. The fileserver should be set up to
require basic HTTP authentication, and the worker is capable of sending the
corresponding credentials with each request.

#### Worker Side

Workers communicate with the fileserver in both directions -- they download
the submissions of the students and then upload the evaluation results.
Internally, the worker uses the libcurl C library with a very similar setup
for both cases. It can verify the HTTPS certificate (on Linux against the
system certificate list, on Windows against one downloaded from the cURL
website during installation), supports basic HTTP authentication, offers
HTTP/2 with a fallback to HTTP/1.1 and fails on error (returned HTTP status
code >= 400). The worker has a list of credentials for all available
fileservers in its configuration file.

- download file -- standard HTTP GET request to a given URL, expecting the
  file content as the response
- upload file -- standard HTTP PUT request to a given URL with the file data
  as the body -- the same as the command line tool `curl` with the option
  `--upload-file`

#### Fileserver Side

The fileserver has its own internal directory structure where all the files
are stored. It provides a simple REST API to get them or create new ones. The
fileserver does not provide authentication or a secured connection by itself,
but it is supposed to be run as a WSGI script inside a web server (like
Apache) with a proper configuration. Relevant commands for communication with
workers:

- **GET /submission_archives/\<id\>.\<ext\>** -- gets an archive with the
  submitted source code and the corresponding configuration of this job
  evaluation
- **GET /exercises/\<hash\>** -- gets a file; common usage is for input files
  or reference result files
- **PUT /results/\<id\>.\<ext\>** -- uploads an archive with evaluation
  results under the specified name (it should be the same _id_ as the name of
  the submission archive). On successful upload, JSON `{ "result": "OK" }` is
  returned as the body of the response.

Unless specified otherwise, the `zip` archive format is used. The symbol `/`
in the API description stands for the root of the fileserver domain. If the
domain is, for example, `fs.recodex.org` with SSL support, getting an input
file for a task could look like a GET request to
`https://fs.recodex.org/tasks/8b31e12787bdae1b5766ebb8534b0adc10a1c34c`.


### Broker - Monitor Communication

The broker communicates with the monitor through ZeroMQ over TCP as well. The
socket type is the same on both sides, ROUTER. The monitor is set to act as a
server in this communication; its IP address and port are configurable in the
monitor configuration file. The ZeroMQ socket identity (set on the side of the
monitor) is "recodex-monitor" and must be sent as the first frame of every
multipart message -- see the ZeroMQ ROUTER socket documentation for more info.

Note that the monitor is designed so that it can receive data both from the
broker and from workers. The current architecture prefers the broker to do all
the communication so that the workers do not have to know too many network
services.
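
For illustration, the framing of a **progress** message for the monitor could
look like this, sketched with the `zeromq` npm package (the actual broker is
written in C++; the address and the job identifier are example values):

```js
const zmq = require('zeromq');

async function reportProgress(jobId, command) {
  const sock = new zmq.Router();
  sock.connect('tcp://127.0.0.1:7894'); // example monitor address
  // give the ROUTER-ROUTER handshake a moment to complete
  await new Promise((resolve) => setTimeout(resolve, 100));
  // the monitor's socket identity must be the very first frame
  await sock.send(['recodex-monitor', 'progress', jobId, command]);
  sock.close();
}

reportProgress('eval_job_1234', 'STARTED');
```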

The monitor is treated as a somewhat optional part of the whole solution, so
no special effort to ensure communication reliability was made.

#### Commands from Monitor to Broker:

Because there is no need for the monitor to communicate with the broker, there
are no commands so far. Any message from the monitor to the broker is logged
and discarded.

#### Commands from Broker to Monitor:

- **progress** -- a notification about the progress of a job evaluation. This
  communication is just redirected as-is from the worker; more info can be
  found in the "Broker - Worker Communication" chapter above.


### Broker - REST API Communication

The broker communicates with the main REST API through a ZeroMQ connection
over TCP. The socket type on the broker side is ROUTER, on the frontend side
it is DEALER. The broker acts as a server; its IP address and port are
configurable in the API.

#### Commands from API to Broker:

- **eval** -- evaluate a job (the framing is sketched at the end of this
  section). Requires at least 4 frames:
  - `job_id` -- identifier of this job (in ASCII representation -- we avoid
    endianness issues and also support alphabetic ids)
  - `header` -- an additional header describing worker capabilities. The
    format must be `header_name=value`; every header shall be in a separate
    message frame. There is no limit on the number of headers, and there may
    also be no headers at all. A worker is considered suitable for the job if
    and only if it satisfies all of its headers.
  - an empty frame -- a frame which contains only an empty string and serves
    as a separator after the headers
  - `job_url` -- URI location of the archive with the job configuration and
    submitted source code
  - `result_url` -- remote URI where the results will be pushed to

#### Commands from Broker to API:

All of these are responses to the **eval** command.

- **ack** -- this is the first message which is sent back to the frontend
  right after the eval command arrives; it basically means "Hi, I am all right
  and am capable of receiving job requests". After sending this, the broker
  will try to find an acceptable worker for the request.
- **accept** -- the broker is capable of routing the request to a worker
- **reject** -- the broker cannot handle this job (for example when the
  requirements specified by the headers cannot be met). There are (rare) cases
  when the broker finds that it cannot handle the job after it was confirmed.
  In such cases it uses the frontend REST API to mark the job as failed.

#### Asynchronous Communication Between Broker and API

Only a fraction of the errors that can happen during evaluation can be
detected while there is a ZeroMQ connection between the API and the broker. To
notify the frontend of the rest, the API exposes a dedicated endpoint for the
broker. The broker uses this endpoint whenever the status of a job changes (it
is finished, it failed permanently, the only worker capable of processing it
disconnected, ...).

When a request for sending a report arrives from the backend, the type of the
report is inferred, and if it is an error which deserves the attention of the
administrator, an email is sent to him/her. There can also be errors which are
not that important (e.g., the problem was somehow solved by the backend itself
or the report is only informative); these do not have to be reported through
an email, but they are stored in the persistent database for further
consideration.

For the details of this interface, please refer to the attached API
documentation and the `broker-reports/` endpoint group.
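
To make the framing of the **eval** command described above concrete, here is
a sketch with the `zeromq` npm package (the actual API is written in PHP; the
address, the headers and the URLs are example values):

```js
const zmq = require('zeromq');

async function submitJob() {
  const sock = new zmq.Dealer();
  sock.connect('tcp://127.0.0.1:9658'); // example broker address

  await sock.send([
    'eval',
    'job_1234', // job_id
    'env=c', // header -- only workers with the C environment qualify
    'hwgroup=group_1', // another header
    '', // empty frame separating the headers from the URLs
    'http://fileserver/submission_archives/job_1234.tar.gz', // job_url
    'http://fileserver/results/job_1234.zip', // result_url
  ]);

  const [reply] = await sock.receive(); // "ack", then "accept"/"reject"
  console.log(reply.toString());
}

submitJob();
```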

### Fileserver - REST API Communication

The fileserver has a REST API for interaction with the other parts of ReCodEx.
The communication with workers is described in the "Worker - Fileserver
Communication" chapter above. On top of that, there are other commands for
interaction with the API:

- **GET /results/\<id\>.\<ext\>** -- downloads an archive with the evaluated
  results of the job _id_
- **POST /submissions/\<id\>** -- uploads a new submission with the identifier
  _id_. Expects that the body of the POST request uses file paths as keys and
  the content of the files as values. On successful upload, JSON
  `{ "archive_path": <path>, "result_path": <path> }` is returned in the
  response body. From _archive_path_ the submission can be downloaded (by a
  worker), and the corresponding evaluation results should be uploaded to
  _result_path_.
- **POST /tasks** -- uploads new files, which will then be available under
  names equal to the `sha1sum` of their content. Multiple files can be
  uploaded at once. On successful upload, JSON
  `{ "result": "OK", "files": <file_list> }` is returned in the response body,
  where _file_list_ is a dictionary mapping the original file names to the new
  URLs with the hashed names.

There are no plans yet to support deleting files from this API. This may
change in time.

The REST API calls these fileserver endpoints with standard HTTP requests.
There are no special commands involved, and there is no communication in the
opposite direction.

### Monitor - Web App Communication

The monitor interacts with the web application through a WebSocket connection.
The monitor acts as a server and browsers connect to it. The IP address and
port are configurable. When a client connects to the monitor, it sends a
message with a string representation of the channel ID it is interested in
(usually the ID of the job being evaluated). There can be multiple listeners
per channel; even (shortly) delayed connections will receive all messages from
the very beginning. A client-side sketch is shown at the end of this chapter.

When the monitor receives a **progress** message from the broker, there are
two options:

- there is no WebSocket connection for the listed channel (job id) -- the
  message is dropped
- there is an active WebSocket connection for the listed channel -- the
  message is parsed into the JSON format (see below) and sent as a string to
  the established channel. Messages for active connections are queued, so no
  messages are discarded even under heavy workload.

A message from the monitor to the web application is in JSON format and has
the form of a dictionary (associative array). The information contained in
this message corresponds to the information passed from the worker to the
broker. For further description, please read the "Broker - Worker
Communication" chapter above, under the "progress" command.

Message format:

- **command** -- the type of progress, one of: DOWNLOADED, FAILED, UPLOADED,
  STARTED, ENDED, ABORTED, FINISHED, TASK
- **task_id** -- the id of the currently evaluated task. Present only if
  **command** is "TASK".
- **task_state** -- the state of the task with the id **task_id**. Present
  only if **command** is "TASK". The value is one of "COMPLETED", "FAILED" and
  "SKIPPED".

### Web App - REST API Communication

The provided web application runs as a JavaScript process inside the browser
of the user. It communicates with the REST API on the server through standard
HTTP requests. The documentation of the main REST API is in a separate
[document](https://recodex.github.io/api/) due to its extensiveness. The
results are returned encoded in JSON, which is then processed by the web
application and presented to the user in an appropriate way.
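
The following sketch shows how a page can subscribe to a monitor channel as
described in the "Monitor - Web App Communication" section above (the address
and the channel ID are example values):

```js
const socket = new WebSocket('wss://recodex.example.org:4567');

socket.onopen = () => {
  // subscribe by sending the channel ID (the ID of the evaluated job)
  socket.send('eval_job_1234');
};

socket.onmessage = (event) => {
  const message = JSON.parse(event.data);
  if (message.command === 'TASK') {
    // per-task progress updates carry the task ID and its state
    console.log(`task ${message.task_id}: ${message.task_state}`);
  } else {
    console.log(`job progress: ${message.command}`);
  }
};
```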


# Conclusion

The ReCodEx project was a great experience in developing a bigger application
in a team. We mostly enjoyed the time spent on the project; however, the
deadline was too tight to implement all the nice features we thought of. Our
implementation meets our predefined goals and is ready to be deployed in place
of the current tool, CodEx.

We made several design choices during planning and implementing the project.
From today's perspective, we are mostly happy with them and we believe we have
done the best we could. One of the biggest questions was the programming
language of the broker and worker. The winner was C++, which is a really great
language, especially in the revisions starting with C++11, but we are not sure
whether Python would not have been a better option. From the list of good
choices we would like to mention early unit testing with continuous
integration, using the ZeroMQ messaging framework, and the design splitting
the logic into multiple components.

To sum up, we created an extensible base environment for years of further
development by numerous student projects. The faculty would benefit from
higher hardware utilization, less administration effort and the possibility to
offer ReCodEx as SaaS to partner grammar schools. Students would also
appreciate state-of-the-art development tools.

During the development we used a whole bunch of development tools. We would
like to thank the authors, maintainers and contributors of these tools for
helping us complete this project. Most notably, we would like to explicitly
mention the top projects: Git and GitHub, Visual Studio Code, IDEs from
JetBrains, the ZeroMQ framework, the C++ spdlog library, Travis and AppVeyor
CI, the Nette framework and many, many others. Thank you.

Finally, we would like to thank our supervisor, Martin Kruliš, for his support
and his great effort in correcting the text of this documentation.

## Further Improvements

A lot of work has been done, but it opened up a whole range of new
possibilities for subsequent student projects and other kinds of contribution.
We would be happy to see people contributing to this project to make it even
more awesome. We present a brief list of features we think might be worth
completing. Surely, the list is not complete and may change in time.

- Finish the web frontend. At the time of the project submission, it does not
  contain all the features needed for the best user experience in all roles.
  This task is suitable for people with basic web programming experience;
  React and Redux knowledge is an advantage.
- Create a web editor of job configurations. ReCodEx job evaluation follows a
  quite complex task flow. To create a valid configuration, the author of an
  exercise has to write a pretty long YAML file by hand. This can of course be
  partially or fully automated, but it has not been implemented yet. This task
  therefore consists of creating a web-based editor of this configuration with
  prepared task templates, allowing the user to change their properties and
  connect them. After submitting, a valid YAML job configuration should be
  created.
- Write an alternative command-line frontend. A lot of users want to submit
  their solutions directly from the command line. That is newly possible in
  ReCodEx thanks to the REST API, but no suitable frontend exists yet. There
  is an unfinished attempt to create one in NodeJS, accessible in the ReCodEx
  organization on [GitHub](https://github.com/ReCodEx/cli). The goal is to
  finish this project or create an alternative tool. We would like to see one
  written in Python.
- Create a mobile frontend. It would be really nice to have a mobile
  application where you can read exercise assignments and examine your
  progress and statistics. All of this is possible thanks to the ReCodEx REST
  API. A basic Android application is in development by the ReCodEx team as a
  project for a class on programming mobile devices. The code will be
  published as part of the ReCodEx GitHub organization.
- Design and implement backend monitoring. For an administrator, it would be
  great to examine the status of the backend (broker, workers, monitor,
  fileserver) directly in the browser. The key information is which machines
  are online, offline or failed, which worker is currently running a job
  (possibly with the current progress) and the configuration of each worker.
  Also, some statistics with graphs could be made, for example the workload of
  the workers. A more advanced feature is the ability to restart workers when
  their configuration changes.
- Finish the .NET sandbox for Windows. We developed an initial sandbox for the
  Windows environment, [WrapSharp](https://github.com/ReCodEx/wrapsharp).
  WrapSharp supports only .NET platform assemblies (most notably C# programs)
  and cannot be used for general sandboxing on Windows. The goal is to finish
  the implementation and perform a really detailed security audit of the
  project. Also, the integration into the worker is not fully done yet.
- SIS integration. A very nice feature would be the (semi)automatic creation
  of groups and assigning of students to them depending on their timetable in
  the student information system (SIS). However, there is no standardized API
  for such communication yet, but we hope this will be possible in the future.
  Implementing this feature means extending the ReCodEx API with a SIS module
  in PHP.