You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
recodex-wiki/Rewritten-docs.md

3837 lines
199 KiB
Markdown

This file contains invisible Unicode characters!

This file contains invisible Unicode characters that may be processed differently from what appears below. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to reveal hidden characters.

This file contains ambiguous Unicode characters that may be confused with others in your current locale. If your use case is intentional and legitimate, you can safely ignore this warning. Use the Escape button to highlight these characters.

# Introduction
Generally, there are many different ways and opinions on how to teach people
something new. However, most people agree that a hands-on experience is one of
the best ways to make the human brain remember a new skill. Learning must be
entertaining and interactive, with fast and frequent feedback. Some kinds of
knowledge are more suitable for this practical type of learning than others, and
fortunately, programming is one of them.
University education system is one of the areas where this knowledge can be
applied. In computer programming, there are several requirements a program
should satisfy, such as the code being syntactically correct, efficient and easy
to read, maintain and extend.
Checking programs written by students takes time and requires a lot of
mechanical, repetitive work -- reviewing source codes, compiling them and
running them through testing scenarios. It is therefore desirable to automate as
much of this work as possible. The first idea of an automatic evaluation system
comes from Stanford University professors in 1965. They implemented a system
which evaluated code in Algol submitted on punch cards. In following years, many
similar products were written.
In the world of today, properties like correctness and efficiency can be tested
automatically to a large extent. This fact should be exploited to help teachers
save time for tasks such as examining bad design, bad coding habits and logical
mistakes, which are difficult to perform automatically.
There are two basic ways of automatically evaluating code -- statically
(checking the source code without running it; safe, but not very precise) or
dynamically (running the code on test inputs and checking the correctness of
outputs ones; provides good real world experience, but requires extensive
security measures).
This project focuses on the machine-controlled part of source code evaluation.
First, general concepts of grading systems are observed and problems of the
software previously used at Charles University in Prague are briefly discussed.
Then new requirements are specified and projects with similar functionality are
examined. With acquired knowledge from such projects in production, we set up
goals for the new evaluation system, designed the architecture and implemented a
fully operational solution based on dynamic evaluation. The system is now ready
for production testing at the university.
## Assignment
The major goal of this project is to create a grading application that will be
used for programming classes at the Faculty of Mathematics and Physics of the
Charles University in Prague. However, the application should be designed in a
modular fashion to be easily extended or even modified to make other ways of
usage possible.
The system should be capable of dynamic analysis of submitted source codes. This
consists of following basic steps:
1. compile the code and check for compilation errors
2. run compiled binary in a sandbox with predefined inputs
3. check constraints on used amount of memory and time
4. compare program outputs with predefined values
5. award the code with a numeric score
The whole system is intended to help both teachers (supervisors) and students.
To achieve this, it is crucial to keep in mind the typical usage scenarios of
the system and to try to make these tasks as simple as possible. To fulfil this
task, the project has a great starting point -- there is an old grading system
currently used at the university (CodEx), so its flaws and weaknesses can be
addressed. Furthermore, many teachers desire to use and test the new system and
they are willing to consult ideas or problems during development with us.
## Current System
The grading solution currently used at the Faculty of Mathematics and Physics of
the Charles University in Prague was implemented in 2006 by a group of students.
It is called [CodEx -- The Code Examiner](http://codex.ms.mff.cuni.cz/project/)
and it has been used with some improvements since then. The original plan was to
use the system only for basic programming courses, but there was a demand for
adapting it for many different subjects.
CodEx is based on dynamic analysis. It features a web-based interface, where
supervisors can assign exercises to their students and the students have a time
window to submit their solutions. Each solution is compiled and run in sandbox
(MO-Eval). The metrics which are checked are: correctness of the output, time
and memory limits. It supports programs written in C, C++, C#, Java, Pascal,
Python and Haskell.
The system has a database of users. Each user is assigned a role, which
corresponds to his/her privileges. There are user groups reflecting the
structure of lectured courses.
A database of exercises (algorithmic problems) is another part of the project.
Each exercise consists of a text describing the problem, an evaluation
configuration (machine-readable instructions on how to evaluate solutions to the
exercise), time and memory limits for all supported runtimes (e.g. programming
languages), a configuration for calculating the final score and a set of inputs
and reference outputs. Exercises are created by instructed privileged users.
Assigning an exercise to a group means choosing one of the available exercises
and specifying additional properties: a deadline (optionally a second deadline),
a maximum amount of points, a maximum number of submissions and a list of
supported runtime environments.
Typical use cases for supported user roles are following:
- **student**
- create new user account via registration form
- join a group
- get assignments in group
- submit solution to assignment -- upload one source file and trigger
evaluation process
- view solution results -- which parts succeeded and failed, total number of
acquired points, bonus points
- **supervisor** (similar to CodEx **operator**)
- create exercise -- create description text and evaluation configuration
(for each programming environment), upload testing inputs and outputs
- assign exercise to group -- choose exercise and set deadlines, number of
allowed submissions, weights of all testing cases and amount of points for
correct solutions
- modify assignment
- view all results in group
- check automatic solution grading -- view submitted source and optionally
set bonus points
- **administrator**
- create groups
- alter user privileges -- make supervisor accounts
- check system logs, upgrades and other management
### Exercise Evaluation Chain
The most important part of the system is evaluation of solutions submitted by
students. The process leading from source code to final results (score) is
described in more detail below to give readers a solid overview of what happens
during the evaluation process.
First thing students have to do is to submit their solutions through web user
interface. The system checks assignment invariants (deadlines, count of
submissions, ...) and stores the submitted code. The runtime environment is
automatically detected based on input file extension and a suitable evaluation
configuration variant is chosen (one exercise can have multiple variants, for
example C and Java languages). This exercise configuration is then used for
taking care of evaluation process.
There is a pool of uniform worker machines dedicated to evaluation jobs.
Incoming jobs are kept in a queue until a free worker picks them. Workers are
capable of sequential evaluation of jobs, one at a time.
The worker obtains the solution and its evaluation configuration, parses it and
starts executing the contained instructions. Each job should have more testing
cases, which examine wrong inputs, corner values and data of different sizes to
guess the program complexity. It is crucial to keep the worker computer secure
and stable, so a sandboxed environment is used for dealing with unknown source
code. When the execution is finished, results are saved and the submitter is
notified.
The output of the worker contains data about the evaluation, such as time and
memory spent on running the program for each test input and whether its output
was correct. The system then calculates a numeric score from this data, which is
presented to the student. If the solution is wrong (incorrect output, uses too
much memory,..), error messages are also displayed to the submitter.
### Possible Improvements
Current system is old, but robust. There were no major security incidents
during its production usage. However, from the perspective of today there are
several drawbacks. The main ones are:
- **web interface** -- The web interface is simple and fully functional.
However, recent rapid development in web technologies opens new horizons of
how web interfaces can be made.
- **web API** -- CodEx offers a very limited XML API based on outdated
technologies that is not sufficient for users who would like to create custom
interfaces such as a command line tool or mobile application.
- **sandboxing** -- MO-Eval sandbox is based on principle of monitoring system
calls and blocking the bad ones. This can be easily done for single-threaded
applications, but proves difficult with multi-threaded ones. In present day,
parallelism is a very important area of computing, so there is requirement to
test multi-threaded applications as well.
- **instances** -- Different ways of CodEx usage scenarios requires separate
installations (Programming I and II, Java, C#, etc.). This configuration is
not user friendly (students have to register in each installation separately)
and burdens administrators with unnecessary work. CodEx architecture does not
allow sharing workers between installations, which results in an inefficient
use of hardware for evaluation.
- **task extensibility** -- There is a need to test and evaluate complicated
programs for classes such as Parallel programming or Compiler principles,
which have a more difficult evaluation chain than simple
compilation/execution/evaluation provided by CodEx.
## Requirements
There are many different formal requirements for the system. Some of them
are necessary for any system for source code evaluation, some of them are
specific for university deployment and some of them arose during the ten year
long lifetime of the old system. There are not many ways to improve CodEx
experience from the perspective of a student, but a lot of feature requests come
from administrators and supervisors. The ideas were gathered mostly from our
personal experience with the system and from meetings with faculty staff
involved with the current system.
In general, CodEx features should be preserved, so only differences are
presented here. For clear arrangement all the requirements and wishes are
presented grouped by categories.
### Requirements of The Users
- _group hierarchy_ -- creating an arbitrarily nested tree structure should be
supported to allow keeping related groups together, such as in the example
below. CodEx supported only a flat group structure. A group hierarchy also
allows archiving data from past courses.
```
Summer term 2016
|-- Language C# and .NET platform
| |-- Labs Monday 10:30
| `-- Labs Thursday 9:00
|-- Programming I
| |-- Labs Monday 14:00
...
```
- _a database of exercises_ -- teachers should be able to filter viewed
exercises according to several criteria, for example supported runtime
environment or author. It should also be possible to link exercises to a group
so that groups supervisors do not have to browse hundreds of exercises when
their group only uses five of them
- _advanced exercises_ -- the system should support more advanced evaluation
pipeline than basic compilation/execution/evaluation which is in CodEx
- _customizable grading system_ -- teachers need to specify the way of
computation of the final score, which will be awarded to the submissions of
the student depending on their quality
- _marking a solution as accepted_ -- the system should allow marking one
particular solution as accepted (used for grading the assignment) by the
supervisor
- _solution resubmission_ -- teachers should be able edit the solutions of the
student and privately resubmit them, optionally saving all results (including
temporary ones); this feature can be used to quickly fix obvious errors in the
solution and see if it is otherwise viable
- _localization_ -- all texts (UI and exercises) should be translatable
- _formatted exercise texts_ -- Markdown or another lightweight markup language
should be supported for formatting exercise texts
- _comments_ -- adding both private and public comments to exercises, tests and
solutions should be supported
- _plagiarism detection_
### Administrative Requirements
- _pluggable user interface_ -- the system should allow using an alternative
user interface, such as a command line client; implementation of such clients
should be as straightforward as possible
- _privilege separation_ -- there should be at least two roles -- _student_ and
_supervisor_. Cases when a student of a course is also a teacher of another
lab must be handled correctly
- _alternate authentication methods_ -- logging in through a university
authentication system (e.g. LDAP) and potentially other services, such as
OAuth, should be supported
- _querying SIS_ -- loading user data from the university information system
should be supported
- _sandboxing_ -- there should be more advanced sandboxing which supports
execution of parallel programs and easy integration of different programming
environments and tools; the sandboxed environment should have the least
possible impact on measurement results (most importantly on measured times)
- _heterogeneous worker pool_ -- there must be support for submission evaluation
in multiple programming environments in a single installation to avoid
unacceptable workload for the administrator (maintaining a separate
installation for every course) and high hardware occupation
- advanced low-level evaluation flow configuration with high-level abstraction
layer for ordinary configuration cases; the configuration should be able to
express more complicated flows than just compiling a source code and running
the program against test inputs -- for example, some exercises need to build
the source code with a tool, run some tests, then run the program through
another tool and perform additional tests
- use of modern technologies with state-of-the-art compilers
### Non-functional Requirements
- _no installation_ -- the primary user interface of the system must be
accessible on the computers of the users without the need to install any
additional software except for a web browser (which is installed on a vast
majority of personal computers)
- _performance_ -- the system must be ready for at least hundreds of students
and tens of supervisors using it at once
- _automated deployment_ -- all of the components of the system must be easy to
deploy in an automated fashion
- _open source licensing_ -- the source code should be released under a
permissive licence allowing further development; this also applies to used
libraries and frameworks
- _multi-platform worker_ -- worker machines running Linux, Windows and
potentially other operating systems must be supported
### Conclusion
The survey shows that there are a lot of different requirements and wishes for
the new system. When the system is ready, it is likely that there will be new
ideas on how to use the system and thus the system must be designed to be easily
extendable, so that these new ideas can be easily implemented, either by us or
community members. This also means that widely used programming languages and
techniques should be used, so that users can quickly understand the code and
make changes.
## Related work
To find out the current state in the field of automatic grading systems, we did
a short market survey on the field of automatic grading systems at universities,
programming contests, and possibly other places where similar tools are
available.
This is not a complete list of available evaluators, but only a few projects
which are used these days and can be an inspiration for our project. Each
project from the list has a brief description and some key features mentioned.
### Progtest
[Progtest](https://progtest.fit.cvut.cz/) is private project of [FIT
ČVUT](https://fit.cvut.cz) in Prague. As far as we know it is used for C/C++,
Bash programming and knowledge-based quizzes. There are several bonus points
and penalties and also a few hints what is failing in the submitted solution. It
is very strict on source code quality, for example `-pedantic` option of GCC,
Valgrind for memory leaks or array boundaries checks via `mudflap` library.
### Codility
[Codility](https://codility.com/) is a web based solution primary targeted to
company recruiters. It is a commercial product available as a SaaS and it
supports 16 programming languages. The
[UI](http://1.bp.blogspot.com/-_isqWtuEvvY/U8_SbkUMP-I/AAAAAAAAAL0/Hup_amNYU2s/s1600/cui.png)
of Codility is [opensource](https://github.com/Codility/cui), the rest of source
code is not available. One interesting feature is 'task timeline' -- captured
progress of writing code for each user.
### CMS
[CMS](http://cms-dev.github.io/index.html) is an opensource distributed system
for running and organizing programming contests. It is written in Python and
contains several modules. CMS supports C/C++, Pascal, Python, PHP, and Java
programming languages. PostgreSQL is a single point of failure, all modules
heavily depend on the database connection. Task evaluation can be only a three
step pipeline -- compilation, execution, evaluation. Execution is performed in
[Isolate](https://github.com/ioi/isolate), sandbox written by the consultant
of our project, Mgr. Martin Mareš, Ph.D.
### MOE
[MOE](http://www.ucw.cz/moe/) is a grading system written in Shell scripts, C
and Python. It does not provide a default GUI interface, all actions have to be
performed from command line. The system does not evaluate submissions in real
time, results are computed in batch mode after exercise deadline, using Isolate
for sandboxing. Parts of MOE are used in other systems like CodEx or CMS, but
the system is generally obsolete.
### Kattis
[Kattis](http://www.kattis.com/) is another SaaS solution. It provides a clean
and functional web UI, but the rest of the application is too simple. A nice
feature is the usage of a [standardized
format](http://www.problemarchive.org/wiki/index.php/Problem_Format) for
exercises. Kattis is primarily used by programming contest organizers, company
recruiters and also some universities.
# Analysis
None of the existing projects we came across fulfills all the requested features
for the new system. There is no grading system which supports arbitrary-length
evaluation pipeline, so we have to implement this feature ourselves, cautiously
treading through unexplored fields. Also, no existing solution is extensible
enough to be used as a base for the new system. After considering all these
facts, it is clear that a new system has to be written from scratch. This
implies that only a subset of all the features will be implemented in the first
version, and more of them will come in the following releases.
Gathered features are categorized based on priorities for the whole system. The
highest priority is the functionality present in current CodEx. It is a base
line for being useful in production environment. The design of the new solution
shall allow extending the system easily. Ideas from faculty staff have secondary
priority, but most of them will be implemented as part of the project. The most
complicated tasks from this category are advanced low-level evaluation
configuration format, using modern tools, connecting to a university systems and
merging separate system instances into a single one. Other tasks are scheduled
for next releases after the project is successfully defended. Namely, these are
high-level exercise evaluation configuration with a user-friendly interface for
common exercise types, SIS integration (when a public API becomes available for
the system) and a command-line submission tool. Plagiarism detection is not
likely to be part of any release in near future unless someone else implements a
sufficiently capable and extendable engine -- this problem is too complex to be
solved as a part of this project.
We named the new project **ReCodEx -- ReCodEx Code Examiner**. The name
should point to the old CodEx, but also reflect the new approach to solve
issues. **Re** as part of the name means redesigned, rewritten, renewed, or
restarted.
At this point there is a clear idea how the new system will be used and what are
the major enhancements for future releases. With this in mind, the overall
architecture can be sketched. To sum up, here is a list of key features of the
new system. They come from previous research of the current drawbacks of the system,
reasonable wishes of university users and our major design choices.
- modern HTML5 web frontend written in JavaScript using a suitable framework
- REST API communicating with database, evaluation backend and a file server
- evaluation backend implemented as a distributed system on top of a messaging
framework with master-worker architecture
- multi-platform worker supporting Linux and Windows environment (latter without
sandbox, no general purpose suitable tool available yet)
- evaluation procedure configured in a human readable text file, compound of
small tasks connected into an arbitrary oriented acyclic graph
The reasons supporting these decisions are explained in the rest of analysis
chapter. Also a lot of smaller design choices are mentioned including possible
options, what is picked to implement and why. But first, we discuss basic
concepts of the system.
## Basic Concepts
The system is designed as a web application. The requirements say that the user
interface must be accessible for students without the need to install additional
software. This immediately implies that users have to be connected to the
internet, so it is used as communication medium. Today, there are two main ways
of designing graphical user interfaces -- as a native application or a web page.
Creating a user-friendly and multi-platform application with graphical interface
is almost impossible because of the large number of different environments.
Also, these applications typically require installation or at least downloading
its files (sources or binaries). On the other hand, distributing a web
application is easier, because every personal computer has an internet browser
installed. Also, browsers support a (mostly) unified and standardized
environment of HTML5 and JavaScript. CodEx is also a web application and
everybody seems satisfied with this fact. There are other communicating channels
most programmers use, such as e-mail or git, but they are inappropriate for
designing user interfaces on top of them.
The application interacts with users. From the project assignment it is clear
that the system has to keep personalized data about users and adapt presented
content according to this knowledge. User data cannot be publicly visible, which
implies necessity of user authentication. The application also has to support
multiple ways of authentication (university authentication systems, a company
LDAP server, an OAuth server...) and permit adding more security measures in the
future, such as two-factor authentication.
User data also include a privilege level. From the assignment it is required to
have at least two roles, _student_ and _supervisor_. However, it is wise to add
_administrator_ level for users who take care of the system as a whole and is
responsible for core setup, monitoring, updates and so on. Student role has the
least power, basically can just view assignments and submit solutions.
Supervisors have more authority, so they can create exercises and assignments,
view results of students etc. From the university organization, one possible
level could be introduced, _course guarantor_. However, from real experience all
duties related with lecturing of labs are already associated with supervisors,
so this role does not seem useful. In addition, no one requested more than three
level privilege scheme.
School labs are lessons for some students lead by supervisors. All students ina
lab have the same homework and supervisors evaluate their solutions. This
organization has to be carried over into the new system. Virtual groups are a
counterpart to real-life labs. This concept was already discussed in the
previous chapter including the need for a hierarchical structure of groups. Only
students of the university recorded in the university information system have a
right to attend labs.
To allow restriction of group members in ReCodEx, there are two types of groups
-- _public_ and _private_. Public groups are open for all registered users, but
to become a member of private group, one of its supervisors have to add the
user. This could be done automatically at the beginning of the term with data
from information system, but unfortunately there is no API for this yet.
However, creating this API is now being considered by university staff. Another
just as good solution for restricting membership of a group is to allow anyone
join the group with supplementary confirmation of supervisors. It has no
additional benefits, so approach with public and private groups is implemented.
Supervisors using CodEx in their labs usually set minimum amount of points
required to get a credit. These points can be acquired by solving assigned
exercises. To show users whether they already have enough points, ReCodEx also
supports setting this limit for individual groups. There are two equal ways how
to set a limit -- absolute value or relative value to maximum. The latter way
seems nicer, so it is implemented. The relative value is set in percents and is
called threshold.
Our university has a few partner grammar schools. There was an idea, that they
could use CodEx for teaching informatics classes. To make the setup simple for
them, all the software and hardware would be provided by the university as a
completely ready-to-use remote service. However, CodEx is not prepared for this
kind of usage and no one has the time to manage a separate instance. With
ReCodEx it is possible to offer hosted environment as a service to other
subjects.
The concept we came up with is based on user and group separation inside the
system. The system is divided into multiple separated units called _instances_.
Each instance has own set of users and groups. Exercises can be optionally
shared. The rest of the system (API server and evaluation backend) is shared
between the instances. To keep track of active instances and paying customers,
each instance must have a valid _licence_ to allow users submit their solutions.
licence is granted for a definite period of time and can be revoked in advance
if the subject does not conform with approved terms and conditions.
The primary task of the system is to evaluate programming exercises. An
exercise is quite similar to a homework assignment during school labs. When a
homework is assigned, two things are important to know for users:
- description of the problem
- metadata -- when and whom to submit solutions, grading scale, penalties, etc.
To reflect this idea teachers and students are already familiar with, we decided
to keep separation between problem itself (_exercise_) and its _assignment_.
Exercise only describes one problem and provides testing data with description
of how to evaluate it. In fact, it is a template for assignments. Assignment
then contains data from its exercise and additional metadata, which can be
different for every assignment of the same exercise. This separation is natural
for all users, in CodEx it is implemented in similar way and no other
considerable solution was found.
### Evaluation Unit Executed by ReCodEx
One of the bigger requests for the new system is to support a complex
configuration of execution pipeline. The idea comes from lecturers of Compiler
principles class who want to migrate their semi-manual evaluation process to
CodEx. Unfortunately, CodEx is not capable of such complicated exercise setup.
None of evaluation systems we found can handle such task, so design from
scratch is needed.
There are two main approaches to design a complex execution configuration. It
can be composed of small amount of relatively big components or much more small
tasks. Big components are easy to write and help keeping the configuration
reasonably small. However, these components are designed for current problems
and they might not hold well against future requirements. This can be solved by
introducing a small set of single-purposed tasks which can be composed together.
The whole configuration becomes bigger, but more flexible for new conditions.
Moreover, they will not require as much programming effort as bigger evaluation
units. For better user experience, configuration generators for some common
cases can be introduced.
A goal of ReCodEx is to be continuously developed and used for many years.
Therefore, we chose to use smaller tasks, because this approach is better for
future extensibility. Observation of CodEx system shows that only a few tasks
are needed. In an extreme case, only one task is enough -- execute a binary.
However, for better portability of configurations between different systems it
is better to implement a reasonable subset of operations ourselves without
calling binaries provided by the system directly. These operations are copy
file, create new directory, extract archive and so on, altogether called
internal tasks. Another benefit from custom implementation of these tasks is
guarantied safety, so no sandbox needs to be used as in external tasks case.
For a job evaluation, the tasks need to be executed sequentially in a specified
order. Running independent tasks is possible, but there are complications --
exact time measurement requires a controlled environment with as few
interruptions as possible from other processes. It would be possible to run
tasks that do not need exact time measuremet in parallel, but in this case a
synchronization mechanism has to be developed to exclude paralellism for
measured tasks. Usually, there are about four times more unmeasured tasks than
tasks with time measurement, but measured tasks tend to be much longer. With
[Amdahl's law](https://en.wikipedia.org/wiki/Amdahl's_law) in mind, the
parallelism does not seem to provide a notable benefit in overall execution
speed and brings trouble with synchronization. Moreover, most of the internal
tasks are also limited by IO speed (most notably copying and downloading files
and reading archives). However, if there are performance issues, this approach
could be reconsiderred, along with using a ram disk for storing supplementary
files.
It seems that connecting tasks into directed acyclic graph (DAG) can handle all
possible problem cases. None of the authors, supervisors and involved faculty
staff can think of a problem that cannot be decomposed into tasks connected in a
DAG. The goal of evaluation is to satisfy as many tasks as possible. During
execution there are sometimes multiple choices of next task. To control that,
each task can have a priority, which is used as a secondary ordering criterion.
For better understanding, here is a small example.
![Task serialization](https://github.com/ReCodEx/wiki/raw/master/images/Assignment_overview.png)
The _job root_ task is an imaginary single starting point of each job. When the
_CompileA_ task is finished, the _RunAA_ task is started (or _RunAB_, but should
be deterministic by position in configuration file -- tasks stated earlier
should be executed earlier). The task priorities guaranties, that after
_CompileA_ task all dependent tasks are executed before _CompileB_ task (they
have higher priority number). To sum up, connection of tasks represents
dependencies and priorities can be used to order unrelated tasks and with this
provide a total ordering of them. For well written jobs the priorities may not
be so useful, but they can help control execution order for example to avoid
situation, where each test of the job generates large temporary file and there
is a one valid execution order which keeps all the temporary files for later
processing at one time. Better approach is to finish execution of one test,
clean the big temporary file and proceed with following test. If there is an
ambiguity in task ordering at this point, they are executed in order of input
task configuration.
The total linear ordering of tasks can be made easier with just executing them
in order of input configuration. But this structure cannot handle cases, when a
task fails very well. There is no easy way of telling which task should be
executed next. However, this issue can be solved with graph structured
dependencies of the tasks. In graph structure, it is clear that all dependent
tasks have to be skipped and execution must be resumed with a non related task.
This is the main reason, why the tasks are connected in a DAG.
For grading there are several important tasks. First, tasks executing submitted
code need to be checked for time and memory limits. Second, outputs of judging
tasks need to be checked for correctness (represented by return value or by data
on standard output) and should not fail. This division can be transparent for
backend, each task is executed the same way. But frontend must know which tasks
from whole job are important and what is their kind. It is reasonable, to keep
this piece of information alongside the tasks in job configuration, so each task
can have a label about its purpose. Unlabeled tasks have an internal type
_inner_. There are four categories of tasks:
- _initiation_ -- setting up the environment, compiling code, etc.; for users
failure means error in their sources which are not compatible with running it
with examination data
- _execution_ -- running the user code with examination data, must not exceed
time and memory limits; for users failure means wrong design, slow data
structures, etc.
- _evaluation_ -- comparing user and examination outputs; for user failure means
that the program does not compute the right results
- _inner_ -- no special meaning for frontend, technical tasks for fetching and
copying files, creating directories, etc.
Each job is composed of multiple tasks of these types which are semantically
grouped into tests. A test can represent one set of examination data for user
code. To mark the grouping, another task label can be used. Each test must have
exactly one _evaluation_ task (to show success or failure to users) and
arbitrary number of tasks with other types.
### Evaluation Progress State
Users want to know the state of their submitted solution (whether it is waiting
in a queue, compiling, etc.). The very first idea would be to report a state
based on "done" messages from compilation, execution and evaluation like many
evaluation systems are already providing. However ReCodEx has a more complicated
execution pipeline where there can be more compilation or execution tasks per
test and also other internal tasks that control the job execution flow.
The users do not know the technical details of the evaluation and data about
completion of tasks may confuse them. A solution is to show users only
percentual completion of the job without any additional information about task
types. This solution works well for all of the jobs and is very user friendly.
It is possible to expand upon this by adding a special "send progress message"
task to the job configuration that would mark the completion of a specific part
of the evaluation. However, the benefits of this feature are not worth the
effort of implementing it and unnecessarily complicating the job configuration
files.
### Results of Evaluation
The evaluation data have to be processed and then presented in human readable
form. This is done through a one numeric value called points. Also, results of
job tests should be available to know what kind of error is in the solution. For
more debugging, outputs of tasks could be optionally available for the users.
#### Scoring and Assigning Points
The overall concept of grading solutions was presented earlier. To briefly
remind that, backend returns only exact measured values (used time and memory,
return code of the judging task, ...) and on top of that one value is computed.
The way of this computation can be very different across supervisors, so it has
to be easily extendable. The best way is to provide interface, which can be
implemented and any sort of magic can return the final value.
We found out several computational possibilities. There is basic arithmetic,
weighted arithmetic, geometric and harmonic mean of results of each test (the
result is logical value succeeded/failed, optionally with weight), some kind of
interpolation of used amount of time for each test, the same with used memory
amount and surely many others. To keep the project simple, we decided to design
appropriate interface and implement only weighted arithmetic mean computation,
which is used in about 90% of all assignments. Of course, different scheme can
be chosen for every assignment and also can be configured -- for example
specifying test weights for implemented weighted arithmetic mean. Advanced ways
of computation can be implemented on demand when there is a real demand for
them.
To avoid assigning points for insufficient solutions (like only printing "File
error" which is the valid answer in two tests), a minimal point threshold can be
specified. If the solution is to get less points than specified, it will get
zero points instead. This functionality can be embedded into grading computation
algoritm itself, but it would have to be present in each implementation
separately, which is not maintainable. Because of this the threshold feature is
separated from score computation.
Automatic grading cannot reflect all aspects of submitted code. For example,
structuring the code, number and quality of comments and so on. To allow
supervisors bring these manually checked things into grading, there is a concept
of bonus points. They can be positive or negative. Generally the solution with
the most assigned points is marked for grading that particular assignment.
However, if supervisor is not satisfied with student solution (really bad code,
cheating, ...) he/she assigns the student negative bonus points. To prevent
overriding this decision by system choosing another solution with more points or
even student submitting the same code again which evaluates to more points,
supervisor can mark a particular solution as marked and used for grading instead
of solution with the most points.
#### Evaluation Outputs
In addition to the exact measured values used for score calculation described in
previous chapter, there are also text or binary outputs of the executed tasks.
Knowing them helps users identify and solve their potential issues, but on the
other hand this can lead to possibility of leaking input data. This may lead
students to hack their solutions to pass just the ReCodEx testing cases instead
of properly solving the assigned problem. The usual approach is to keep this
information private. This was also strongly recommended by Martin Mareš, who has
experience with several programming contests.
The only one exception from hiding the logs are compilation outputs, which can
help students a lot during troubleshooting and there is only a small possibility
of input data leakage. The supervisors have access to all of the logs and they
can decide if students are allowed to see the compilation outputs.
Note that due to lack of frontend developers, showing compilation logs to the
students is not implemented in the very first release of ReCodEx.
### Persistence
Previous parts of analysis show that the system has to keep some state. This
could be user settings, group membership, evaluated assignments and so on. The
data have to be kept across restart, so persistence is important decision
factor. There are several ways how to save structured data:
- plain files
- NoSQL database
- relational database
Another important factor is amount and size of stored data. Our guess is about
1000 users, 100 exercises, 200 assignments per year and 20000 unique solutions
per year. The data are mostly structured and there are a lot of them with the
same format. For example, there is a thousand of users and each one has the same
values -- name, email, age, etc. These data items are relatively small, name
and email are short strings, age is an integer. Considering this, relational
databases or formatted plain files (CSV for example) fit best for them.
However, the data often have to support searching, so they have to be
sorted and allow random access for resolving cross references. Also, addition
and deletion of entries should take reasonable time (at most logarithmic time
complexity to number of saved values). This practically excludes plain files, so
we decided to use a relational database.
On the other hand, there is data with basically no structure and much larger
size. These can be evaluation logs, sample input files for exercises or sources
submitted by students. Saving this kind of data into a relational database is
not appropriate. It is better to keep them as ordinary files or store them in
some kind of NoSQL database. Since they are already files and do not need to be
backed up in multiple copies, it is easier to keep them as ordinary files in the
filesystem. Also, this solution is more lightweight and does not require
additional dependencies on third-party software. Files can be identified using
their filesystem paths or a unique index stored as a value in a relational
database. Both approaches are equally good, final decision depends on the actual
implementation.
## Structure of The Project
The ReCodEx project is divided into two logical parts -- the *backend* and the
*frontend* -- which interact with each other and together cover the whole area
of code examination. Both of these logical parts are independent of each other
in the sense of being installed on separate machines at different locations and
that one of the parts can be replaced with a different implementation and as
long as the communication protocols are preserved, the system will continue
working as expected.
### Backend
Backend is the part which is responsible solely for the process of evaluating
a solution of an exercise. Each evaluation of a solution is referred to as a
*job*. For each job, the system expects a configuration document of the job,
supplementary files for the exercise (e.g., test inputs, expected outputs,
predefined header files), and the solution of the exercise (typically source
codes created by a student). There might be some specific requirements for the
job, such as a specific runtime environment, specific version of a compiler or
the job must be evaluated on a processor with a specific number of cores. The
backend infrastructure decides whether it will accept a job or decline it based
on the specified requirements. In case it accepts the job, it will be placed in
a queue and it will be processed as soon as possible.
The backend publishes the progress of processing of the queued jobs and the
results of the evaluations can be queried after the job processing is finished.
The backend produces a log of the evaluation which can be used for further score
calculation or debugging.
To make the backend scalable, there are two necessary components -- the one
which will execute jobs and the other which will distribute jobs to the
instances of the first one. This ensures scalability in manner of parallel
execution of numerous jobs which is exactly what is needed. Implementation of
these services are called **broker** and **worker**, first one handles
distribution, the other one handles execution.
These components should be enough to fulfill all tasks mentioned above, but for
the sake of simplicity and better communication, gateways with frontend two
other components were added -- **fileserver** and **monitor**. Fileserver is a
simple component whose purpose is to store files which are exchanged between
frontend and backend. Monitor is also quite a simple service which is able to
forward job progress data from worker to web application. These two additional
services are at the border between frontend and backend (like gateways) but
logically they are more connected with backend, so it is considered they belong
there.
### Frontend
Frontend on the other hand is responsible for providing users with convenient
access to the backend infrastructure and interpreting raw data from backend
evaluation.
There are two main purposes of the frontend -- holding the state of the whole
system (database of users, exercises, solutions, points, etc.) and presenting
the state to users through some kind of a user interface (e.g., a web
application, mobile application, or a command-line tool). According to
contemporary trends in development of frontend parts of applications, we decided
to split the frontend in two logical parts -- a server side and a client side.
The server side is responsible for managing the state and the client side gives
instructions to the server side based on the inputs from the user. This
decoupling gives us the ability to create multiple client side tools which may
address different needs of the users.
The frontend developed as part of this project is a web application created with
the needs of the Faculty of Mathematics and Physics of the Charles university in
Prague in mind. The users are the students and their teachers, groups correspond
to the different courses, the teachers are the supervisors of these groups. This
model is applicable to the needs of other universities, schools, and IT
companies, which can use the same system for their needs. It is also possible to
develop a custom frontend with own user management system and use the
possibilities of the backend without any changes.
### Possible Connection
One possible configuration of ReCodEx system is illustrated on following
picture, where there is one shared backend with three workers and two separate
instances of whole frontend. This configuration may be suitable for MFF UK --
basic programming course and KSP competition. But maybe even sharing web API and
fileserver with only custom instances of client (web app or own implementation)
is more likely to be used. Note, that connections between components are not
fully accurate.
![Overall architecture](https://github.com/ReCodEx/wiki/blob/master/images/Overall_Architecture.png)
In the following parts of the documentation, both the backend and frontend parts
will be introduced separately and covered in more detail. The communication
protocol between these two logical parts will be described as well.
## Implementation Analysis
Some of the most important implementation problems or interesting observations
will be discussed in this chapter.
### Communication Between the Backend Components
Overall design of the project is discussed above. There are bunch of components
with their own responsibility. Important thing to design is communication of
these components. To choose a suitable protocol, there are some additional
requirements that should be met:
- reliability -- if a message is sent between components, the protocol has to
ensure that it is received by target component
- working over IP protocol
- multi-platform and multi-language usage
TCP/IP protocol meets these conditions, however it is quite low level and
working with it usually requires working with platform dependent non-object API.
Often way to reflect these reproaches is to use some framework which provides
better abstraction and more suitable API. We decided to go this way, so the
following options are considered:
- CORBA (or some other form of RPC) -- CORBA is a well known framework for
remote procedure calls. There are multiple implementations for almost every
known programming language. It fits nicely into object oriented programming
environment.
- RabbitMQ -- RabbitMQ is a messaging framework written in Erlang. It features a
message broker, to which nodes connect and declare the message queues they
work with. It is also capable of routing requests, which could be a useful
feature for job load-balancing. Bindings exist for a large number of languages
and there is a large community supporting the project.
- ZeroMQ -- ZeroMQ is another messaging framework, which is different from
RabbitMQ and others (such as ActiveMQ) because it features a "brokerless
design". This means there is no need to launch a message broker service to
which clients have to connect -- ZeroMQ based clients are capable of
communicating directly. However, it only provides an interface for passing
messages (basically vectors of 255B strings) and any additional features such
as load balancing or acknowledgement schemes have to be implemented on top of
this. The ZeroMQ library is written in C++ with a huge number of bindings.
CORBA is a large framework that would satisfy all our needs, but we are aiming
towards a more loosely-coupled system, and asynchronous messaging seems better
for this approach than RPC. Moreover, we rarely need to receive replies to our
requests immediately.
RabbitMQ seems well suited for many use cases, but implementing a job routing
mechanism between heterogenous workers would be complicated -- we would probably
have to create a separate load balancing service, which cancels the advantage of
a message broker already being provided by the framework. It is also written in
Erlang, which nobody from our team understands.
ZeroMQ is the best option for us, even with the drawback of having to implement
a load balancer ourselves (which could also be seen as a benefit and there is a
notable chance we would have to do the same with RabbitMQ). It also gives us
complete control over the transmitted messages and communication patterns.
However, all of the three options would have been possible to use.
### File Transfers
There has to be a way to access files stored on the fileserver (and also upload
them )from both worker and frontend server machines. The protocol used for this
should handle large files efficiently and be resilient to network failures.
Security features are not a primary concern, because all communication with the
fileserver will happen in an internal network. However, a basic form of
authentication can be useful to ensure correct configuration (if a development
fileserver uses different credentials than production, production workers will
not be able to use it by accident). Lastly, the protocol must have a client
library for platforms (languages) used in the backend. We will present some of
the possible options:
- HTTP(S) -- a de-facto standard for web communication that has far more
features than just file transfers. Thanks to being used on the web, a large
effort has been put into the development of its servers. It supports
authentication and it can handle short-term network failures (thanks to being
built on TCP and supporting resuming interrupted transfers). We will use HTTP
for communication with clients, so there is no added cost in maintaining a
server. HTTP requests can be made using libcurl.
- FTP -- an old protocol designed only for transferring files. It has all the
required features, but doesn't offer anything over HTTP. It is also supported
by libcurl.
- SFTP -- a file transfer protocol most frequently used as a subsystem of the
SSH protocol implementations. It doesn't provide authentication, but it
supports working with large files and resuming failed transfers. The libcurl
library supports SFTP.
- A network-shared file system (such as NFS) -- an obvious advantage of a
network-shared file system is that applications can work with remote files the
same way they would with local files. However, it brings an overhead for the
administrator, who has to configure access to this filesystem for every
machine that needs to access the storage.
- A custom protocol over ZeroMQ -- it is possible to design a custom file
transfer protocol that uses ZeroMQ for sending data, but it is not a trivial
task -- we would have to find a way to transfer large files efficiently,
implement an acknowledgement scheme and support resuming transfers. Using
ZeroMQ as the underlying layer does not help a lot with this. The sole
advantage of this is that the backend components would not need another
library for communication.
We chose HTTPS because it is widely used and client libraries exist in all
relevant environments. In addition, it is highly probable we will have to run an
HTTP server, because it is intended for ReCodEx to have a web frontend.
### Frontend - Backend Communication
Our choices when considering how clients will communicate with the backend have
to stem from the fact that ReCodEx should primarily be a web application. This
rules out ZeroMQ -- while it is very useful for asynchronous communication
between backend components, it is practically impossible to use it from a web
browser. There are several other options:
- *WebSockets* -- The WebSocket standard is built on top of TCP. It enables a
web browser to connect to a server over a TCP socket. WebSockets are
implemented in recent versions of all modern web browsers and there are
libraries for several programming languages like Python or JavaScript (running
in Node.js). Encryption of the communication over a WebSocket is supported as
a standard.
- *HTTP protocol* -- The HTTP protocol is a state-less protocol implemented on
top of the TCP protocol. The communication between the client and server
consists of a requests sent by the client and responses to these requests sent
back by the sever. The client can send as many requests as needed and it may
ignore the responses from the server, but the server must respond only to the
requests of the client and it cannot initiate communication on its own.
End-to-end encryption can be achieved easily using SSL (HTTPS).
We chose the HTTP(S) protocol because of the simple implementation in all sorts
of operating systems and runtime environments on both the client and the server
side.
The API of the server should expose basic CRUD (Create, Read, Update, Delete)
operations. There are some options on what kind of messages to send over the
HTTP:
- SOAP -- a protocol for exchanging XML messages. It is very robust and complex.
- REST -- is a stateless architecture style, not a protocol or a technology. It
relies on HTTP (but not necessarily) and its method verbs (e.g., GET, POST,
PUT, DELETE). It can fully implement the CRUD operations.
Even though there are some other technologies we chose the REST style over the
HTTP protocol. It is widely used, there are many tools available for development
and testing, and it is understood by programmers so it should be easy for a new
developer with some experience in client-side applications to get to know with
the ReCodEx API and develop a client application.
A high level view of chosen communication protocols in ReCodEx can be seen in
following image. Red arrows mark connections through ZeroMQ sockets, blue mark
WebSockets communication and green arrows connect nodes that communicate through
HTTP(S).
![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)
### Job Configuration File
As discussed previously in 'Evaluation Unit Executed by ReCodEx' an evaluation
unit have form of a job which contains small tasks representing one piece of
work executed by worker. This implies that jobs have to be passed from the
frontend to the backend. The best option for this is to use some kind of
configuration file which represents job details. The configuration file should
be specified in the frontend and in the backend, namely worker, will be parsed
and executed.
There are many formats which can be used for configuration representation. The
considered ones are:
- *XML* -- broadly used general markup language which is flavoured with document
type definition (DTD) which can express and check XML file structure, so it
does not have to be checked within application. But XML with its tags can be
sometimes quite 'chatty' and extensive which is not desirable. And overally
XML with all its features and properties can be a bit heavy-weight.
- *JSON* -- a notation which was developed to represent javascript objects. As
such it is quite simple, there can be expressed only: key-value structures,
arrays and primitive values. Structure and hierarchy of data is solved by
braces and brackets.
- *INI* -- very simple configuration format which is able to represents only
key-value structures which can be grouped into sections. This is not enough
to represent a job and its tasks hierarchy.
- *YAML* -- format which is very similar to JSON with its capabilities. But with
small difference in structure and hirarchy of configuration which is solved
not with braces but with indentation. This means that YAML is easily readable
by both human and machine.
- *specific format* -- newly created format used just for job configuration.
Obvious drawback is non-existing parsers which would have to be written from
scratch.
Given previous list of different formats we decided to use YAML. There are
existing parsers for most of the programming languages and it is easy enough to
learn and understand. Another choice which make sense is JSON but at the end
YAML seemed to be better.
Job configuration including design and implementation notes is described in 'Job
configuration' appendix.
#### Task Types
From the low-level point of view there are only two types of tasks in the job.
First ones are doing some internal operation which should work on all platforms
or operating systems the same way. Second type of tasks are external ones which
are executing external binary.
Internal tasks should handle at least these operations:
- *fetch* -- fetch single file from fileserver
- *copy* -- copy file between directories
- *remove* -- remove single file or folder
- *extract* -- extract files from downloaded archive
These internal operations are essential but many more can be eventually
implemented.
External tasks executing external binary should be optionally runnable in
sandbox. But for security sake there is no reason to execute them outside of
sandbox. So all external tasks are executed within a general a configurable
sandbox. Configuration options for sandboxes will be called limits and there can
be specified for example time or memory limits.
#### Configuration File Content
Content of the configuration file can be divided in two parts, first concerns
about the job in general and its metadata, second one relates to the tasks and
their specification.
There is not much to express in general job metadata. There can be
identification of the job and some general options, like enable/disable logging.
But really necessary item is address of the fileserver from where supplementary
files are downloaded. This option is crucial because there can be more
fileservers and the worker have no other way how to figure out where the files
might be.
More interesting situation is about the metadata of tasks. From the initial
analysis of evaluation unit and its structure there are derived at least these
generally needed items:
- *task identification* -- identificator used at least for specifying
dependencies
- *type* -- as described before, one of: 'initiation', 'execution', 'evaluation'
or 'inner'
- *priority* -- priority can additionally control execution flow in task graph
- *dependencies* -- necessary item for constructing hierarchy of tasks into DAG
- *execution command* -- command which should be executed withing this tasks
with possible parameters parameters
Previous list of items is applicable both for internal and external tasks.
Internal tasks do not need any more items but external do. Additional items are
exclusively related to sandboxing and limitation:
- *sandbox name* -- there should be possibility to have multiple sandboxes, so
identification of the right one is needed
- *limits* -- hardware and software resources limitations
- *time limit* -- limits time of execution
- *memory limit* -- maximum memory which can be consumed by external program
- *I/O operations* -- limitation concerning disk operations
- *restrict filesystem* -- restrict or enable access to directories
#### Supplementary Files
Interesting problem arise with supplementary files (e.g., inputs, sample
outputs). There are two main ways which can be observed. Supplementary files can
be downloaded either on the start of the execution or during the execution.
If the files are downloaded at the beginning, the execution does not really
started at this point and thus if there are problems with network, worker will
find it right away and can abort execution without executing a single task.
Slight problems can arise if some of the files needs to have specific name (e.g.
solution assumes that the input is `input.txt`). In this scenario the downloaded
files cannot be renamed at the beginning but during the execution which is
impractical and not easily observed by the authors of job configurations.
Second solution of this problem when files are downloaded on the fly has quite
opposite problem. If there are problems with network, worker will find it during
execution when for instance almost whole execution is done. This is also not
ideal solution if we care about burnt hardware resources. On the other hand
using this approach users have advanced control of the execution flow and know
what files exactly are available during execution which is from users
perspective probably more appealing then the first solution. Based on that,
downloading of supplementary files using 'fetch' tasks during execution was
chosen and implemented.
#### Job Variables
Considering the fact that jobs can be executed within the worker on different
machines with specific settings, it can be handy to have some kind of mechanism
in the job configuration which will hide these particular worker details, most
notably specific directory structure. For this purpose marks or signs can be
used and can have a form of broadly used variables.
Variables in general can be used everywhere where configuration values (not
keys) are expected. This implies that substitution should be done after parsing
of job configuration, not before. The only usage for variables which was
considered is for directories within worker, but in future this might be subject
to change.
Final form of variables is `${...}` where triple dot is textual description.
This format was used because of special dollar sign character which cannot be
used within paths of regular filesystems. Braces are there only to border
textual description of variable.
### Broker
The broker is responsible for keeping track of available workers and
distributing jobs that it receives from the frontend between them.
#### Worker Management
It is intended for the broker to be a fixed part of the backend infrastructure
to which workers connect at will. Thanks to this design, workers can be added
and removed when necessary (and possibly in an automated fashion), without
changing the configuration of the broker. An alternative solution would be
configuring a list of workers before startup, thus making them passive in the
communication (in the sense that they just wait for incoming jobs instead of
connecting to the broker). However, this approach comes with a notable
administration overhead -- in addition to starting a worker, the administrator
would have to update the worker list.
Worker management must also take into account the possibility of worker
disconnection, either because of a network or software failure (or termination).
A common way to detect such events in distributed systems is to periodically
send short messages to other nodes and expect a response. When these messages
stop arriving, we presume that the other node encountered a failure. Both the
broker and workers can be made responsible for initiating these exchanges and it
seems that there are no differences stemming from this choice. We decided that
the workers will be the active party that initiates the exchange.
#### Scheduling
Jobs should be scheduled in a way that ensures that they will be processed
without unnecessary waiting. This depends on the fairness of the scheduling
algorithm (no worker machine should be overloaded).
The design of such scheduling algorithm is complicated by the requirements on
the diversity of workers -- they can differ in operating systems, available
software, computing power and many other aspects.
We decided to keep the details of connected workers hidden from the frontend,
which should lead to a better separation of responsibilities and flexibility.
Therefore, the frontend needs a way of communicating its requirements on the
machine that processes a job without knowing anything about the available
workers. A key-value structure is suitable for representing such requirements.
With respect to these constraints, and because the analysis and design of a more
sophisticated solution was declared out of scope of our project assignment, a
rather simple scheduling algorithm was chosen. The broker shall maintain a queue
of available workers. When assigning a job, it traverses this queue and chooses
the first machine that matches the requirements of the job. This machine is then
moved to the end of the queue.
Presented algorithm results in a simple round-robin load balancing strategy,
which should be sufficient for small-scale deployments (such as a single
university). However, with a large amount of jobs, some workers will easily
become overloaded. The implementation must allow for a simple replacement of the
load balancing strategy so that this problem can be solved in the near future.
#### Forwarding Jobs
Information about a job can be divided in two disjoint parts -- what the worker
needs to know to process it and what the broker needs to forward it to the
correct worker. It remains to be decided how this information will be
transferred to its destination.
It is technically possible to transfer all the data required by the worker at
once through the broker. This package could contain submitted files, test
data, requirements on the worker, etc. A drawback of this solution is that
both submitted files and test data can be rather large. Furthermore, it is
likely that test data would be transferred many times.
Because of these facts, we decided to store data required by the worker using a
shared storage space and only send a link to this data through the broker. This
approach leads to a more efficient network and resource utilization (the broker
doesn't have to process data that it doesn't need), but also makes the job
submission flow more complicated.
#### Further Requirements
The broker can be viewed as a central point of the backend. While it has only
two primary, closely related responsibilities, other requirements have arisen
(forwarding messages about job evaluation progress back to the frontend) and
will arise in the future. To facilitate such requirements, its architecture
should allow simply adding new communication flows. It should also be as
asynchronous as possible to enable efficient communication with external
services, for example via HTTP.
### Worker
Worker is a component which is supposed to execute incoming jobs from broker. As
such worker should work and support wide range of different infrastructures and
maybe even platforms/operating systems. Support of at least two main operating
systems is desirable and should be implemented.
Worker as a service does not have to be very complicated, but a bit of complex
behaviour is needed. Mentioned complexity is almost exclusively concerned about
robust communication with broker which has to be regularly checked. Ping
mechanism is usually used for this in all kind of projects. This means that the
worker should be able to send ping messages even during execution. So worker has
to be divided into two separate parts, the one which will handle communication
with broker and the another which will execute jobs.
The easiest solution is to have these parts in separate threads which somehow
tightly communicates with each other. For inter process communication there can
be used numerous technologies, from shared memory to condition variables or some
kind of in-process messages. The ZeroMQ library which we already use provides
in-process messages that work on the same principles as network communication,
which is convenient and solves problems with thread synchronization.
#### Execution of Jobs
At this point we have worker with two internal parts listening one and execution
one. Implementation of first one is quite straightforward and clear. So let us
discuss what should be happening in execution subsystem.
After successful arrival of the job from broker to the listening thread, the job
is immediatelly redirected to execution thread. In there worker has to prepare
new execution environment, solution archive has to be downloaded from fileserver
and extracted. Job configuration is located within these files and loaded into
internal structures and executed. After that, results are uploaded back to
fileserver. These steps are the basic ones which are really necessary for whole
execution and have to be executed in this precise order.
The evaluation unit executed by ReCodEx and job configuration were already
discussed above. The conclusion was that jobs containing small tasks will be
used. Particular format of the actual job configuration can be found in 'Job
configuration' appendix. Implementation of parsing and storing these data in
worker is then quite straightforward.
Worker has internal structures to which loads and which stores metadata given in
configuration. Whole job is mapped to job metadata structure and tasks are
mapped to either external ones or internal ones (internal commands has to be
defined within worker), both are different whether they are executed in sandbox
or as an internal worker commands.
#### Task Execution Failure
Another division of tasks is by task-type field in configuration. This field can
have four values: initiation, execution, evaluation and inner. All was discussed
and described above in evaluation unit analysis. What is important to worker is
how to behave if execution of task with some particular type fails.
There are two possible situations execution fails due to bad user solution or
due to some internal error. If execution fails on internal error solution cannot
be declared overly as failed. User should not be punished for bad configuration
or some network error. This is where task types are useful.
Initiation, execution and evaluation are tasks which are usually executing code
which was given by users who submitted solution of exercise. If this kinds of
tasks fail it is probably connected with bad user solution and can be evaluated.
But if some inner task fails solution should be re-executed, in best case
scenario on different worker. That is why if inner task fails it is sent back to
broker which will reassign job to another worker. More on this subject should be
discussed in broker assigning algorithms section.
#### Job Working Directories
There is also question about working directory or directories of job, which
directories should be used and what for. There is one simple answer on this
every job will have only one specified directory which will contain every file
with which worker will work in the scope of whole job execution. This solution
is easy but fails due to logical and security reasons.
The least which must be done are two folders one for internal temporary files
and second one for evaluation. The directory for temporary files is enough to
comprehend all kind of internal work with filesystem but only one directory for
whole evaluation is somehow not enough.
The solution which was chosen at the end is to have folders for downloaded
archive, decompressed solution, evaluation directory in which user solution is
executed and then folders for temporary files and for results and generally
files which should be uploaded back to fileserver with solution results.
There has to be also hierarchy which separate folders from different workers on
the same machines. That is why paths to directories are in format:
`${DEFAULT}/${FOLDER}/${WORKER_ID}/${JOB_ID}` where default means default
working directory of whole worker, folder is particular directory for some
purpose (archives, evaluation, ...).
Mentioned division of job directories proved to be flexible and detailed enough,
everything is in logical units and where it is supposed to be which means that
searching through this system should be easy. In addition if solutions of users
have access only to evaluation directory then they do not have access to
unnecessary files which is better for overall security of whole ReCodEx.
### Sandboxing
There are numerous ways how to approach sandboxing on different platforms,
describing all possible approaches is out of scope of this document. Instead of
that have a look at some of the features which are certainly needed for ReCodEx
and propose some particular sandboxes implementations on Linux or Windows.
General purpose of sandbox is safely execute software in any form, from scripts
to binaries. Various sandboxes differ in how safely are they and what limiting
features they have. Ideal situation is that sandbox will have numerous options
and corresponding features which will allow administrators to setup environment
as they like and which will not allow user programs to somehow damage executing
machine in any way possible.
For ReCodEx and its evaluation there is need for at least these features:
execution time and memory limitation, disk operations limit, disk accessibility
restrictions and network restrictions. All these features if combined and
implemented well are giving pretty safe sandbox which can be used for all kinds
of users solutions and should be able to restrict and stop any standard way of
attacks or errors.
#### Linux
Linux systems have quite extent support of sandboxing in kernel, there were
introduced and implemented kernel namespaces and cgroups which combined can
limit hardware resources (cpu, memory) and separate executing program into its
own namespace (pid, network). These two features comply sandbox requirement for
ReCodEx so there were two options, either find existing solution or implement
new one. Luckily existing solution was found and its name is **isolate**.
Isolate does not use all possible kernel features but only subset which is still
enough to be used by ReCodEx.
#### Windows
The opposite situation is in Windows world, there is limited support in its
kernel which makes sandboxing a bit trickier. Windows kernel only has ways how
to restrict privileges of a process through restriction of internal access
tokens. Monitoring of hardware resources is not possible but used resources can
be obtained through newly created job objects.
There are numerous sandboxes for Windows but they all are focused on different
things in a lot of cases they serves as safe environment for malicious programs,
viruses in particular. Or they are designed as a separate filesystem namespace
for installing a lot of temporarily used programs. From all these we can
mention: Sandboxie, Comodo Internet Security, Cuckoo sandbox and many others.
None of these is fitted as sandbox solution for ReCodEx. With this being said we
can safely state that designing and implementing new general sandbox for Windows
is out of scope of this project.
But designing sandbox only for specific environment is possible, namely for C#
and .NET. CLR as a virtual machine and runtime environment has a pretty good
security support for restrictions and separation which is also transferred to
C#. This makes it quite easy to implement simple sandbox within C# but there are
not any well known general purpose implementations.
As mentioned in previous paragraphs implementing our own solution is out of
scope of project. But C# sandbox is quite good topic for another project for
example term project for C# course so it might be written and integrated in
future.
### Fileserver
The fileserver provides access over HTTP to a shared storage space that contains
files submitted by students, supplementary files such as test inputs and outputs
and results of evaluation. In other words, it acts as an intermediate storage
node for data passed between the frontend and the backend. This functionality
can be easily separated from the rest of the backend features, which led to
designing the fileserver as a standalone component. Such design helps
encapsulate the details of how the files are stored (e.g. on a file system, in a
database or using a cloud storage service), while also making it possible to
share the storage between multiple ReCodEx frontends.
For early releases of the system, we chose to store all files on the file system
-- it is the least complicated solution (in terms of implementation complexity)
and the storage backend can be rather easily migrated to a different technology.
One of the facts we learned from CodEx is that many exercises share test input
and output files, and also that these files can be rather large (hundreds of
megabytes). A direct consequence of this is that we cannot add these files to
submission archives that are to be downloaded by workers -- the combined size of
the archives would quickly exceed gigabytes, which is impractical. Another
conclusion we made is that a way to deal with duplicate files must be
introduced.
A simple solution to this problem is storing supplementary files under the
hashes of their content. This ensures that every file is stored only once. On
the other hand, it makes it more difficult to understand what the content of a
file is at a glance, which might prove problematic for the administrator.
However, human-readable identification is not as important as removing
duplicates -- administrators rarely need to inspect stored files (and when they
do, they should know their hashes), but duplicate files occupied a large part of
the disk space used by CodEx.
A notable part of the work of the fileserver is done by a web server (e.g.
listening to HTTP requests and caching recently accessed files in memory for
faster access). What remains to be implemented is handling requests that upload
files -- student submissions should be stored in archives to facilitate simple
downloading and supplementary exercise files need to be stored under their
hashes.
We decided to use Python and the Flask web framework. This combination makes it
possible to express the logic in ~100 SLOC and also provides means to run the
fileserver as a standalone service (without a web server), which is useful for
development.
### Cleaner
Worker can use caching mechanism based on files from fileserver under one
condition, provided files has to have unique name. This means there has to be
system which can download file, store it in cache and after some time of
inactivity delete it. Because there can be multiple worker instances on some
particular server it is not efficient to have this system in every worker on its
own. So it is feasible to have this feature somehow shared among all workers on
the same machine.
Solution may be again having separate service connected through network with
workers which would provide such functionality, but this would mean component
with another communication for the purpose, where it is not exactly needed. But
mainly it would be single-failure component. If it would stop working then it is
quite a problem.
So there was chosen another solution which assumes worker has access to
specified cache folder. In there folder worker can download supplementary files
and copy them from here. This means every worker has the possibility to maintain
downloads to cache, but what is worker not able to properly do, is deletion of
unused files after some time.
#### Architecture
For that functionality single-purpose component is introduced which is called
'cleaner'. It is simple script executed within cron which is able to delete
files which were unused for some time. Together with worker fetching feature
cleaner completes particular server specific caching system.
Cleaner as mentioned is simple script which is executed regularly as a cron job.
If there is caching system like it was introduced in paragraph above there are
little possibilities how cleaner should be implemented.
On various filesystems there is usually support for two particular timestamps,
`last access time` and `last modification time`. Files in cache are once
downloaded and then just copied, this means that last modification time is set
only once on creation of file and last access time should be set every time on
copy. From this we can conclude that last access time is what is needed here.
But unlike last modification time, last access time is not usually enabled on
conventional filesystems (more on this subject can be found
[here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime)).
So if we choose to use last access time, filesystem used for cache folder has to
have last access time for files enabled. Last access time was chosen for
implementation in ReCodEx but this might change in further releases.
However, there is another way, last modification time which is broadly supported
can be used. But this solution is not automatic and worker would have to 'touch'
cache files whenever they are accessed. This solution is maybe a bit better than
the one with last access time and might be implemented in future releases.
#### Caching Flow
Having cleaner as separated component and caching itself handled in worker is
kind of blurry and is not clearly observable that it works without problems.
The goal is to have system which can recover from every kind of errors.
Follows description of one possible implementation. This whole mechanism relies
on worker ability to recover from internal fetch task failure. In case of error
here job will be reassigned to another worker where problem hopefully does not
arise.
First start with worker implementation:
- worker discovers fetch task which should download supplementary file
- worker takes name of file and tries to copy it from cache folder to its
working folder
- if successful then last access time should be rewritten (by filesystem
itself) and whole operation is done
- if not successful then file has to be downloaded
- file is downloaded from fileserver to working folder and then
copied to cache
Previous implementation is only within worker, cleaner can anytime intervene and
delete files. Implementation in cleaner follows:
- cleaner on its start stores current reference timestamp which will be used for
comparison and load configuration values of caching folder and maximal file
age
- there is a loop going through all files and even directories in specified
cache folder
- if difference between last access time and reference timestamp is greater
than specified maximal file age, then file or folder is deleted
Previous description implies that there is gap between detection of last access
time and deleting file within cleaner. In the gap there can be worker which will
access file and the file is anyway deleted but this is fine, file is deleted but
worker has it copied. If worker does not copy whole file or even do not start to
copy it and the file is deleted then copy process will fail. This will cause
internal task failure which will be handled by reassigning job to another
worker.
Another problem can be with two workers downloading the same file, but this is
also not a problem, file is firstly downloaded to working folder and after that
copied to cache.
And even if something else unexpectedly fails and because of that fetch task
will fail during execution, even that should be fine as mentioned previously.
Reassigning of job should be the last salvation in case everything else goes
wrong.
### Monitor
Users want to view real time evaluation progress of their solution. It can be
easily done with established double-sided connection stream, but it is hard to
achieve with plain HTTP. HTTP itself works on a separate request basis with no
long term connection. The HTML5 specification contains Server-Sent Events - a
means of sending text messages unidirectionally from an HTTP server to a
subscribed website. Sadly, it is not supported in Internet Explorer and Edge.
However, there is another widely used technology that can solve this problem --
the WebSocket protocol. It is more general than necessary (it enables
bidirectional communication) and requires additional web server configuration,
but it is supported in recent versions of all major web browsers.
Working with the WebSocket protocol from the backend is possible, but not ideal
from the design point of view. Backend should be hidden from public internet to
minimize surface for possible attacks. With this in mind, there are two possible
options:
- send progress messages through the API
- make a separate component that forwards progress messages to clients
Both of the two possibilities have their benefits and drawbacks. The first one
requires no additional component and the API is already publicly visible. On the
other side, working with WebSockets from PHP is complicated (but it is possible
with the help of third-party libraries) and embedding this functionality into
API is not extendable. The second approach is better for future changing the
protocol or implementing extensions like caching of messages. Also, the progress
feature is considered only optional, because there may be clients for which this
feature is useless. Major drawback of separate component is another part, which
needs to be publicly exposed.
We decided to make a separate component, mainly because it is smaller component
with only one role, better maintainability and optional demands for progress
callback.
There are several possibilities how to write the component. Notably, considered
options were already used languages C++, PHP, JavaScript and Python. At the end,
the Python language was chosen for its simplicity, great support for all used
technologies and also there are free Python developers in out team.
### API Server
The API server must handle HTTP requests and manage the state of the application
in some kind of a database. The API server will be a RESTful service and will
return data encoded as JSON documents. It must also be able to communicate with
the backend over ZeroMQ.
We considered several technologies which could be used:
- PHP + Apache -- one of the most widely used technologies for creating web
servers. It is a suitable technology for this kind of a project. It has all
the features we need when some additional extensions are installed (to support
LDAP or ZeroMQ).
- Ruby on Rails, Python (Django), etc. -- popular web technologies that appeared
in the last decade. Both support ZeroMQ and LDAP via extensions and have large
developer communities.
- ASP.NET (C#), JSP (Java) -- these technologies are very robust and are used to
create server technologies in many big enterprises. Both can run on Windows
and Linux servers (ASP.NET using the .NET Core).
- JavaScript (Node.js) -- it is a quite new technology and it is being used to
create REST APIs lately. Applications running on Node.js are quite performant
and the number of open-source libraries available on the Internet is very
huge.
We chose PHP and Apache mainly because we were familiar with these technologies
and we were able to develop all the features we needed without learning to use a
new technology. Since the number of features was quite high and needed to meet a
strict deadline. This does not mean that we would find all the other
technologies superior to PHP in all other aspects - PHP 7 is a mature language
with a huge community and a wide range of tools, libraries, and frameworks.
We decided to use an ORM framework to manage the database, namely the widely
used PHP ORM Doctrine 2. Using an ORM tool means we do not have to write SQL
queries by hand. Instead, we work with persistent objects, which provides a
higher level of abstraction. Doctrine also has a robust database abstraction
layer so the database engine is not very important and it can be changed without
any need for changing the code. MariaDB was chosen as the storage backend.
To speed up the development process of the PHP server application we decided to
use a web framework. After evaluating and trying several frameworks, such as
Lumen, Laravel, and Symfony, we ended up using Nette.
- **Lumen** and **Laravel** seemed promissing but the default ORM framework
Eloquent is an implementation of ActiveRecord which we wanted to avoid. It
was also suprisingly complicated to implement custom middleware for validation
of access tokens in the headers of incoming HTTP requests.
- **Symfony** is a very good framework and has Doctrine "built-in". The reason
why we did not use Symfony in the end was our lack of experience with this
framework.
- **Nette framework** is very popular in the Czech Republic -- its lead
developer is a well-known Czech programmer David Grudl. We were already
familiar with the patterns used in this framework, such as dependency
injection, authentication, routing. These concepts are useful even when
developing a REST application which might be a surprise considering that
Nette focuses on "traditional" web applications.
Nette is inspired by Symfony and many of the Symfony bundles are available
as components or extensions for Nette. There is for example a Nette
extension which makes integration of Doctrine 2 very straightforward.
#### Architecture of The System
The Nette framework is an MVP (Model, View, Presenter) framework. It has many
tools for creating complex websites and we need only a subset of them or we use
different libraries which suite our purposes better:
- **Model** - the model layer is implemented using the Doctrine 2 ORM instead of
Nette Database
- **View** - the whole view layer of the Nette framework (e.g., the Latte engine
used for HTML template rendering) is unnecessary since we will return all the
responses encoded in JSON. JSON is a common format used in APIs and we decided
to prefer it to XML or a custom format.
- **Presenter** - the whole lifecycle of a request processing of the Nette
framework is used. The Presenters are used to group the logic of the individual
API endpoints. The routing mechanism is modified to distinguish the actions by
both the URL and the HTTP method of the request.
#### Authentication
To make certain data and actions acessible only for some specific users, there
must be a way how these users can prove their identity. We decided to avoid PHP
sessions to make the server stateless (session ID is stored in the cookies of
the HTTP requests and responses). The server issues a specific token for the
user after his/her identity is verified (i.e., by providing email and password)
and sent to the client in the body of the HTTP response. The client must
remember this token and attach it to every following request in the
*Authorization* header.
The token must be valid only for a certain time period ("log out" the user after
a few hours of inactivity) and it must be protected against abuse (e.g., an
attacker must not be able to issue a token which will be considered valid by the
system and using which the attacker could pretend to be a different user). We
decided to use the JWT standard (the JWS).
The JWT is a base64-encoded string which contains three JSON documents - a
header, some payload, and a signature. The interesting parts are the payload and
the signature: the payload can contain any data which can identify the user and
metadata of the token (i.e., the time when the token was issued, the time of
expiration). The last part is a digital signature contains a digital signature
of the header and payload and it ensures that nobody can issue their own token
and steal the identity of someone. Both of these characteristics give us the
opportunity to validate the token without storing all of the tokens in the
database.
To implement JWT in Nette, we have to implement some of its security-related
interfaces such as IAuthenticator and IUserStorage, which is rather easy thanks
to the simple authentication flow. Replacing these services in a Nette
application is also straightforward, thanks to its dependency injection
container implementation. The encoding and decoding of the tokens itself
including generating the signature and signature verification is done through a
widely used third-party library which lowers the risk of having a bug in the
implementation of this critical security feature.
##### Backend Monitoring
The next thing related to communication with the backend is monitoring its
current state. This concerns namely which workers are available for processing
different hardware groups and which languages can be therefore used in
exercises.
Another step would be the overall backend state like how many jobs were
processed by some particular worker, workload of the broker and the workers,
etc. The easiest solution is to manage this information by hand, every instance
of the API server has to have an administrator which would have to fill them.
This includes only the currently available workers and runtime
environments which does not change very often. The real-time statistics of the
backend cannot be made accessible this way in a reasonable way.
A better solution is to update this information automatically. This can be
done in two ways:
- It can be provided by the backend on-demand if API needs it
- The backend will send these information periodically to the API.
Things like currently available workers or runtime environments are better to be
really up-to-date so this could be provided on-demand if needed. Backend
statistics are not that necessary and could be updated periodically.
However due to the lack of time automatic monitoring of the backend state will
not be implemented in the early versions of this project but might be
implemented in some of the next releases.
### Web Application
The web application ("WebApp") is one of the possible client applications of the
ReCodEx system. Creating a web application as the first client application has
several advantages:
- no installation or setup is required on the device of the user
- works on all platforms including mobile devices
- when a new version is released, all the clients will use this version without
any need for manual instalation of the update
One of the downsides is the large number of different web browsers (including
the older versions of a specific browser) and their different interpretation
of the code (HTML, CSS, JS). Some features of the latest specifications of HTML5
are implemented in some browsers which are used by a subset of the Internet
users. This has to be taken into account when choosing appropriate tools
for implementation of a website.
There are two basic ways how to create a website these days:
- **server-side approach** - the actions of the user are processed on the server
and the HTML code with the results of the action is generated on the server
and sent back to the web browser of the user. The client does not handle any
logic (apart from rendering of the user interface and some basic user
interaction) and is therefore very simple. The server can use the API server
for processing of the actions so the business logic of the server can be very
simple as well. A disadvantage of this approach is that a lot of redundant
data is transferred across the requests although some parts of the content can
be cached (e.g., CSS files). This results in longer loading times of the
website.
- **server-side rendering with asynchronous updates (AJAX)** - a slightly
different approach is to render the page on the server as in the previous case
but then execute the actions of the user asynchronously using the
`XMLHttpRequest` JavaScript functionality. Which creates a HTTP request and
transfers only the part of the website which will be updated.
- **client-side approach** - the opposite approach is to transfer the
communication with the API server and the rendering of the HTML completely
from the server directly to the client. The client runs the code (usually
JavaScript) in his/her web browser and the content of the website is generated
based on the data received from the API server. The script file is usually
quite large but it can be cached and does not have to be downloaded from the
server again (until the cached file expires). Only the data from the API
server needs to be transferred over the Internet and thus reduce the volume of
payload on each request which leads to a much more responsive user experience,
especially on slower networks. Since the client-side code has full control
over the UI and a more sophisticated user interactions with the UI can be
achieved.
All of these are used in production by the web developers and all
of them are well documented and there are mature tools for creating websites
using any of these approaches.
We decided to use the third approach -- to create a fully client-side
application which would be familiar and intuitive for a user who is used to
modern web applications.
#### Used Technologies
We examined several frameworks which are commonly used to speed up the
development of a web application. There are several open source options
available with a large number of tools, tutorials, and libraries. From the many
options (Backbone, Ember, Vue, Cycle.js, ...) there are two main frameworks
worth considering:
- **Angular 2** - it is a new framework which was developed by Google. This
framework is very complex and provides the developer with many tools which
make creating a website very straigtforward. The code can be written in pure
JavaScript (ES5) or using the TypeScript language which is then transpiled
into JavaScript. Creating a web application in Angular 2 is based on creating
and composing components. The previous version of Angular is not compatible
with this new version.
- **React and Redux** - [React](https://facebook.github.io/react) is a fast
library for rendering of the user interface developed by Facebook. It is based
on components composition as well. A React application is usually written in
EcmaScript 6 and the JSX syntax for defining the component tree. This code is
usually transpiled to JavaScript (ES5) using some kind of a transpiler like
Babel. [Redux](http://redux.js.org/) is a library for managing the state of
the application and it implements a modification of the so-called Flux
architecture introduced by Facebook. React and Redux are being used for a
longer time than Angular 2 and both are still actively developed. There are
many open-source components and addons available for both React and Redux.
We decided to use React and Redux over Angular 2 for several reasons:
- There is a large community arround these libraries and there is a large number
of tutorials, libraries, and other resources available online.
- Many of the web frontend developers are familiar with React and Redux and
contributing to the project should be easy for them.
- A stable version of Angular 2 was still not released at the time we started
developing the web application.
- We had previous experience with React and Redux and Angular 2 did not bring
any significant improvements and features over React so it would not be worth
learning the paradigms of a new framework.
- It is easy to debug React component tree and Redux state transitions
using extensions for Google Chrome and Firefox.
#### User Interface Design
There is no artist on the team so we had to come up with an idea how to create a
visually appealing application with this handicap. User interfaces created by
programmers are notoriously ugly and unintuitive. Luckily we found the
[AdminLTE](https://almsaeedstudio.com/) theme by Abdullah Almsaeed which is
built on top of the [Bootstrap framework](http://getbootstrap.com/) by Twitter.
This is a great combination because there is an open-source implementation of
the Bootstrap components for React and with the stylesheets from AdminLTE the
application looks good and is distingushable form the many websites using the
Bootstrap framework with very little work.
# User Documentation
Users interact with the ReCodEx through the web application. It is required to
use a modern web browser with good HTML5 and CSS3 support. Among others, cookies
and local storage are used. Also a decent JavaScript runtime must be provided by
the browser.
Supported and tested browsers are: Firefox 50+, Chrome 55+, Opera 42+ and Edge
13+. Mobile devices often have problems with internationalization and possibly
lack support for some common features of desktop browsers. In this stage of
development is not possible for us to fine tune the interface for major mobile
browsers on all mobile platforms. However, it is confirmed to work with latest
Google Chrome and Gello browser on Android 7.1+. Issues have been reported with
Firefox that will be fixed in the future. Also, it is confirmed to work with
Safari browser on iOS 10.
Usage of the web application is divided into sections concerning particular user
roles. Under these sections all possible use cases can be found. These sections
are inclusive, so more privileged users need to read instructions for all less
privileged users. Described roles are:
- Student
- Group supervisor
- Group administrator
- Instance administrator
- Superadministrator
## Terminology
**Instance** -- Represents a university, company or some other organization
unit. Multiple instances can exist in a single ReCodEx installation.
**Group** -- A group of students to which exercises are assigned by a
supervisor. It should typically correspond with a real world lab group.
**User** -- A person that interacts with the system using the web interface (or
an alternative client).
**Student** -- A user with least privileges who is subscribed to some groups and
submits solutions to exercise assignments.
**Supervisor** -- A person responsible for assigning exercises to a group and
reviewing submissions.
**Admin** -- A person responsible for the maintenance of the system and fixing
problems supervisors cannot solve.
**Exercise** -- An algorithmic problem that can be assigned to a group. They
can be shared by the teachers using an exercise database in ReCodEx.
**Assignment** -- An exercise assigned to a group, possibly with modifications.
**Runtime environment** -- Runtime environment is unique combination of platform
(OS) and programming language runtime/compiler in specific version. Runtime
environments are managed by the administrators to reflect abilities of whole
system.
**Hardware group** -- Hardware group is a set of workers with similar hardware.
Its purpose is to group workers that are likely to run a program using the same
amount of resources. Hardware groups are managed byt the system administrators
who have to keep them up-to-date.
## General Basics
Description of general basics which are the same for all users of ReCodEx web
application follows.
### First Steps in ReCodEx
You can create an account by clicking the "Create account" menu item in the left
sidebar. You can choose between two types of registration methods -- by creating
a local account with a specific password, or pairing your new account with an
existing CAS UK account.
If you decide to create a new local account using the "Create ReCodEx account”
form, you will have to provide your details and choose a password for your
account. Although ReCodEx allows using quite weak passwords, it is wise to use a
bit stronger ones The actual strength is shown in progress bar near the password
field during registration. You will later sign in using your email address as
your username and the password you select.
If you decide to use the CAS UK service, then ReCodEx will verify your CAS
credentials and create a new account based on information stored there (name and
email address). You can change your personal information later on the
"Settings" page.
Regardless of the desired account type, an instance it will belong to must be
selected. The instance will be most likely your university or other organization
you are a member of.
To log in, go to the homepage of ReCodEx and in the left sidebar choose the menu
item "Sign in". Then you must enter your credentials into one of the two forms
-- if you selected a password during registration, then you should sign with
your email and password in the first form called "Sign into ReCodEx". If you
registered using the Charles University Authentication Service (CAS), you should
put your students number and your CAS password into the second form called
"Sign into ReCodEx using CAS UK".
There are several options you can edit in your user account:
- changing your personal information (i.e., name)
- changing your credentials (email and password)
- updating your preferences (source code viewer/editor settings, default
language)
You can access the settings page through the "Settings" button right under
your name in the left sidebar.
If you are not active in ReCodEx for a whole day, you will be logged out
automatically. However, we recommend you sign out of the application after you
finish your interaction with it. The logout button is placed in the top section
of the left sidebar right under your name. You may need to expand the sidebar
with a button next to the "ReCodEx” title (informally known as _hamburger
button_), depending on your screen size.
### Forgotten Password
If you cannot remember your password and you do not use CAS UK authentication,
then you can reset your password. You will find a link saying "Cannot remember
what your password was? Reset your password." under the sign in form. After you
click this link, you will be asked to submit your registration email address. A
message with a link containing a special token will be sent to you by e-mail --
we make sure that the person who requested password resetting is really you.
When you visit the link, you will be able to enter a new password for your
account. The token is valid only for a couple of minutes, so do not forget to
reset the password as soon as possible, or you will have to request a new link
with a valid token.
If you sign in through CAS UK, then please follow the instructions
provided by the administrators of the service described on their
website.
### Dashboard
When you log into the system you should be redirected to your "Dashboard". On
this page you can see some brief information about the groups you are member of.
The information presented there varies with your role in the system -- further
description of dashboard will be provided later on with according roles.
## Student
Student is a default role for every newly registered user. This role has quite
limited capabilites in ReCodEx. Generally, a student can only submit solutions
of exercises in some particular groups. These groups should correspond to
courses he/she attends.
On the "Dashboard" page there is "Groups you are student of" section where you
can find list of your student groups. In first column of every row there is a
brief panel describing concerning group. There is name of the group and
percentage of gained points from course. If you have enough points to
successfully complete the course then this panel has green background with tick
sign. In the second column there is a list of assigned exercises with its
deadlines. If you want to quickly get to the groups page you might want to use
provided "Show group's detail" button.
### Join Group
To be able to submit solutions you have to be a member of the right group. Each
instance has its own group hierarchy, so you can choose only those within your
instance. That is why a list of groups is available from under an instance link
located in the sidebar. This link brings you to instance detail page.
In there you can see a description of the instance and most importantly in
"Groups hierarchy" box there is a hierarchical list of all public groups in the
instance. Please note that groups with plus sign are collapsible and can be
further extended. When you find a group you would like to join, continue by
clicking on "See group's page" link following with "Join group" link.
**Note:** Some groups can be marked as private and these groups are not visible
in hierarchy and membership cannot be established by students themselves.
Management of students in this type of groups is in the hands of supervisors.
On the group detail page there are multiple interesting things for you. The
first one is a brief overview containing the information describing the group,
there is a list of supervisors and also the hierarchy of the subgroups. The most
important section is the "Student's dashboard" section. This section contains
the list of assignments and the list of fellow students. If the supervisors of
the group allowed students to see the statistic of their fellow students then
there will also be the number of points each of the students has gained so far.
### Start Solving Assignments
In the "Assignments" box on the group detail page there is a list of assigned
exercises which students are supposed to solve. The assignments are displayed
with their names and deadlines. There are possibly two deadlines, the first one
means that till this datetime student will receive full amount of points in case
of successful solution. Second deadline does not have to be set, but in case it
is, the maximum number of points for successful solution between these two
deadlines can be different.
An assignment link will lead you to assignment detail page where are presented
all known details about assignment. There are of course both deadlines, limit of
submissions which you can make and also full-range description of assignment,
which can be localized. The localization can be on demand switched between all
language variants in tab like box.
Further on the page you can find "Submitted solutions" box where is a list of
submissions with links to result details. But most importantly there is a
"Submit new solution" button on the assignment page which provides an interface
to submit solution of the assignment.
After clicking on submit button, dialog window will show up. In here you can
upload files representing your solution, you can even add some notes to mark the
solution. Your supervisor can also access this note. After you successfully
upload all files necessary for your solution, click the "Submit your solution"
button and let ReCodEx evaluate the solution.
During the execution ReCodEx backend might send evaluation progress state to
your browser which will be displayed in another dialog window. When the whole
execution is finished then a "See the results" button will appear and you can
look at the results of your solution.
### View Results of Submission
On the results detail page there are a lot of information. Apart from assignment
description, which is not connected to your results, there is also the solution
submitter name (supervisor can submit a solution on your behalf), further there
are files which were uploaded on submission and most importantly "Evaluation
details" and "Test results" boxes.
Evaluation details contains overall results of your solution. There are
information such as whether the solution was provided before deadlines, if the
evaluation process successfully finished or if compilation succeeded. After that
you can find a lot of values, most important one is the last, "Total score",
consisting of your score, slash and the maximum number of points for this
assignment. Interestingly the your score value can be higher than the maximum,
which is caused by "Bonus points" item above. If your solution is nice and
supervisor notices it, he/she can assign you additional points for effort. On
the other hand, points can be also subtracted for bad coding habits or even
cheating.
In test results box there is a table of all exercise tests results. Columns
represents these information:
- test case overall result, symbol of yes/no option
- test case name
- percentage of correctness of this particular test
- evaluation status, if test was successfully executed or failed
- memory limit, if supervisor allowed it then percentual memory usage is
displayed
- time limit, if supervisor allowed it then percentual time usage is displayed
A new feature of web application is "Comments and notes" box where you can
communicate with your supervisors or just write random private notes to your
submission. Adding a note is quite simple, you just write it to text field in
the bottom of box and click on the "Send" button. The button with lock image
underneath can switch visibility of newly created comments.
In case you think the ReCodEx evaluation of your solution is wrong, please use
the comments system described above, or even better notify your supervisor by
another channel (email). Unfortunately there is currently no notification
mechanism for new comment messages.
## Group Supervisor
Group supervisor is typically the lecturer of a course. A user in this role can
modify group description and properties, assign exercises or manage list of
students. Further permissions like managing subgroups or supervisors is
available only for group administrators.
On "Dashboard" page you can find "Groups you supervise" section. Here there are
boxes representing your groups with the list of students attending course and
their points. Student names are clickable with redirection to the profile of the
user where further information about his/hers assignments and solution can be
found. To quickly jump onto groups page, use "Show group's detail" button at
the bottom of the matching group box.
### Manage Group
Locate group you supervise and you want to manage. All your supervised groups
are available in sidebar under "Groups -- supervisor" collapsible menu. If you
click on one of those you will be redirected to group detail page. In addition
to basic group information you can also see "Supervisor's controls" section. In
this section there are lists of current students and assignments.
As a supervisor of group you are able to see "Edit group settings" button
at the top of the page. Following this link will take you to group editation
page with form containing these fields:
- group name which is visible to other users
- external identification which may be used for pairing with entries in an
information system
- description of group which will be available to users in instance (in
Markdown)
- set if group is publicly visible (and joinable by students) or private
- options to set if students should be able see statistics of each other
- minimal points threshold which students have to gain to successfully complete
the course
After filling all necessary fields the form can be sent by clicking on "Edit
group" button and all changes will be applied.
For students management there are "Students" and "Add student" boxes. The first
one is simple list of all students which are attending the course with the
possibility of delete them from the group. That can be done by hitting "Leave
group" button near particular user. The second box is for adding students to the
group. There is a text field for typing name of the student and after clicking
on the magnifier image or pressing enter key there will appear list of matched
users. At this moment just click on the "Join group" button and student will be
signed in to your group.
### Assigning Exercises
Before assigning an exercise, you obviously have to know what exercises are
available. A list of all exercises in the system can be found under "Exercises"
link in sidebar. This page contains a table with exercises names, difficulties
and names of the exercise authors. Further information about exercise is
available by clicking on its name.
On the exercise details page are numerous information about it. There is a box
with all possible localized descriptions and also a box with some additional
information of exercise author, its difficulty, version, etc. There is also a
description for supervisors by exercise author under "Exercise overview" option,
where some important information can be found. And most notably there is an
information about available programming languages for this exercise, under
"Supported runtime environments" section.
If you decide that the exercise is suitable for one of your groups, look for the
"Groups" box at the bottom of the page. There is a list of all groups you
supervise with an "Assign" button which will assign the exercise to the
selected group.
After clicking on the "Assign" button you should be redirected to assignment
editation page. In there you can find two forms, one for editation of assignment
meta information and the second one for setting exercise time and memory limits.
In meta information form you can fill these options:
- name of the assignment which will be visible in a group
- visibility (if an assignment is under construction then you can mark it as not
visible and students will not see it)
- subform for localized descriptions (new localization can be added by clicking
on "Add language variant" button, current one can be deleted with "Remove this
language" button)
- language of description from dropdown field (English, Czech, German)
- description in selected language
- score configuration which will be used on students solution evaluation, you
can find some very simple one already in here, description of score
configuration can be found further in "Writing score configuration" chapter
- first submission deadline
- maximum points that can be gained before the first deadline; if you want to
manage all points manually, set it to 0 and then use bonus points, which are
described in the next subchapter
- second submission deadline, after that students still can submit exercises but
they are given no points no points (must be after the first deadline)
- maximum points that can be gained between first deadline and second deadline
- submission count limit for the solutions of the students -- this limits the amount of
attempts a student has at solving the problem
- visibility of memory and time ratios; if true students can see the percentage
of used memory and time (with respect to the limit) for each test
- minimum percentage of points which each submission must gain to be considered
correct (if it gets less, it will gain no points)
- whether the assignment is marked as bonus one and points from solving it are
not included into group threshold limit (that means solving it can get you
additional points over the limit)
The form has to be submitted with "Edit settings" button otherwise changes will
not be saved.
The same editation page serves also for the purpose of assignment editation, not
only creation. That is why on bottom of the page "Delete the assignment" box
can be found. Clearly the button "Delete" in there can be used to unassign
exercise from group.
The last unexplored area is the time and memory limits form. The whole form is
situated in a box with tabs which are leading to particular runtime
environments. If you wish not to use one of those, locate "Remove" button at the
bottom of the box tab which will delete this environment from the assignment.
Please note that this action is irreversible.
In general, every tab in environments box contains some basic information about
runtime environment and another nested tabbed box. In there you can find all
hardware groups which are available for the exercise and set limits for all test
cases. The time limits have to be filled in seconds (float), memory limits are
in bytes (int). If you are interested in some reference values to particular
test case then you can take a peek on collapsible "Reference solutions'
evaluations" items. If you are satisfied with changes you made to the limits,
save the form with "Change limits" button right under environments box.
### Management of The Solutions of The Students
One of the most important tasks of a group supervisor is checking student
solutions. As automatic evaluation of them cannot catch all problems in the
source code, it is advisable to do a brief manual review of the coding
style of the student and reflect that in assignment bonus points.
On "Assignment detail" page there is an "View student results" button near top
of the page (next to "Edit assignment settings" button). This will redirect you
to a page where is a list of boxes, one box per student. Each student box
contains a list of submissions for this assignment. The row structure of
submission list is the same as the structure in the "Submitted solution"
box. More information about every solution can be shown by clicking on "Show
details" link on the end of solution row.
This page is the same as for students with one exception -- there is an
additional collapsed box "Set bonus points". In unfolded state, there is an
input field for one number (positive or negative integer) and confirmation
button "Set bonus points". After filling intended amount of points and
submitting the form, the data in "Evaluation details" box get immediately
updated. To remove assigned bonus points, submit just the zero number. The bonus
points are not additive, newer value overrides older values.
It is useful to give a feedback about the solution back to the user. For this
you can use the "Commens and notes" box. Make sure that the messages are not
private, so that the student can see them. More detailed description of this box
can be nicely used the "Comments and notes" box. Make sure that the messages are
not private, so the student can see them. More detailed description of this box
is available in student part of user documentation.
One of the discussed concept was marking one solution as accepted. However, due
to lack of frontend developers it is not yet prepared in user interface. We
hope, it will be ready as soon as possible. The button for accepting a solution
will be most probably also on this page.
### Creating Exercises
Link to exercise creation can be found in exercises list which is accessible
through "Exercises" link in sidebar. On the bottom of the exercises list page
you can find "Add exercise" button which will redirect you to exercise editation
page. In this moment exercise is already created so if you just leave this page
exercise will stay in the database. This is also reason why exercise creation
form is the same as the exercise editation form.
Exercise editation page is divided into three separate forms. First one is
supposed to contain meta information about exercise, second one is used for
uploading and management of supplementary files and third one manages runtime
configuration in which exercise can be executed.
First form is located in "Edit exercise settings" and generally contains meta
information needed by frontend which are somehow somewhere visible. In here you
can define:
- exercise name which will be visible to other supervisors
- difficulty of exercise (easy, medium, hard)
- description which will be available only for visitors, may be used for further
description of exercise (for example information about test cases and how they
could be scored)
- private/public switch, if exercise is private then only you as author can see
it, assign it or modify it
- subform containing localized descriptions of exercise, new one can be added
with "Add language variant" button and current one deleted with "Remove this
language"
- language in which this particular description is in (Czech, English,
German)
- actual localized description of exercise
After all information is properly set form has to be submitted with "Edit
settings" button.
Management of supplementary files can be found in "Supplementary files" box.
Supplementary files are files which you can use further in job configurations
which have to be provided in all runtime configurations. These files are
uploaded directly to fileserver from where worker can download them and use
during execution according to job configuration.
Files can be uploaded either by drag and drop mechanism or by standard "Add a
file" button. In opened dialog window choose file which should be uploaded. All
chosen files are immediately uploaded to server but to save supplementary files
list you have to hit "Save supplementary files" button. All previously uploaded
files are visible right under drag and drop area, please note that files are
stored on fileserver and cannot be deleted after upload.
The last form on exercise editation page is runtime configurations editation
form. Exercise can have multiple runtime configurations according to the number
of programming languages in which it can be run. Every runtime configuration
corresponds to one programming language because all of them has to have a bit
different job configuration.
New runtime configuration can be added with "Add new runtime configuration"
button this will spawn new tab in runtime configurations box. In here you can
fill following:
- human readable identifier of runtime configuration
- runtime environment which corresponds to programming language
- job configuration in YAML, detailed description of job configuration can be
found further in this chapter in "Writing job configuration" section
If you are done with changes to runtime configurations save form with "Change
runtime configurations" button. If you want to delete some particular runtime
just hit "Remove" button in the right tab, please note that after this operation
runtime configurations form has to be again saved to apply changes.
All runtime configurations which were added to exercise will be visible to
supervisors and all can be used in assignment, so please be sure that all of the
languages and job configurations are working.
If you choose to delete exercise, at the bottom of the exercise editation page
you can find "Delete the exercise" box where "Delete" button is located. By
clicking on it exercise will be delete from the exercises list and will no
longer be available.
### Reference Solutions of An Exercise
Each exercise should have a set of reference solutions, which are used to tune
time and memory limits of assignments. Values of used time and memory for each
solution are displayed in yellow boxes under forms for setting assignment limits
as described earlier.
However, there is currently no user interface to upload and evaluate reference
solutions. It is possible to use direct REST API calls, but it is not much user
friendly. If you are interested, please look at [API
documentation](https://recodex.github.io/api/), notably sections
_Uploaded-Files_ and _Reference-Exercise-Solutions_. You need to upload the
reference solution files, create a new reference solution and then evaluate the
solution. After that, measured data will be available in the box at assignment
editing page (setting limits section).
We are now working on a better user interface, which will be available soon.
Then its description will be added here.
## Group Administrator
Group administrator is the group supervisor with some additional permissions in
particular group. Namely group administrator is capable of creating a subgroups
in managed group and also adding and deleting supervisors. Administrator of the
particular group can be only one person.
### Creating Subgroups And Managing Supervisors
There is no special link which will get you to groups in which you are
administrator. So you have to get there through "Groups - supervisor" link in
sidebar and choose the right group detail page. If you are there you can see
"Administrator controls" section, here you can either add supervisor to group or
create new subgroup.
Form for creating a subgroup is present right on the group detail page in "Add
subgroup" box. Group can be created with following options:
- name which will be visible in group hierarchy
- external identification, can be for instance ID of group from school system
- some brief description about group
- allow or deny users to see each others statistics from assignments
After filling all the information a group can be created by clicking on "Create
new group" button. If creation is successful then the group is visible in
"Groups hierarchy" box on the top of page. All information filled during
creation can be later modified.
Adding a supervisor to a group is rather easy, on group detail page is an "Add
supervisor" box which contains text field. In there you can type name or
username of any user from system. After filling user name, click on the
magnifier image or press the enter key and all suitable users are searched. If
your chosen supervisor is in the updated list then just click on the "Make
supervisor" button and new supervisor should be successfully set.
Also, existing supervisor can be removed from the group. On the group detail
page there is "Supervisors" box in which all supervisors of the group are
visible. If you are the group administrator, you can see there "Remove
supervisor" buttons right next to supervisors names. After clicking on it some
particular supervisor should not to be supervisor of the group anymore.
## Instance Administrator
Instance administrator can be only one person per instance. In addition to
previous roles this administrator should be able to modify the instance details,
manage licences and take care of top level groups which belong to the instance.
### Instance Management
List of all instances in the system can be found under "Instances" link in the
sidebar. On that page there is a table of instances with their respective
admins. If you are one of them, you can visit its page by clicking on the
instance name. On the instance details page you can find a description of the
instance, current groups hierarchy and a form for creating a new group.
If you want to change some of the instance settings, follow "Edit instance" link
on the instance details page. This will take you to the instance editation page
with corresponding form. In there you can fill following information:
- name of the instance which will be visible to every other user
- brief description of instance and for whom it is intended
- checkbox if instance is open or not which means public or private (hidden from
potential users)
If you are done with your editation, save filled information by clicking on
"Update instance" button.
If you go back to the instance details page you can find there a "Create new
group" box which is able to add a group to the instance. This form is the same
as the one for creating subgroup in already existing group so we can skip
description of the form fields. After successful creation of the group it will
appear in "Groups hierarchy" box at the top of the page.
### Licences
On the instance details page, there is a box "Licences". On the first line, it
shows it this instance has currently valid licence or not. Then, there are
multiple lines with all licences assigned to this instance. Each line consists
of a note, validity status (if it is valid or revoked by superadministrator) and
the last date of licence validity.
A box "Add new licence" is used for creating new licences. Required fields are
the note and the last day of validity. It is not possible to extend licence
lifetime, a new one should be generated instead. It is possible to have more
than one valid licence at a time. Currently there is no user interface for
revoking licences, this is done manually by superadministrator. If an instance
is to be disabled, all valid licences have to be revoked.
## Superadministrator
Superadministrator is a user with the most privileges and as such superadmin
should be quite a unique role. Ideally, there should be only one user of this
kind, used with special caution and adequate security. With this stated it is
obvious that superadmin can perform any action the API is capable of.
### Management of Users
There are only a few user roles in ReCodEx. Basically there are only three:
_student_, _supervisor_, and _superadmin_. Base role is student which is
assigned to every registered user. Roles are stored in database alongside other
information about user. One user always has only one role at the time. At first
startup of ReCodEx, the administrator has to change the role for his/her account
manually in the database. After that manual intervention into database should
never be needed.
There is a little catch in groups and instances management. Groups can have
admins and supervisors. This setting is valid only per one particular group and
has to be separated from basic role system. This implies that supervisor in one
group can be student in another and simultaneously have global supervisor role.
Changing role from student to supervisor and back is done automatically when the
new privileges are granted to the user, so managing roles by hand in database is
not needed. Previously stated information can be applied to instances as well,
but instances can only have admins.
Roles description:
- Student -- Default role which is used for newly created accounts. Student can
join or leave public groups and submit solutions of assigned exercises.
- Supervisor -- Inherits all permissions from student role. Can manage groups to
which he/she belongs to. Supervisor can also view and change groups details,
manage assigned exercises, view students in group and their solutions for
assigned exercises. On top of that supervisor can create/delete groups too,
but only as subgroup of groups he/she belongs to.
- Superadmin -- Inherits all permissions from supervisor role. Most powerful
user in ReCodEx who should be able to do access any functionality provided by
the application.
## Writing Score Configuration
An important thing about assignment is how to assign points to particular
solutions. As mentioned previously, the whole job is composed of logical tests.
All of these tests have to contain one essential "evaluation" task. Evaluation
task should output one float number which can be further used for scoring of
particular tests.
Total resulting score of the students solution is then calculated according to a
supplied score config (described below) and using specified calculator. Total
score is also a float between 0 and 1. This number is then multiplied by the
maximum of points awarded for the assignment by the teacher assigning the
exercise -- not the exercise author.
For now, there is only one way how to write score configuration using only
simple score calculator. But the implementation in API is agile enough to handle
upcoming score calculators which might use some more complex scoring algorithms.
This also means that future calculators do not have to use the YAML format for
configuration. In fact, the configuration can be a string in any format.
### Simple Score Calculation
First implemented calculator is simple score calculator with test weights. This
calculator just looks at the score of each test and put them together according
to the test weights specified in assignment configuration. Resulting score is
calculated as a sum of products of score and weight of each test divided by the
sum of all weights. The algorithm in Python would look something like this:
```
sum = 0
weightSum = 0
for t in tests:
sum += t.score * t.weight
weightSum += t.weight
score = sum / weightSum
```
Sample score config in YAML format:
```{.yml}
testWeights:
a: 300 # test with id 'a' has a weight of 300
b: 200
c: 100
d: 100
```
## Writing Job Configuration
To run and evaluate an exercise the backend needs to know the steps how to do
that. This is different for each environment (operation system, programming
language, etc.), so each of the environments needs to have separate
configuration.
Backend works with a powerful, but quite low level description of simple
connected tasks written in YAML syntax. More about the syntax and general task
overview can be found on [separate
page](https://github.com/ReCodEx/wiki/wiki/Assignments). One of the planned
features was user friendly configuration editor, but due to tight deadline and
team composition it did not make it to the first release. However, writing
configuration in the basic format will be always available and allows users to
use the full expressive power of the system.
This section walks through creation of job configuration for _hello world_
exercise. The goal is to compile file _source.c_ and check if it prints `Hello
World!` to the standard output. This is the only test case named **A**.
The problem can be split into several tasks:
- compile _source.c_ into _helloworld_ with `/usr/bin/gcc`
- run _helloworld_ and save standard output into _out.txt_
- fetch predefined output (suppose it is already uploaded to fileserver) with
hash `a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b` to _reference.txt_
- compare _out.txt_ and _reference.txt_ by `/usr/bin/diff`
The absolute path of tools can be obtained from system administrator. However,
`/usr/bin/gcc` is location, where the GCC binary is available almost everywhere,
so location of some tools can be (professionally) guessed.
First, write header of the job to the configuration file.
```{.yml}
submission:
job-id: hello-word-job
hw-groups:
- group1
```
Basically it means, that the job _hello-world-job_ needs to be run on workers
that belong to the `group_1` hardware group . Reference files are downloaded
from the default location configured in API (such as
`http://localhost:9999/exercises`) if not stated explicitly otherwise. Job
execution log will not be saved to result archive.
Next the tasks have to be constructed under _tasks_ section. In this demo job,
every task depends only on previous one. The first task has input file
_source.c_ (if submitted by user) already available in working directory, so
just call the GCC. Compilation is run in sandbox as any other external program
and should have relaxed time and memory limits. In this scenario, worker
defaults are used. If compilation fails, the whole job is immediately terminated
(because the _fatal-failure_ bit is set). Because _bound-directories_ option in
sandbox limits section is mostly shared between all tasks, it can be set in
worker configuration instead of job configuration (suppose this for following
tasks). For configuration of workers please contact your administrator.
```{.yml}
- task-id: "compilation"
type: "initiation"
fatal-failure: true
cmd:
bin: "/usr/bin/gcc"
args:
- "source.c"
- "-o"
- "helloworld"
sandbox:
name: "isolate"
limits:
- hw-group-id: group1
chdir: ${EVAL_DIR}
bound-directories:
- src: ${SOURCE_DIR}
dst: ${EVAL_DIR}
mode: RW
```
The compiled program is executed with time and memory limit set and the standard
output is redirected to a file. This task depends on _compilation_ task, because
the program cannot be executed without being compiled first. It is important to
mark this task with _execution_ type, so exceeded limits will be reported in
frontend.
Time and memory limits set directly for a task have higher priority than worker
defaults. One important constraint is, that these limits cannot exceed limits
set by workers. Worker defaults are present as a safety measure so that a
malformed job configuration cannot block the worker forever. Worker default
limits should be reasonably high, like a gigabyte of memory and several hours of
execution time. For exact numbers please contact your administrator.
It is important to know that if the output of a program (both standard and
error) is redirected to a file, the sandbox disk quotas apply to that file, as
well as the files created directly by the program. In case the outputs are
ignored, they are redirected to `/dev/null`, which means there is no limit on
the output length (as long as the printing fits in the time limit).
```{.yml}
- task-id: "execution_1"
test-id: "A"
type: "execution"
dependencies:
- compilation
cmd:
bin: "helloworld"
sandbox:
name: "isolate"
stdout: ${EVAL_DIR}/out.txt
limits:
- hw-group-id: group1
chdir: ${EVAL_DIR}
bound-directories:
- src: ${SOURCE_DIR}
dst: ${EVAL_DIR}
mode: RW
time: 0.5
memory: 8192
```
Fetch sample solution from file server. Base URL of file server is in the header
of the job configuration, so only the name of required file (its `sha1sum` in
our case) is necessary.
```{.yml}
- task-id: "fetch_solution_1"
test-id: "A"
dependencies:
- execution
cmd:
bin: "fetch"
args:
- "a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"
- "${SOURCE_DIR}/reference.txt"
```
Comparison of results is quite straightforward. It is important to set the task
type to _evaluation_, so that the return code is set to 0 if the program is
correct and 1 otherwise. We do not set our own limits, so the default limits are
used.
```{.yml}
- task-id: "judge_1"
test-id: "A"
type: "evaluation"
dependencies:
- fetch_solution_1
cmd:
bin: "/usr/bin/diff"
args:
- "out.txt"
- "reference.txt"
sandbox:
name: "isolate"
limits:
- hw-group-id: group1
chdir: ${EVAL_DIR}
bound-directories:
- src: ${SOURCE_DIR}
dst: ${EVAL_DIR}
mode: RW
```
# Implementation
## Broker
The broker is a central part of the ReCodEx backend that directs most of the
communication. It was designed to maintain a heavy load of messages by making
only small actions in the main communication thread and asynchronous execution
of other actions.
The responsibilites of broker are:
- allowing workers to register themselves and keep track of their capabilities
- tracking status of each worker and handle cases when they crash
- accepting assignment evaluation requests from the frontend and forwarding them
to workers
- receiving a job status information from workers and forward them to the
frontend either via monitor or REST API
- notifying the frontend on errors of the backend
### Internal Structure
The main work of the broker is to handle incomming messages. For that a
_reactor_ subcomponent is written to bind events on sockets to handler classes.
There are currently two handlers -- one that handles the main functionality and
the other that sends status reports to the REST API asynchronously. This
prevents broker freezes when synchronously waiting for responses of HTTP
requests, especially when some kind of error happens on the server.
Main handler takes care of requests from workers and API servers:
- *init* -- initial connection from worker to broker
- *done* -- currently processed job on worker was executed and is done
- *ping* -- worker prooving that it is still alive
- *progress* -- job progress state from worker which is immediatelly forwarded
to monitor
- *eval* -- request from API server to execute given job
Second handler is asynchronous status notifier which is able to execute HTTP
requests. This notifier is used on error reporting from backend to frontend API.
#### Worker Registry
The `worker_registry` class is used to store information about workers, their
status and the jobs in their queue. It can look up a worker using the headers
received with a request (a worker is considered suitable if and only if it
satisfies all the job headers). The headers are arbitrary key-value pairs, which
are checked for equality by the broker. However, some headers require special
handling, namely `threads`, for which we check if the value in the request is
lesser than or equal to the value advertised by the worker, and `hwgroup`, for
which we support requesting one of multiple hardware groups by listing multiple
names separated with a `|` symbol (e.g. `group_1|group_2|group_3`.
The registry also implements a basic load balancing algorithm -- the workers are
contained in a queue and whenever one of them receives a job, it is moved to its
end, which makes it less likely to receive another job soon.
When a worker is assigned a job, it will not be assigned another one until a
`done` message is received.
#### Error Reporting
Broker is the only backend component which is able to report errors directly to
the REST API. Other components have to notify the broker first and it forwards
the messages to the API. For HTTP communication a *libcurl* library is used. To
address security concerns there is a *HTTP Basic Auth* configured on particular
API endpoints and correct credentials have to be entered.
Following types of failures are distinguished:
**Job failure** -- there are two ways a job can fail, internal and external one.
An internal failure is the fault of worker, for example when it
cannot download a file needed for the evaluation. An external
error is for example when the job configuration is malformed. Note that wrong
student solution is not considered as a job failure.
Jobs that failed internally are reassigned until a limit on the amount of
reassingments (configurable with the `max_request_failures` option) is reached.
External failures are reported to the frontend immediately.
**Worker failure** -- when a worker crash is detected, an attempt to reassign
its current job and also all the jobs from its queue is made. Because the
current job might be the reason of the crash, its reassignment is also counted
towards the `max_request_failures` limit (the counter is shared). If there is
no worker that could process a job available (i.e. it cannot be reassigned),
the job is reported as failed to the frontend via REST API.
**Broker failure** -- when the broker itself crashed and is restarted, workers
will reconnect automatically. However, all jobs in their queues are lost. If a
worker manages to finish a job and notifies the "new" broker, the report is
forwarded to the frontend. The same goes for external failures. Jobs that fail
internally cannot be reassigned, because the "new" broker does not know their
headers -- they are reported as failed immediately.
### Additional Libraries
Broker implementation depends on several open-source C and C++ libraries.
- **libcurl** -- Libcurl is used for notifying REST API on job finish event over
HTTP protocol. Due to lack of documentation of all C++ bindings the plain C
API is used.
- **cppzmq** -- Cppzmq is a simple C++ wrapper for core ZeroMQ C API. It
basicaly contains only one header file, but its API fits into the object
architecture of the broker.
- **spdlog** -- Spdlog is small, fast and modern logging library used for system
logging. It is highly customizable and configurable from the
configuration of the broker.
- **yaml-cpp** -- Yaml-cpp is used for parsing borker configuration text file in
YAML format.
- **boost-filesystem** -- Boost filesystem is used for managing logging
directory (create if necessary) and parsing filesystem paths from strings as
written in the configuration of the broker. Filesystem operations will be
included in future releases of C++ standard, so this dependency may be
removed in the future.
- **boost-program_options** -- Boost program options is used for
parsing of command line positional arguments. It is possible to use POSIX
`getopt` C function, but we decided to use boost, which provides nicer API and
is already used by worker component.
## Fileserver
The fileserver component provides a shared file storage between the frontend and
the backend. It is writtend in Python 3 using Flask web framework. Fileserver
stores files in configurable filesystem directory, provides file deduplication
and HTTP access. To keep the stored data safe, the fileserver should not be
visible from public internet. Instead, it should be accessed indirectly through
the REST API.
### File Deduplication
From our analysis of the requirements, it is certain we need to implement a
means of dealing with duplicate files.
File deduplication is implemented by storing files under the hashes of their
content. This procedure is done completely inside fileserver. Plain files are
uploaded into fileserver, hashed, saved and the new filename is returned back to
the uploader.
SHA1 is used as hashing function, because it is fast to compute and provides
reasonable collision safety for non-cryptographic purposes. Files with the same
hash are treated as the same, no additional checks for collisions are performed.
However, it is really unlikely to find one. If SHA1 proves insufficient, it is
possible to change the hash function to something else, because the naming
strategy is fully contained in the fileserver (special care must be taken to
maintain backward compatibility).
### Storage Structure
Fileserver stores its data in following structure:
- `./submissions/<id>/` -- folder that contains files submitted by users (the
solutions to the assignments of the student). `<id>` is an identifier received
from the REST API.
- `./submission_archives/<id>.zip` -- ZIP archives of all submissions. These are
created automatically when a submission is uploaded. `<id>` is an identifier
of the corresponding submission.
- `./exercises/<subkey>/<key>` -- supplementary exercise files (e.g. test inputs
and outputs). `<key>` is a hash of the file content (`sha1` is used) and
`<subkey>` is its first letter (this is an attempt to prevent creating a flat
directory structure).
- `./results/<id>.zip` -- ZIP archives of results for submission with `<id>`
identifier.
## Worker
The job of the worker is to securely execute a job according to its
configuration and upload results back for latter processing. After receiving an
evaluation request, worker has to do following:
- download the archive containing submitted source files and configuration file
- download any supplementary files based on the configuration file, such as test
inputs or helper programs (this is done on demand, using a `fetch` command
in the assignment configuration)
- evaluate the submission according to job configuration
- during evaluation progress messages can be sent back to broker
- upload the results of the evaluation to the fileserver
- notify broker that the evaluation finished
### Internal Structure
Worker is logicaly divided into two parts:
- **Listener** -- communicates with broker through ZeroMQ. On startup, it
introduces itself to the broker. Then it receives new jobs, passes them to
the evaluator part and sends back results and progress reports.
- **Evaluator** -- gets jobs from the listener part, evaluates them (possibly in
sandbox) and notifies the other part when the evaluation ends. Evaluator also
communicates with fileserver, downloads supplementary files and uploads
detailed results.
These parts run in separate threads of the same process and communicate through
ZeroMQ in-process sockets. Alternative approach would be using shared memory
region with unique access, but messaging is generally considered safer. Shared
memory has to be used very carefully because of race condition issues when
reading and writing concurrently. Also, messages inside worker are small, so
there is no big overhead copying data between threads. This multi-threaded
design allows the worker to keep sending `ping` messages even when it is
processing a job.
### Capability Identification
There are possibly multiple worker instances in a ReCodEx installation and each
one can run on different hardware, operating system, or have different tools
installed. To identify the hardware capabilities of a worker, we use the concept
of **hardware groups**. Each worker belongs to exactly one group that specifies
the hardware and operating system on which the submitted programs will be run. A
worker also has a set of additional properties called **headers**. Together they
help the broker to decide which worker is suitable for processing a job
evaluation request. This information is sent to the broker on worker startup.
The hardware group is a string identifier of the hardware configuration, for
example "i7-4560-quad-ssd-linux" configured by the administrator for each worker
instance. If this is done correctly, performance measurements of a submission
should yield the same results on all computers from the same hardware group.
Thanks to this fact, we can use the same resource limits on every worker in a
hardware group.
The headers are a set of key-value pairs that describe the worker capabilities.
For example, they can show which runtime environments are installed or whether
this worker measures time precisely. Headers are also configured manually by an
administrator.
### Running Student Submissions
Student submissions are executed in a sandbox environment to prevent them from
damaging the host system and also to restrict the amount of used resources.
Currently, only the Isolate sandbox support is implemented, but it is possible
to add support for another sandox.
Every sandbox, regardless of the concrete implementation, has to be a command
line application taking parameters with arguments, standard input or file.
Outputs should be written to a file or to the standard output. There are no
other requirements, the design of the worker is very versatile and can be
adapted to different needs.
The sandbox part of the worker is the only one which is not portable, so
conditional compilation is used to include only supported parts of the project.
Isolate does not work on Windows environment, so also its invocation is done
through native calls of Linux OS (`fork`, `exec`). To disable compilation of
this part on Windows, the `#ifndef _WIN32` guard is used around affected files.
Isolate in particular is executed in a separate Linux process created by `fork`
and `exec` system calls. Communication between processes is performed through an
unnamed pipe with standard input and output descriptors redirection. To prevent
Isolate failure there is another safety guard -- whole sandbox is killed when it
does not end in `(time + 300) * 1.2` seconds where `time` is the original
maximum time allowed for the task. This formula works well both for short and
long tasks, but the timeout should never be reached if Isolate works properly --
it should always end itself in time.
### Directories and Files
During a job execution the worker has to handle several files -- input archive
with submitted sources and job configuration, temporary files generated during
execution or fetched testing inputs and outputs. For each job is created a
separate directory structure which is removed after finishing the job.
The files are stored in local filesystem of the worker computer in a
configurable location. The job is not restricted to use only specified
directories (tasks can do anything that is allowed by the system), but it is
advised not to write outside them. In addition, sandboxed tasks are usually
restricted to use only a specific (evaluation) directory.
The following directory structure is used for execution. The working directory
of the worker (root of the following paths) is shared for multiple instances on
the same computer.
- `downloads/${WORKER_ID}/${JOB_ID}` -- place to store the downloaded archive
with submitted sources and job configuration
- `submission/${WORKER_ID}/${JOB_ID}` -- place to store a decompressed
submission archive
- `eval/${WORKER_ID}/${JOB_ID}` -- place where all the execution should happen
- `temp/${WORKER_ID}/${JOB_ID}` -- place for temporary files
- `results/${WORKER_ID}/${JOB_ID}` -- place to store all files which will be
uploaded on the fileserver, usually only yaml result file and optionally log
file, other files have to be explicitly copied here if requested
Some of the directories are accessible during job execution from within sandbox
through predefined variables. List of these is described in job configuration
appendix.
### Judges
ReCodEx provides a few initial judges programs. They are mostly adopted from
CodEx and installed automatically with the worker component. Judging programs
have to meet some requirements. Basic ones are inspired by standard `diff`
application -- two mandatory positional parameters which have to be the files
for comparison and exit code reflecting if the result is correct (0) of wrong
(1).
This interface lacks support for returning additional data by the judges, for
example similarity of the two files calculated as the Levenshtein edit distance.
To allow passing these additional values an extended judge interface can be
implemented:
- Parameters: There are two mandatory positional parameters which have to be
files for comparision
- Results:
- _comparison OK_
- exitcode: 0
- stdout: there is a single line with a double value which
should be 1.0
- _comparison BAD_
- exitcode: 1
- stdout: there is a single line with a double value which
should be quality percentage of the judged file
- _error during execution_
- exitcode: 2
- stderr: there should be description of error
The additional double value is saved to the results file and can be used for
score calculation in the frontend. If just the basic judge is used, the values
are 1.0 for exit code 0 and 0.0 for exit code 1.
If more values are needed for score computation, multiple judges can be used in
sequence and the values used together. However, extended judge interface should
comply most of possible use cases.
### Additional Libraries
Worker implementation depends on several open-source C and C++ libraries. All of
them are multi-platform, so both Linux and Windows builds are possible.
- **libcurl** -- Libcurl is used for all HTTP communication, that is downloading
and uploading files. Due to lack of documentation of all C++ bindings the
plain C API is used.
- **libarchive** -- Libarchive is ised for compressing and extracting archives.
Actual supported formats depends on installed packages on target system, but
at least ZIP and TAR.GZ should be available.
- **cppzmq** -- Cppzmq is a simple C++ wrapper for core ZeroMQ C API. It
basicaly contains only one header file, but its API fits into the object
architecture of the worker.
- **spdlog** -- Spdlog is small, fast and modern logging library. It is used for
all of the logging, both system and job logs. It is highly customizable and
configurable from the configuration of the worker.
- **yaml-cpp** -- Yaml-cpp is used for parsing and creating text files in YAML
format. That includes the configuration of the worker, the configuration and
the results of a job.
- **boost-filesystem** -- Boost filesystem is used for multi-platform
manipulation with files and directories. However, these operations will be
included in future releases of C++ standard, so this dependency may be removed
in the future.
- **boost-program_options** -- Boost program options is used for multi-platform
parsing of command line positional arguments. It is not necessary to use it,
similar functionality can be implemented be ourselves, but this well known
library is effortless to use.
## Monitor
Monitor is an optional part of the ReCodEx solution for reporting progress of
job evaluation back to users in the real time. It is written in Python, tested
versions are 3.4 and 3.5. Following dependencies are used:
- **zmq** -- binding to ZeroMQ message framework
- **websockets** -- framework for communication over WebSockets
- **asyncio** -- library for fast asynchronous operations
- **pyyaml** -- parsing YAML configuration files
There is just one monitor instance required per broker. Also, monitor has to be
publicly visible (has to have public IP address or be behind public proxy
server) and also needs a connection to the broker. If the web application is
using HTTPS, it is required to use a proxy for monitor to provide encryption
over WebSockets. If this is not done, browsers of the users will block
unencrypted connection and will not show the progress to the users.
### Message Flow
![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
Monitor runs in 2 threads. _Thread 1_ is the main thread, which initializes all
components (logger for example), starts the other thread and runs the ZeroMQ
part of the application. This thread receives and parses incomming messages from
broker and forwards them to _thread 2_ sending logic.
_Thread 2_ is responsible for managing all of WebSocket connections
asynchronously. Whole thread is one big _asyncio_ event loop through which all
actions are processed. None of custom data types in Python are thread-safe, so
all events from other threads (actually only `send_message` method invocation)
must be called within the event loop (via `asyncio.loop.call_soon_threadsafe`
function). Please note, that most of the Python interpreters use [Global
Interpreter Lock](https://wiki.python.org/moin/GlobalInterpreterLock), so there
is actualy no parallelism in the performance point of view, but proper
synchronization is still required.
### Handling of Incomming Messages
Incomming ZeroMQ progress message is received and parsed to JSON format (same as
our WebSocket communication format). JSON string is then passed to _thread 2_
for asynchronous sending. Each message has an identifier of channel where to
send it to.
There can be multiple receivers to one channel id. Each one has separate
_asyncio.Queue_ instance where new messages are added. In addition to that,
there is one list of all messages per channel. If a client connects a bit later
than the point when monitor starts to receive messages, it will receive all
messages from the beginning. Messages are stored 5 minutes after last progress
command (normally FINISHED) is received, then are permanently deleted. This
caching mechanism was implemented because early testing shows, that first couple
of messages are missed quite often.
Messages from the queue of the client are sent through corresponding WebSocket
connection via main event loop as soon as possible. This approach with separate
queue per connection is easy to implement and guarantees reliability and order
of message delivery.
## Cleaner
Cleaner component is tightly bound to the worker. It manages the cache folder of
the worker, mainly deletes outdated files. Every cleaner instance maintains one
cache folder, which can be used by multiple workers. This means on one server
there can be numerous instances of workers with the same cache folder, but there
should be only one cleaner instance.
Cleaner is written in Python 3 programming language, so it works well
multi-platform. It uses only `pyyaml` library for reading configuration file and
`argparse` library for processing command line arguments.
It is a simple script which checks the cache folder, possibly deletes old files
and then ends. This means that the cleaner has to be run repeatedly, for example
using cron, systemd timer or Windows task scheduler. For proper function of the
cleaner a suitable cronning interval has to be used. It is recommended to use
24 hour interval which is sufficient enough for intended usage. The value is set
in the configuration file of the cleaner.
## REST API
The REST API is a PHP application run in an HTTP server. Its purpose is
providing controlled access to the evaluation backend and storing the state of
the application.
@todo: what to mention
- basic - GET, POST, JSON, Header, ...
- endpoint structure, Swager UI
- handling requests, preflight, checking roles with annotation
- Uploading files and file storage - one by one upload endpoint. Explain
different types of the Uploaded files.
- Automatic detection of the runtime environment - users must submit
correctly named files, assuming the RTE from the extensions
### Used Technologies
We chose to use PHP in version 7.0, which was the most recent version at the
time of starting the project. The most notable new feature is optional static
typing of function parameters and return values. We use this as much as possible
to enable easy static analysis with tools like PHPStan. Using static analysis
leads to less error-prone code that does not need as many tests as code that
uses duck typing and relies on automatic type conversions. We aim to keep our
codebase compatible with new releases of PHP.
To speed up the development and to make it easier to follow best practices, we
decided to use the Nette framework. The framework itself is focused on creating
applications that render HTML output, but a lot of its features can be used in a
REST application, too.
Doctrine 2 ORM is used to provide a layer of abstraction over storing objects in
a database. This framework also makes it possible to change the database server.
The current implementation uses MariaDB, an open-source fork of MySQL.
To communicate with the evaluation backend, we need to use ZeroMQ. This
functionality is provided by the `php_zmq` plugin that is shipped with most PHP
distributions.
### Data model
We decided to use a code-first approach when designing our data model. This
approach is greatly aided by the Doctrine 2 ORM framework, which works with
entities -- PHP classes for which we specify which attributes should be
persisted in a database. The database schema is generated from the entity
classes. This way, the exact details of how our data is stored is a secondary
concern for us and we can focus on the implementation of the business logic
instead.
The rest of this section is a description of our data model and how it relates
to the real world. All entities are stored in the `App\Model\Entity` namespace.
There are repository classes that are used to work with entities without calling
the Doctrine `EntityManager` directly. These are in the `App\Model\Repository`
namespace.
#### User Account Management
The `User` entity class contains data about users registered in ReCodEx. To
allow extending the system with additional authentication methods, login details
are stored in separate entities. There is the `Login` entity class which
contains a user name and password for our internal authentication system, and
the `ExternalLogin` entity class, which contains an identifier for an external login
service such as LDAP. Currently, each user can only have a single authentication
method (account type). The entity with login information is created along with
the `User` entity when a user signs up. If a user requests a password reset, a
`ForgottenPassword` entity is created for the request.
A user needs a way to adjust settings such as their preferred language or theme.
This is the purpose of the `UserSettings` entity class. Each possible option has
its own attribute (database column). Current supported options are `darkTheme`,
`defaultLanguage` and `vimMode`
Every user has a role in the system. The basic ones are student, supervisor and
administrator, but new roles can be created by adding `Role` entities. Roles can
have permissions associated with them. These associations are represented by
`Permission` entitites. Each permission consists of a role, resource, action and
an `isAllowed` flag. If the `isAllowed` flag is set to true, the permission is
positive (lets the role access the resource), and if it is false, it denies
access. The `Resource` entity contains just a string identifier of a resource
(e.g., group, user, exercise). Action is another string that describes what the
permission allows or denies for the role and resource (e.g., edit, delete,
view).
The `Role` entity can be associated with a parent entity. If this is the case,
the role inherits all the permissions of its parent.
All actions done by a user are logged using the `UserAction` entity for
debugging purposes.
#### Instances and Groups
Users of ReCodEx are divided into groups that correspond to school lab groups
for a single course. Each group has a textual name and description. It can have
a parent group so that it is possible to create tree hierarchies of groups.
Group membership is realized using the `GroupMembership` entity class. It is a
joining entity for the `Group` and `User` entities, but it also contains
additional information, most importantly `type`, which helps to distinguish
students from group supervisors.
Groups are organized into instances. Every `Instance` entity corresponds to an
organization that uses the ReCodEx installation, for example a university or a
company that organizes programming workshops. Every user and group belong to
exactly one instance (users choose an instance when they create their account).
Every instance can be associated with multiple `Licence` entities. Licences are
used to determine whether an instance can be currently used (access to those
without a valid instance will be denied). They can correspond to billing periods
if needed.
#### Exercises
The `Exercise` entity class is used to represent exercises -- programming tasks
that can be assigned to student groups. It contains data that does not relate to
this "concrete" assignment, such as the name, version and a private description.
Some exercise descriptions need to be translated into multiple languages.
Because of this, the `Exercise` entity is associated with the
`LocalizedAssignment` entity, one for each translation of the text.
An exercise can support multiple programming runtime environments. These
environments are represented by `RuntimeEnvironment` entities. Apart from a name
and description, they contain details of the language and operating system that
is being used. There is also a list of extensions that is used for detecting
which environment should be used for student submissions.
`RuntimeEnvironment` entities are not linked directly to exercises. Instead,
the `Exercise` entity has an M:N relation with the `SolutionRuntimeConfig`,
which is associated with `RuntimeEnvironment`. It also contains a path to a job
configuration file template that will be used to create a job configuration file
for the worker that processes solutions of the exercise.
Resource limits are stored outside the database, in the job configuration file
template.
#### Reference Solutions
To make setting resource limits objectively possible for a potentially diverse
set of worker machines, there should be multiple reference solutions for every
exercise in all supported languages that can be used to measure resource usage
of different approaches to the problem on various hardware and platforms.
Reference solutions are contained in `ReferenceSolution` entities. These
entities can have multiple `ReferenceSolutionEvaluation` entities associated
with them that link to evaluation results (`SolutionEvaluation` entity). Details
of this structure will be described in the section about student solutions.
#### Assignments
The `Assignment` entity is created from an `Exercise` entity when an exercise is
assigned to a group. Most details of the exercise can be overwritten (see the
reference documentation for a detailed overview). Additional information such as
deadlines or point values for individual tests is also configured for the
assignment and not for an exercise.
Assignments can also have their own `LocalizedAssignment` entities. If the
assignment texts are not changed, they are shared between the exercise and its
assignment.
Runtime configurations can be also changed for the assignment. This way, a
supervisor can for example alter the resource limits for the tests.
#### Student Solutions
#### Comment threads
### Request Handling
A typical scenario for handling an API request is matching the HTTP request with
a corresponding handler routine which creates a response object, that is then
sent back to the client, encoded with JSON. The `Nette\Application` package can
be used to achieve this with Nette, although it is meant to be used mainly in
MVP applications.
Matching HTTP requests with handlers can be done using standard Nette URL
routing -- we will create a Nette route for each API endpoint. Using the routing
mechanism from Nette logically leads to implementing handler routines as Nette
Presenter actions. Each presenter should serve logically related endpoints.
The last step is encoding the response as JSON. In `Nette\Application`, HTTP
responses are returned using the `Presenter::sendResponse()` method. We decided
to write a method that calls `sendResponse` internally and takes care of the
encoding. This method has to be called in every presenter action. An alternative
approach would be using the internal payload object of the presenter, which is
more convenient, but provides us with less control.
### Authentication
@todo
### Permissions
In a system storing user data has to be implemented some kind of permission
checking. Each user has a role, which corresponds to his/her privileges.
Our research showed, that three roles are sufficient -- student, supervisor
and administrator. The user role has to be
checked with every request. The good points is, that roles nicely match with
granularity of API endpoints, so the permission checking can be done at the
beginning of each request. That is implemented using PHP annotations, which
allows to specify allowed user roles for each request with very little of code,
but all the business logic is the same, together in one place.
However, roles cannot cover all cases. For example, if user is a supervisor, it
relates only to groups, where he/she is a supervisor. But using only roles
allows him/her to act as supervisor in all groups in the system. Unfortunately,
this cannot be easily fixed using some annotations, because there are many
different cases when this problem occurs. To fix that, some additional checks
can be performed at the beginning of request processing. Usually it is only one
or two simple conditions.
With this two concepts together it is possible to easily cover all cases of
permission checking with quite a small amount of code.
### Uploading Files
There are two cases when users need to upload files using the API -- submitting
solutions to an assignment and creating a new exercise. In both of these cases,
the final destination of the files is the fileserver. However, the fileserver is
not publicly accessible, so the files have to be uploaded through the API.
Each file is uploaded separately and is given a unique ID. The uploaded file
can then be attached to an exercise or a submitted solution of an exercise.
Storing and removing files from the server is done through the
`App\Helpers\UploadedFileStorage` class which maps the files to their records
in the database using the `App\Model\Entity\UploadedFile` entity.
### Forgotten Password
When user finds out that he/she does not remember a password, he/she requests a
password reset and fills in his/her unique email. A temporary access token is
generated for the user corresponding to the given email address and sent to this
address encoded in a URL leading to a client application. User then goes
to the URL and can choose a new password.
The temporary token is generated and emailed by the
`App\Helpers\ForgottenPasswordHelper` class which is registered as a service
and can be injected into any presenter.
This solution is quite safe and user can handle it on its own, so administrator
does not have to worry about it.
### Job Configuration Parsing and Modifying
@todo how the YAML is parsed
@todo how it can be changed and where it is used
@todo how it can be stored to a new YAML
### Solution Loading
When a solution evaluation is finished by the backend, the results are saved to
the fileserver and the API is notified by the broker. The results are parsed and
stored in the database.
For the results of the evaluations of the reference solutions and for the
asynchronously evaluated solutions of the students (e.g., resubmitted by
the administrator) the result is processed
right after the notification from backend is received and the author of the
solution will be notified by an email after the results are processed.
When a student submits his/her solution directly through the client application
we do not parse the results right away but we postpone this until the student
(or a supervisor) wants to display the results for the first time. This may save
save some resources when the solution results are not important (e.g., the
student finds a bug in his solution before the submission has been evaluated).
#### Parsing of The Results
The results are stored in a YAML file. We map the contents of the file to the
classes of the `App\Helpers\EvaluationResults` namespace. This process
validates the file and gives us access to all of the information through
an interface of a class and not only using associative arrays. This is very
similar to how the job configuration files are processed.
## API Endpoints
@todo: Tell the user about the generated API reference and how the
Swagger UI can be used to access the API directly.
## Web Application
The whole project is written using the next generation of JavaScript referred to
as *ECMAScript 6* (also known as *ES6*, *ES.next*, or *Harmony*). Since not all
of the features introduced in this standard are implemented in the modern
web browsers of today (like classes and the spread operator) and hardly any are
implemented in the older versions of the web browsers which are currently still
in use, the source code is transpiled into the older standard *ES5* using
[Babel.js](https://babeljs.io/) transpiler and bundled into a single script file
using the [webpack](https://webpack.github.io/) moudle bundler. The need for a
transpiler also arises from the usage of the *JSX* syntax for declaring React
components. To read more about these these tools and their usage please refer to
the [installation documentation](#Installation). The whole bundling process
takes place at deployment and is not repeated afterwards when running in
production.
### State Management
Web application is a SPA (Single Page Application. When the user accesses the page,
the source codes are downloaded and are interpreted by the web browser. The communication
between the browser and the server then runs in the background without reloading the page.
The application keeps its internal state which can be altered by the actions of the user
(e.g., clicking on links and buttons, filling input fields of forms) and by the outcomes
of HTTP requests to the API server. This internal state is kept in memory of the web
browser and is not persisted in any way -- when the page is refreshed, the internal state
is deleted and a new one is created from scratch (i.e., all of the data is fetched from the
API server again).
The only part of the state which is persisted is the token of the logged in user. This token
is kept in cookies and in the local storage. Keeping the token in the cookies is necessary
for server-side rendering.
#### Redux
The in-memory state is handled by the *redux* library. This library is storngly insipred
by the [Flux](https://facebook.github.io/flux/) architecture but it has some specifics.
The whole state is in a single serializable tree structure called the *store*. This store
can be modified only by dispatching *actions* which are Plain Old JavaScript Oblects (POJO)
which are processed by *reducers*. A reducer is a pure function which takes the state
object and the action object and it creates a new state. This process is very easy to
reason about and is also very easy to test using unit tests. Please read the
[redux documentation](http://redux.js.org/) for detailed information about the library.
![Redux state handling schema](https://github.com/ReCodEx/wiki/raw/master/images/redux.png)
##### Redux Middleware
A middleware in redux is a function which can process actions before they are passed
to the reducers to update the state.
The middleware used by the ReCodEx store is defined in the `src/redux/store.js` script.
Several open source libraries are used:
- [redux-promise-middleware](https://github.com/pburtchaell/redux-promise-middleware)
- [redux-thunk](https://github.com/gaearon/redux-thunk)
- [react-router-redux](https://github.com/reactjs/react-router-redux)
We created two other custom middleware functions for our needs:
- **API middleware** -- The middleware filters out all actions with the *type* set to `recodex-api/CALL`,
sends a real HTTP request according to the information in the action.
- **Access Token Middleware** -- This middleware persists the access token each time after the
user signs into the application into the local storage and the cookies. The token
is removed when the user decides to sign out. The middleware also attaches the token
to each `recodex-api/CALL` action when it does not have an access token set explicitly.
##### Accessing The Store Using Selectors
@todo
#### Routing
The page should not be reloaded after the initial render but the current location
of the user in the system must be reflected in the URL. This is achieved through the
[react-router](https://github.com/ReactTraining/react-router) and
[react-router-redux](https://github.com/reactjs/react-router-redux) libraries.
These libraries use `pushState`
method of the `history` object, a living standard supported by all of the modern browsers.
The mapping of the URLs to the components is defined in the `src/pages/routes.js` file.
To create links between pages, use either the `Link` component from the `react-router` library
or dispatch an action created using the `push` action creator from the `react-router-redux`
library. All the navigations are mapped to redux actions and can be handled by any reducer.
Having up-to-date URLs gives the users the possibility to reload the page if some
error occurs on the page and land at the same page as he or she would expect. Users can also
send links to the very page they want to.
### Creating HTTP Requests
All of the HTTP requests are made by dispatching a specific action which will be processed
by our custom *API middleware*. The action must have the *type* property set to
`recodex-api/CALL`. The middleware catches the action and it sends a real HTTP request
created according to the information in the `request` property of the action:
- **type** -- Type prefix of the actions which will be dispatched automatically during the
lifecycle of the request (pending, fulfilled, failed).
- **endpoint** -- The URI to which the request should be sent. All endpoints will be prefixed
with the base URL of the API server.
- **method** (*optional*) -- A string containing the name of the HTTP method which should be used.
The default method is `GET`.
- **query** (*optional*) -- An object containing key-value pairs which will be put
in the URL of the request in the query part of the URL.
- **headers** (*optional*) -- An object containing key-value paris which will be appended
to the headers of the HTTP request.
- **accessToken** (*optional*) -- Explicitly set the access token for the request. The token
will be put in the *Authorization* header.
- **body** (*optional*) -- An object or an array which will be recursively flattened into
the `FormData` structure with correct usage of square brackets for nested (associative)
arrays. It is worth mentioning that the keys must not contain a colon in the string.
- **doNotProcess** (*optional*) -- A boolean value which can disable the default procesing
of the response to the request which includes showing a notification to the user in case
of a failiure of the request. All requests are processed in the way described above
by default.
The HTTP requests are sent using the `fetch` API which returns a *Promise* of the request.
This promise is put into a new action and creates a new action containing the promise
and the type specified in the `request` description. This action is then caught by the
promise middleware and the promise middleware dispatches actions whenever the state of
the promise changes during its the lifecycle. The new actions have specific types:
- {$TYPE}_PENDING -- Dispatched immediatelly after the action is processed by the promise
middleware. The `payload` property of the action contains the body of the request.
- {$TYPE}_FAILED -- Dispatched if the promise of the request is rejected.
- {$TYPE}_FULFILLED -- Dispatched when the response to the request is received and the
promise is resolved. The `payload` property of the action contains the body of the
HTTP response parsed as JSON.
### Routine CRUD Operations
For routine CRUD (Create, Read, Update, Delete) operations which are common to most
of the resources used in the ReCodEx (e.g., groups, users, assignments, solutions,
solution evaluations, source code files) a set of functions called *Resource manager*
was implemented. It contains a factory which creates basic actions (e.g., `fetchResource`,
`addResource`, `updateResource`, `removeResource`, `fetchMany`) and handlers for
all of the lifecycle actions created by both the API middleware and the promise middleware
which can be used to create a basic reducer.
The *resource manager* is spread over several files in the `src/redux/helpers/resourceManager`
directory and is covered with unit tests in scripts located at
`test/redux/helpers/resourceManager`.
### Servier-side Rendering
To speed-up the initial time of rendering of the web application a technique called server-side
rendering (SSR) is used. The same code which is executed in the web browser of the client can run
on the server using [Node.js](https://nodejs.org). React can serialize its HTML output into
a string which can be sent to the client and can be displayed before the (potentially large)
JavaScript source code starts being executed by the browser. The redux store is in fact
just a large JSON tree which can be easily serialized as well.
If the user is logged in then the access token should be in the cookies of the web browser
and it should be attached to the HTTP request when the user navigates to the ReCodEx
web page. This token is then put into the redux store and so the user is logged in on
the server.
The whole logic of the SSR is in a single file called `src/server.js`. It contains only
a definition of a simple HTTP server (using the [express](http://expressjs.com/) framework)
and some necessary boilerplate of the routing library.
All the components which are associated to the matched route can have a class property `loadAsync`
which should contain a function returning a *Promise*. The SRR calls all these functions and delays
the response of the HTTP server until all of the promises are resolved (or some of them fails).
## Communication Protocol
Detailed communication inside the ReCodEx system is captured in the following
image and described in sections below. Red connections are through ZeroMQ
sockets, blue are through WebSockets and green are through HTTP(S). All ZeroMQ
messages are sent as multipart with one string (command, option) per part, with
no empty frames (unles explicitly specified otherwise).
![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)
### Broker - Worker Communication
Broker acts as server when communicating with worker. Listening IP address and
port are configurable, protocol family is TCP. Worker socket is of DEALER type,
broker one is ROUTER type. Because of that, very first part of every (multipart)
message from broker to worker must be target the socket identity of the worker
(which is saved on its **init** command).
#### Commands from Broker to Worker:
- **eval** -- evaluate a job. Requires 3 message frames:
- `job_id` -- identifier of the job (in ASCII representation -- we avoid
endianness issues and also support alphabetic ids)
- `job_url` -- URL of the archive with job configuration and submitted
source code
- `result_url` -- URL where the results should be stored after evaluation
- **intro** -- introduce yourself to the broker (with **init** command) -- this
is required when the broker loses track of the worker who sent the command.
Possible reasons for such event are e.g. that one of the communicating sides
shut down and restarted without the other side noticing.
- **pong** -- reply to **ping** command, no arguments
#### Commands from Worker to Broker:
- **init** -- introduce self to the broker. Useful on startup or after
reestablishing lost connection. Requires at least 2 arguments:
- `hwgroup` -- hardware group of this worker
- `header` -- additional header describing worker capabilities. Format must
be `header_name=value`, every header shall be in a separate message frame.
There is no limit on number of headers. There is also an optional third
argument -- additional information. If present, it should be separated
from the headers with an empty frame. The format is the same as headers.
Supported keys for additional information are:
- `description` -- a human readable description of the worker for
administrators (it will show up in broker logs)
- `current_job` -- an identifier of a job the worker is now processing. This
is useful when we are reassembling a connection to the broker and need it
to know the worker will not accept a new job.
- **done** -- notifying of finished job. Contains following message frames:
- `job_id` -- identifier of finished job
- `result` -- response result, possible values are:
- OK -- evaluation finished successfully
- FAILED -- job failed and cannot be reassigned to another worker (e.g.
due to error in configuration)
- INTERNAL_ERROR -- job failed due to internal worker error, but another
worker might be able to process it (e.g. downloading a file failed)
- `message` -- a human readable error message
- **progress** -- notice about current evaluation progress. Contains following
message frames:
- `job_id` -- identifier of current job
- `command` -- what is happening now.
- DOWNLOADED -- submission successfuly fetched from fileserver
- FAILED -- something bad happened and job was not executed at all
- UPLOADED -- results are uploaded to fileserver
- STARTED -- evaluation of tasks started
- ENDED -- evaluation of tasks is finished
- ABORTED -- evaluation of job encountered internal error, job will be
rescheduled to another worker
- FINISHED -- whole execution is finished and worker ready for another
job execution
- TASK -- task state changed -- see below
- `task_id` -- only present for "TASK" state -- identifier of task in
current job
- `task_state` -- only present for "TASK" state -- result of task
evaluation. One of:
- COMPLETED -- task was successfully executed without any error,
subsequent task will be executed
- FAILED -- task ended up with some error, subsequent task will be
skipped
- SKIPPED -- some of the previous dependencies failed to execute, so
this task will not be executed at all
- **ping** -- tell broker I am alive, no arguments
#### Heartbeating
It is important for the broker and workers to know if the other side is still
working (and connected). This is achieved with a simple heartbeating protocol.
The protocol requires the workers to send a **ping** command regularly (the
interval is configurable on both sides -- future releases might let the worker
send its ping interval with the **init** command). Upon receiving a **ping**
command, the broker responds with **pong**.
Whenever a heartbeating message doesn't arrive, a counter called _liveness_ is
decreased. When this counter drops to zero, the other side is considered
disconnected. When a message arrives, the liveness counter is set back to its
maximum value, which is configurable for both sides.
When the broker decides a worker disconnected, it tries to reschedule its jobs
to other workers.
If a worker thinks the broker crashed, it tries to reconnect periodically, with
a bounded, exponentially increasing delay.
This protocol proved great robustness in real world testing. Thus whole backend
is reliable and can outlive short term issues with connection without problems.
Also, increasing delay of ping messages does not flood the network when there
are problems. We experienced no issues since we are using this protocol.
### Worker - File Server Communication
Worker is communicating with file server only from _execution thread_. Supported
protocol is HTTP optionally with SSL encryption (**recommended**). If supported
by server and used version of libcurl, HTTP/2 standard is also available. File
server should be set up to require basic HTTP authentication and worker is
capable to send corresponding credentials with each request.
#### Worker Side
Workers comunicate with the file server in both directions -- they download the
submissions of the student and then upload evaluation results. Internally,
worker is using libcurl C library with very similar setup. In both cases it can
verify HTTPS certificate (on Linux against system cert list, on Windows against
downloaded one from CURL website during installation), support basic HTTP
authentication, offer HTTP/2 with fallback to HTTP/1.1 and fail on error
(returned HTTP status code is >=400). Worker have list of credentials to all
available file servers in its config file.
- download file -- standard HTTP GET request to given URL expecting file content
as response
- upload file -- standard HTTP PUT request to given URL with file data as body
-- same as command line tool `curl` with option `--upload-file`
#### File server side
File server has its own internal directory structure, where all the files are
stored. It provides simple REST API to get them or create new ones. File server
does not provide authentication or secured connection by itself, but it is
supposed to run file server as WSGI script inside a web server (like Apache)
with proper configuration. Relevant commands for communication with workers:
- **GET /submission_archives/\<id\>.\<ext\>** -- gets an archive with submitted
source code and corresponding configuration of this job evaluation
- **GET /exercises/\<hash\>** -- gets a file, common usage is for input files or
reference result files
- **PUT /results/\<id\>.\<ext\>** -- upload archive with evaluation results
under specified name (should be same _id_ as name of submission archive). On
successful upload returns JSON `{ "result": "OK" }` as body of returned page.
If not specified otherwise, `zip` format of archives is used. Symbol `/` in API
description is root of the domain of the file server. If the domain is for
example `fs.recodex.org` with SSL support, getting input file for one task could
look as GET request to
`https://fs.recodex.org/tasks/8b31e12787bdae1b5766ebb8534b0adc10a1c34c`.
### Broker - Monitor Communication
Broker communicates with monitor also through ZeroMQ over TCP protocol. Type of
socket is same on both sides, ROUTER. Monitor is set to act as server in this
communication, its IP address and port are configurable in the config of the
monitor file. ZeroMQ socket ID (set on the side of the monitor) is
"recodex-monitor" and must be sent as first frame of every multipart message --
see ZeroMQ ROUTER socket documentation for more info.
Note that the monitor is designed so that it can receive data both from the
broker and workers. The current architecture prefers the broker to do all the
communication so that the workers do not have to know too many network services.
Monitor is treated as a somewhat optional part of whole solution, so no special
effort on communication realibility was made.
#### Commands from Monitor to Broker:
Because there is no need for the monitor to communicate with the broker, there
are no commands so far. Any message from monitor to broker is logged and
discarded.
#### Commands from Broker to Monitor:
- **progress** -- notification about progress with job evaluation. This
communication is only redirected as is from worker, more info can be found in
"Broker - Worker Communication" chapter above.
### Broker - Web API Communication
Broker communicates with main REST API through ZeroMQ connection over TCP.
Socket type on broker side is ROUTER, on frontend part it is DEALER. Broker acts
as a server, its IP address and port is configurable in the API.
#### Commands from API to Broker:
- **eval** -- evaluate a job. Requires at least 4 frames:
- `job_id` -- identifier of this job (in ASCII representation -- we avoid
endianness issues and also support alphabetic ids)
- `header` -- additional header describing worker capabilities. Format must
be `header_name=value`, every header shall be in a separate message frame.
There is no maximum limit on number of headers. There may be also no
headers at all. A worker is considered suitable for the job if and only if
it satisfies all of its headers.
- empty frame -- frame which contains only empty string and serves only as
breakpoint after headers
- `job_url` -- URI location of archive with job configuration and submitted
source code
- `result_url` -- remote URI where results will be pushed to
#### Commands from Broker to API:
All are responses to **eval** command.
- **ack** -- this is first message which is sent back to frontend right after
eval command arrives, basically it means "Hi, I am all right and am capable of
receiving job requests", after sending this broker will try to find acceptable
worker for arrived request
- **accept** -- broker is capable of routing request to a worker
- **reject** -- broker cannot handle this job (for example when the requirements
specified by the headers cannot be met). There are (rare) cases when the
broker finds that it cannot handle the job after it was confirmed. In such
cases it uses the frontend REST API to mark the job as failed.
#### Asynchronous Communication Between Broker And API
Only a fraction of the errors that can happen during evaluation can be detected
while there is a ZeroMQ connection between the API and broker. To notify the
frontend of the rest, the API exposes an endpoint for the broker for this
purpose. Broker uses this endpoint whenever the status of a job changes
(it is finished, it failed permanently, the only worker capable of processing
it disconnected...).
When a request for sending a report arrives from the backend then the type of
the report is inferred and if it is an error which deserves attention of the
administrator then an email is sent to him/her. There can also be errors which
are not that important (e.g., it was somehow solved by the backend itself or it
is only informative), then these do not have to be reported through an email but
they are stored in the persistent database for further consideration.
For the details of this interface please refer to the attached API documentation
and the `broker-reports/` endpoint group.
### File Server - Web API Communication
File server has a REST API for interaction with other parts of ReCodEx.
Description of communication with workers is in "Worker - File Server
Communication" chapter above. On top of that, there are other commands for
interaction with the API:
- **GET /results/\<id\>.\<ext\>** -- download archive with evaluated results of
job _id_
- **POST /submissions/\<id\>** -- upload new submission with identifier _id_.
Expects that the body of the POST request uses file paths as keys and the
content of the files as values. On successful upload returns JSON `{
"archive_path": <archive_url>, "result_path": <result_url> }` in response
body. From _archive_path_ the submission can be downloaded (by worker) and
corresponding evaluation results should be uploaded to _result_path_.
- **POST /tasks** -- upload new files, which will be available by names equal to
`sha1sum` of their content. There can be uploaded more files at once. On
successful upload returns JSON `{ "result": "OK", "files": <file_list> }` in
response body, where _file_list_ is dictionary of original file name as key
and new URL with already hashed name as value.
There are no plans yet to support deleting files from this API. This may change
in time.
Web API calls these fileserver endpoints with standard HTTP requests. There are
no special commands involved. There is no communication in opposite direction.
### Monitor - Web App Communication
Monitor interacts with web application through WebSocket connection. Monitor
acts as server and browsers are connecting to it. IP address and port are
configurable. When client connects to the monitor, it sends a message with
string representation of channel id (which messages are interested in, usually
id of evaluating job). There can be multiple listeners per channel, even
(shortly) delayed connections will receive all messages from the very beginning.
When monitor receives **progress** message from broker there are two options:
- there is no WebSocket connection for listed channel (job id) -- message is
dropped
- there is active WebSocket connection for listed channel -- message is parsed
into JSON format (see below) and send as string to that established channel.
Messages for active connections are queued, so no messages are discarded even
on heavy workload.
Message from monitor to web application is in JSON format and it has form of
dictionary (associative array). Information contained in this message should
correspond with the ones given by worker to broker. For further description
please read more in "Broker - Worker communication" chapter under "progress"
command.
Message format:
- **command** -- type of progress, one of: DOWNLOADED, FAILED, UPLOADED,
STARTED, ENDED, ABORTED, FINISHED, TASK
- **task_id** -- id of currently evaluated task. Present only if **command** is
"TASK".
- **task_state** -- state of task with id **task_id**. Present only if
**command** is "TASK". Value is one of "COMPLETED", "FAILED" and "SKIPPED".
### Web App - Web API Communication
The provided web application runs as a JavaScript process inside the browser of
the user. It communicates with the REST API on the server through the standard
HTTP requests. Documentation of the main REST API is in a separate
[document](https://recodex.github.io/api/) due to its extensiveness. The results
are returned encoded in JSON which is simply processed by the web application
and presented to the user in an appropriate way.
<!---
// vim: set formatoptions=tqn flp+=\\\|^\\*\\s* textwidth=80 colorcolumn=+1:
-->