<!---
Notes:
* Two-page introduction - what the system should be able to do
* Analysis - what we decide to do, how it could be done, assign importance
- then we can refer to it when explaining why something was not finished,
include advanced features as well
- for features, mention that they are planned for future versions - what is
important and what is not!! Use this to justify which subset of features to
keep; the architecture will then be easier to describe
* Explain the architecture in the analysis
* Keep related works as a separate chapter
* Order - requirements -> related works -> analysis
* The administrator and the exercise author must understand how the components
are interconnected - a general chapter in the analysis - the original analysis
chapter was good, only the list of messages or whatever gets mixed into it -
that does not interest everyone
* After the general introduction - split by potential reader - teacher user,
then admin user
* Installation documentation aside, as the last part
* User documentation - admin: description of permissions; exercise author: the
most extensive part, script format - but phrase it so that it describes what to
click where, describe the language separately - in the future it will be
irrelevant, it needs to go much deeper - describe in detail what they do,
including for example relative/absolute paths, macros, where the compiler sees
libraries and headers... - a chapter at the end
* User documentation for the student: explanation
* How an exercise is scored - hard to say where it belongs - somewhere at the
beginning? But it interests all roles, the teacher must know how to configure
it - also mention how to score by time and memory (in the analysis or in the
introduction) - multiple outputs from the judge, interpolation of points by
memory usage... this is rather outside the user documentation
* Do not write which button to click where
* Tutorials - scenarios, what to do when I want something, sample walkthroughs
* For forms it is best when there is no documentation at all; add labels to the
form fields
* Describe configs somewhere separately in the documentation - score, YAML -
reference documentation
* Definitely not an FAQ, something more structured
* Installation together at the end
* Programmer documentation - "fewest readers" - we already have some of it, no
need to put it into the printed documentation - put a link to the wiki into the
printed documentation, but something must be in the printed one - which
language, design decisions - do not put the justification into the introductory
analysis - write an introduction to the reference documentation - "we
approached the REST API this way, it is divided into these groups, ..."
* What the chosen architecture means; it should also tell something to a user
who does not know the architecture; where the state is kept
* It must be evident from the documentation what the library does and what must
be done by hand - how much work it is - write it more for a user who knows the
technologies but does not know the libraries
* Have sympathy for those who do not know much of it - both the technologies
and the architecture and the CodEx system
* Page numbers do not match
* Downloading a ZIP with Backend outputs - sort into public and secret, public
also for the student
-->
# Introduction

Generally, there are many different ways of, and opinions on, how to teach
people something new. However, most people agree that hands-on experience is one
of the best ways to make the human brain remember a new skill. Learning must be
entertaining and interactive, with fast and frequent feedback. Some kinds of
knowledge are more suitable for this practical type of learning than others, and
fortunately, programming is one of them.

The university education system is one of the areas where this knowledge can be
applied. In computer programming, there are several requirements a program
should satisfy, such as the code being syntactically correct, efficient and easy
to read, maintain and extend.

Checking programs written by students takes time and requires a lot of
mechanical, repetitive work -- reviewing source code, compiling it and running
it through testing scenarios. It is therefore desirable to automate as much of
this work as possible. The first idea of an automatic evaluation system comes
from Stanford University professors in 1965. They implemented a system which
evaluated code in Algol submitted on punch cards. In the following years, many
similar products were written.

In today's world, properties like correctness and efficiency can be tested
automatically to a large extent. This fact should be exploited to help teachers
save time for tasks such as examining bad design, bad coding habits and logical
mistakes, which are difficult to perform automatically.

There are two basic ways of automatically evaluating code -- statically
(checking the source code without running it; safe, but not very precise) or
dynamically (running the code on test inputs and checking the correctness of
the outputs; provides good real-world experience, but requires extensive
security measures).

This project focuses on the machine-controlled part of source code evaluation.
First, general concepts of grading systems are examined and problems of the
software previously used at Charles University in Prague are briefly discussed.
Then new requirements are specified and projects with similar functionality are
examined. With the knowledge acquired from such projects in production, we set
goals for the new evaluation system, designed the architecture and implemented a
fully operational solution based on dynamic evaluation. The system is now ready
for production testing at the university.

## Assignment

The major goal of this project is to create a grading application that will be
used for programming classes at the Faculty of Mathematics and Physics of
Charles University in Prague. However, the application should be designed in a
modular fashion so that it can be easily extended or even modified to make other
ways of usage possible.

The system should be capable of dynamic analysis of submitted source code. This
consists of the following basic steps:

1. compile the code and check for compilation errors
2. run compiled binary in a sandbox with predefined inputs
3. check constraints on used amount of memory and time
4. compare program outputs with predefined values
5. award the code with a numeric score
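
The five steps above can be sketched in a few lines of Python. This is only an
illustration under loud assumptions: the submission is itself a Python program,
"compilation" is reduced to a syntax check, no sandbox is used, memory limits
are not enforced, and the function name `evaluate` is made up for this example.

```python
import subprocess
import sys


def evaluate(source, tests, time_limit=2.0):
    """Return the fraction of passed tests; `tests` is a list of
    (stdin_text, expected_stdout) pairs. Illustrative only."""
    # 1. "compile" -- here merely a syntax check of a Python submission
    try:
        compile(source, "<submission>", "exec")
    except SyntaxError:
        return 0.0

    passed = 0
    for stdin_text, expected in tests:
        try:
            # 2. run the program on a predefined input (NOT sandboxed here)
            # 3. enforce the time limit (memory limits are omitted)
            proc = subprocess.run(
                [sys.executable, "-c", source],
                input=stdin_text, capture_output=True,
                text=True, timeout=time_limit,
            )
        except subprocess.TimeoutExpired:
            continue
        # 4. compare the program output with the predefined value
        if proc.returncode == 0 and proc.stdout.strip() == expected.strip():
            passed += 1

    # 5. award a numeric score
    return passed / len(tests) if tests else 0.0
```

A real evaluator would replace the plain `subprocess.run` call with execution
inside a sandbox and would also measure memory usage.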

The whole system is intended to help both teachers (supervisors) and students.
To achieve this, it is crucial to keep in mind the typical usage scenarios of
the system and to make these tasks as simple as possible. The project has a
great starting point for this -- there is an old grading system currently used
at the university (CodEx), so its flaws and weaknesses can be addressed.
Furthermore, many teachers want to use and test the new system and they are
willing to discuss ideas and problems with us during development.

## Current system

The grading solution currently used at the Faculty of Mathematics and Physics of
Charles University in Prague was implemented in 2006 by a group of students.
It is called [CodEx -- The Code Examiner](http://codex.ms.mff.cuni.cz/project/)
and it has been used with some improvements since then. The original plan was to
use the system only for basic programming courses, but there was demand for
adapting it to many different subjects.

CodEx is based on dynamic analysis. It features a web-based interface where
supervisors can assign exercises to their students and the students have a time
window to submit their solutions. Each solution is compiled and run in a sandbox
(MO-Eval). The checked metrics are the correctness of the output and compliance
with time and memory limits. It supports programs written in C, C++, C#, Java,
Pascal, Python and Haskell.

The system has a database of users. Each user is assigned a role, which
corresponds to the user's privileges. There are user groups reflecting the
structure of lectured courses.

A database of exercises (algorithmic problems) is another part of the project.
Each exercise consists of a text describing the problem (optionally in two
language variants -- Czech and English), an evaluation configuration
(machine-readable instructions on how to evaluate solutions to the exercise) and
a set of inputs and reference outputs. Exercises are created by instructed
privileged users. Assigning an exercise to a group means choosing one of the
available exercises and specifying additional properties: a deadline (optionally
a second deadline), a maximum number of points, a configuration for calculating
the score, a maximum number of submissions, and a list of supported runtime
environments (e.g. programming languages) including specific time and memory
limits for each one.

Typical use cases for the supported user roles are the following:

- **student**
  - create new user account via registration form
  - join a group
  - get assignments in group
  - submit solution to assignment -- upload one source file and trigger
    evaluation process
  - view solution results -- which parts succeeded and failed, total number of
    acquired points, bonus points
- **supervisor**
  - create exercise -- create description text and evaluation configuration
    (for each programming environment), upload testing inputs and outputs
  - assign exercise to group -- choose exercise and set deadlines, number of
    allowed submissions, weights of all testing cases and amount of points for
    correct solutions
  - modify assignment
  - view all results in group
  - check automatic solution grading -- view submitted source and optionally
    set bonus points
- **administrator**
  - create groups
  - alter user privileges -- make supervisor accounts
  - check system logs, upgrades and other management

### Exercise evaluation chain

The most important part of the system is the evaluation of solutions submitted
by students. The consecutive steps from source code to final results are
described in more detail below to give readers a solid overview of what has to
happen during the evaluation process.

First, students submit their solutions through the web user interface. The
system checks assignment invariants (deadlines, number of submissions, ...) and
stores the submitted code. The runtime environment is automatically detected
based on the input file extension and a suitable evaluation configuration
variant is chosen (one exercise can have multiple variants, for example for the
C and Java languages). This exercise configuration then drives the evaluation
process.
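
Detecting the runtime environment from the file extension can be illustrated
with a small lookup. The mapping, the environment names and the function
`detect_environment` below are hypothetical and only show the idea; they are
not taken from CodEx.

```python
from pathlib import Path

# Hypothetical extension-to-environment table (names are made up).
ENVIRONMENTS = {
    ".c": "c",
    ".cpp": "cpp",
    ".java": "java",
    ".py": "python",
}


def detect_environment(filename, exercise_variants):
    """Pick the configuration variant for a submitted file.

    `exercise_variants` is the set of environments the exercise supports.
    """
    environment = ENVIRONMENTS.get(Path(filename).suffix.lower())
    if environment is None or environment not in exercise_variants:
        raise ValueError(f"no evaluation variant for {filename!r}")
    return environment
```

For example, `detect_environment("main.c", {"c", "java"})` selects the `"c"`
variant, while a file with an unsupported extension is rejected before any
evaluation starts.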

There is a pool of uniform worker engines dedicated to evaluation jobs. Incoming
jobs are kept in a queue until a free worker picks them up. A worker evaluates
jobs sequentially, one at a time.
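
The queue-and-workers arrangement can be sketched with threads and a shared
queue. This is a toy model under stated assumptions: real workers are separate
machines or processes, and the names `run_workers` and `evaluate_job` are
invented for the example.

```python
import queue
import threading


def run_workers(jobs, evaluate_job, worker_count=2):
    """Evaluate (job_id, payload) pairs with a pool of uniform workers."""
    job_queue = queue.Queue()
    for job in jobs:
        job_queue.put(job)

    results = {}
    results_lock = threading.Lock()

    def worker():
        # Each worker takes one job at a time until the queue is empty.
        while True:
            try:
                job_id, payload = job_queue.get_nowait()
            except queue.Empty:
                return
            outcome = evaluate_job(payload)
            with results_lock:
                results[job_id] = outcome

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because every worker is identical, any free worker may pick any job; the order
in which jobs finish is therefore not guaranteed, only that each job is
evaluated exactly once.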
The worker obtains the solution and its evaluation configuration, parses it and
starts executing the contained instructions. It is crucial to keep the worker
computer secure and stable, so a sandboxed environment is used for dealing with
unknown source code. When the execution is finished, results are saved and the
submitter is notified.

The output of the worker contains data about the evaluation, such as time and
memory spent on running the program for each test input and whether its output
was correct. The system then calculates a numeric score from this data, which is
presented to the student. If the solution is wrong (incorrect output, too much
memory used, ...), error messages are also displayed to the submitter.
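
A minimal sketch of turning such worker output into a score and error messages
follows. The record layout, the equal weighting of tests and the message texts
are assumptions made for illustration; the actual score configuration is set by
the supervisor.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Outcome of one test input (field names are illustrative)."""
    name: str
    output_correct: bool
    time_s: float
    memory_kib: int


def grade(results, time_limit_s, memory_limit_kib, max_points):
    """Return (points, messages) for one evaluated solution."""
    messages = []
    passed = 0
    for result in results:
        if not result.output_correct:
            messages.append(f"{result.name}: incorrect output")
        elif result.time_s > time_limit_s:
            messages.append(f"{result.name}: time limit exceeded")
        elif result.memory_kib > memory_limit_kib:
            messages.append(f"{result.name}: memory limit exceeded")
        else:
            passed += 1
    # Equal weight for every test; integer points, rounded down.
    points = max_points * passed // len(results) if results else 0
    return points, messages
```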
### Weaknesses
The current system is old, but robust. There were no major security incidents
during its production usage. However, from today's perspective there are
several drawbacks. The main ones are:

- **web interface** -- The web interface is simple and fully functional, but
  the rapid development of web technologies has opened up new possibilities for
  building web interfaces.
- **web API** -- CodEx offers a very limited XML API based on outdated
  technologies that is not sufficient for users who would like to create custom
  interfaces such as a command line tool or mobile application.
- **sandboxing** -- The MO-Eval sandbox is based on the principle of monitoring
  system calls and blocking the bad ones. This can be easily done for
  single-threaded applications, but proves difficult with multi-threaded ones.
  Nowadays, parallelism is a very important area of computing, so there is a
  requirement to test multi-threaded applications too.
- **instances** -- Different CodEx usage scenarios require separate instances
  (Programming I and II, Java, C#, etc.). This configuration is not user
  friendly (students have to register in each instance separately) and burdens
  administrators with unnecessary work. The CodEx architecture does not allow
  sharing hardware between instances, which results in an inefficient use of
  hardware for evaluation.
- **task extensibility** -- There is a need to test and evaluate complicated
  programs for classes such as Parallel programming or Compiler principles,
  which have a more complex evaluation chain than the simple
  compilation/execution/evaluation provided by CodEx.

## Requirements

There are many different formal requirements for the system. Some of them
are necessary for any system for source code evaluation, some of them are
specific to university deployment and some of them arose during the ten-year
lifetime of the old system. There are not many ways to improve the CodEx
experience from the perspective of a student, but a lot of feature requests come
from administrators and supervisors. The ideas were gathered mostly from our
personal experience with the system and from meetings with faculty staff
involved with the current system.

In general, CodEx features should be preserved, so only differences are
presented here. For clarity, all the requirements and wishes are grouped by
category.

### System features

System features represent functionality directly accessible to users of the
system. They describe the evaluation system in general and also university
add-ons (mostly administrative features).

#### Requirements of the users
- _group hierarchy_ -- creating an arbitrarily nested tree structure should be
  supported to allow keeping related groups together, such as in the example
  below. A group hierarchy also allows archiving data from past courses.
  ```
  Summer term 2016
  |-- Language C# and .NET platform
  |   |-- Labs Monday 10:30
  |   `-- Labs Thursday 9:00
  |-- Programming I
  |   |-- Labs Monday 14:00
  ...
  ```
- _a database of exercises_ -- teachers should be able to create exercises
including textual description, sample inputs and correct reference outputs
(for example "sum all numbers from given file and write the result to the
standard output") and to browse this database
- _customizable grading system_ -- teachers need to specify the way of
computation of the final score, which will be awarded to the student's
submissions depending on their quality
- _viewing student details_ -- teachers should be able to view the details of
their students (members of their groups), including all submitted solutions
- _awarding additional points_ -- adding (or subtracting) points from the final
score of a submission by a supervisor must be supported
- _marking a solution as accepted_ -- the system should allow marking one
particular solution as accepted (used for grading the assignment) by the
supervisor
- _solution resubmission_ -- teachers should be able to edit students'
solutions and privately resubmit them, optionally saving all results (including
temporary ones); this feature can be used to quickly fix errors in the
solution
- _localization_ -- all texts (UI and exercises) should be translatable
- _formatted exercise texts_ -- Markdown or another lightweight markup language
should be supported for formatting exercise texts
- _exercise tags_ -- the system should support tagging exercises and searching
by these tags
- _comments_ -- adding both private and public comments to exercises, tests and
solutions should be supported
- _plagiarism detection_
#### Administrative requirements
- _pluggable user interface_ -- the system should allow using an alternative
user interface, such as a command line client; implementation of such clients
should be as straightforward as possible
- _privilege separation_ -- there should be at least two roles -- _student_ and
_supervisor_. Cases when a student of a course is also a teacher of another
lab must be handled correctly
- _alternate authentication methods_ -- logging in through a university
authentication system (e.g. LDAP) and potentially other services, such as
OAuth, should be supported
- _querying SIS_ -- loading user data from the university information system
should be supported
- _sandboxing_ -- there should be a safe environment in which the students'
solutions are executed to prevent system failures due to malicious code being
submitted; the sandboxed environment should have the least possible impact on
measurement results (most importantly on measured times)
- _heterogeneous worker pool_ -- there must be support for submission evaluation
in multiple programming environments in a single installation to avoid
unacceptable workload for the administrator (maintaining a separate
installation for every course) and high hardware occupation
- advanced low-level evaluation flow configuration with high-level abstraction
layer for ordinary configuration cases; the configuration should be able to
express more complicated flows than just compiling a source code and running
the program against test inputs -- for example, some exercises need to build
the source code with a tool, run some tests, then run the program through
another tool and perform additional tests
- use of modern technologies with state-of-the-art compilers
### Non-functional requirements

Non-functional requirements are requirements of a technical character with no
direct mapping to visible parts of the system. In an ideal world, users should
not notice these features as long as they work properly, but would be at least
annoyed if they did not.

- _no installation_ -- the primary user interface of the system must be
accessible on users' computers without the need to install any additional
software
- _performance_ -- the system must be ready for at least hundreds of students
and tens of supervisors using it at once
- _automated deployment_ -- all of the components of the system must be easy to
deploy in an automated fashion
- _open source licensing_ -- the source code should be released under a
permissive licence allowing further development; this also applies to used
libraries and frameworks
- _multi-platform worker_ -- worker machines running Linux, Windows and
potentially other operating systems must be supported
### Conclusion

The survey shows that there are a lot of different requirements and wishes for
the new system. When the system is ready, it is likely that there will be new
ideas of how to use it, and thus the system must be designed to be easily
extensible, so that these new ideas can be implemented either by us or by
community members. This also means that widely used programming languages and
techniques should be used, so that users can quickly understand the code and
make changes.

## Related work

To find out the current state of the field of automatic grading systems, we did
a short market survey covering universities, programming contests, and possibly
other places where similar tools are available.

This is not a complete list of available evaluators, but only a few projects
which are used these days and can be an inspiration for our project. Each
project from the list has a brief description and some key features mentioned.

### Progtest

[Progtest](https://progtest.fit.cvut.cz/) is a private project of [FIT
ČVUT](https://fit.cvut.cz) in Prague. As far as we know, it is used for C/C++
and Bash programming and for knowledge-based quizzes. There are several bonus
points and penalties and also a few hints about what is failing in the submitted
solution. It is very strict on source code quality; for example, it uses the
`-pedantic` option of GCC, Valgrind for detecting memory leaks and array
boundary checks via the `mudflap` library.

### Codility

[Codility](https://codility.com/) is a web-based solution primarily targeted at
company recruiters. It is a commercial product available as a SaaS and it
supports 16 programming languages. The
[UI](http://1.bp.blogspot.com/-_isqWtuEvvY/U8_SbkUMP-I/AAAAAAAAAL0/Hup_amNYU2s/s1600/cui.png)
of Codility is [open source](https://github.com/Codility/cui); the rest of the
source code is not available. One interesting feature is the 'task timeline' --
the captured progress of writing code for each user.

### CMS

[CMS](http://cms-dev.github.io/index.html) is an open source distributed system
for running and organizing programming contests. It is written in Python and
contains several modules. CMS supports the C/C++, Pascal, Python, PHP, and Java
programming languages. PostgreSQL is a single point of failure; all modules
heavily depend on the database connection. Task evaluation can be only a
three-step pipeline -- compilation, execution, evaluation. Execution is
performed in [Isolate](https://github.com/ioi/isolate), a sandbox written by the
consultant of our project, Mgr. Martin Mareš, Ph.D.

### MOE

[MOE](http://www.ucw.cz/moe/) is a grading system written in Shell scripts, C
and Python. It does not provide a default GUI; all actions have to be performed
from the command line. The system does not evaluate submissions in real time;
results are computed in batch mode after the exercise deadline, using Isolate
for sandboxing. Parts of MOE are used in other systems like CodEx or CMS, but
the system is generally obsolete.

### Kattis
[Kattis](http://www.kattis.com/) is another SaaS solution. It provides a clean
and functional web UI, but the rest of the application is too simple. A nice
feature is the usage of a [standardized
format](http://www.problemarchive.org/wiki/index.php/Problem_Format) for
exercises. Kattis is primarily used by programming contest organizers, company
recruiters and also some universities.
# Analysis

None of the existing projects we came across provides all the features requested
for the new system. There is no grading system which supports an
arbitrary-length evaluation pipeline, so we have to implement this feature
ourselves, cautiously treading through unexplored fields. Also, no existing
solution is extensible enough to be used as a base for the new system. After
considering all these facts, it is clear that a new system has to be written
from scratch. This implies that only a subset of all the features will be
implemented in the first version, the others coming in the following releases.

The gathered features are categorized by priority for the whole system. The
highest priority is the main functionality similar to the current CodEx. It is
the baseline for being useful in a production environment, but the new design
makes further development easy. On top of that, most of the ideas from faculty
staff belong to the second priority bucket, which will be implemented as part of
the project. The most complicated tasks from this category are an advanced
low-level evaluation configuration format, use of modern tools, connecting to
university systems and merging separate system instances into a single one.
Other tasks are scheduled for the next releases after a successful project
defense. Namely, these are a high-level exercise evaluation configuration with a
user-friendly interface for common exercise types, SIS integration (when an API
becomes available on their side) and a command-line submission tool. Plagiarism
detection is not likely to be part of any release in the near future unless
someone else creates the engine. The detection problem is too hard to be solved
as part of this project.

We named the new project **ReCodEx -- ReCodEx Code Examiner**. The name
should point to the old CodEx, but also reflect the new approach to solving
issues. **Re** as part of the name means redesigned, rewritten, renewed, or
restarted.

At this point there is a clear idea of how the new system will be used and what
the major enhancements for future releases are. With this in mind, the overall
architecture can be sketched. To sum up, here is a list of key features of the
new system. They come from the previous research of the current system's
drawbacks, reasonable wishes of university users and our major design choices.

- modern HTML5 web frontend written in JavaScript using a suitable framework
- REST API communicating with database, evaluation backend and a file server
- evaluation backend implemented as a distributed system on top of a message
  queue framework with master-worker architecture
- multi-platform worker supporting Linux and Windows environments (the latter
  without a sandbox, as no suitable general-purpose tool is available yet)
- evaluation procedure configured in a human-readable text file, composed of
  small tasks connected into an arbitrary directed acyclic graph
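
To illustrate the last point, a job made of small tasks connected into an
acyclic dependency graph can be ordered with a topological sort. The sketch
below uses Python's standard `graphlib` module (Python 3.9+); the task names and
the dictionary layout are hypothetical and do not reflect the actual
configuration format.

```python
from graphlib import CycleError, TopologicalSorter


def evaluation_order(tasks):
    """Return task names in an order that respects all dependencies.

    `tasks` maps a task name to the set of tasks it depends on.
    """
    try:
        return list(TopologicalSorter(tasks).static_order())
    except CycleError as error:
        raise ValueError("task graph contains a cycle") from error


# Hypothetical job: compile once, run two tests, judge each output.
job = {
    "compile": set(),
    "run-test-1": {"compile"},
    "run-test-2": {"compile"},
    "judge-1": {"run-test-1"},
    "judge-2": {"run-test-2"},
}
order = evaluation_order(job)
```

Any order produced this way starts with `compile` and places each `judge-N`
task after its `run-test-N` task; a configuration with a dependency cycle is
rejected.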

The reasons supporting these decisions are explained in the rest of the analysis
chapter. A lot of smaller design choices are also mentioned, including the
possible options, what was picked for implementation and why. But first, we
discuss the basic concepts of the system.

## Basic concepts

The system is designed as a web application. The requirements say that the user
interface must be accessible from students' computers without the need to
install additional software. This immediately implies that users have to be
connected to the internet, so it is used as the communication medium. Today,
there are two main ways of designing a graphical user interface -- as a native
application or as a web page. Creating a nice multi-platform application with a
graphical interface is almost impossible because of the large number of
different environments. Also, such applications often require installation or
at least downloading their files (sources or binaries). On the other hand,
distributing a web application is easier, because every personal computer has an
internet browser installed. Also, browsers support a (mostly) unified and
standardized environment of HTML5 and JavaScript. CodEx is also a web
application and everybody seems satisfied with it. There are other communication
channels most programmers have available, such as e-mail or git, but they are
inappropriate for designing user interfaces on top of them.

The application interacts with users. It is clear from the project assignment
that the system has to keep personalized data about users and adapt the
presented content according to this knowledge. User data cannot be publicly
visible, which implies the necessity of user authentication. The application
also has to support multiple ways of authentication (university authentication
systems, a company LDAP server, an OAuth server...) and permit adding more
security measures in the future, such as two-factor authentication.

User data also includes a privilege level. The assignment requires at least two
roles, _student_ and _supervisor_. However, it is wise to add an
_administrator_ level, which takes care of the system as a whole and is
responsible for core setup, monitoring, updates and so on. The student role has
the least power; it can basically just view assignments and submit solutions.
Supervisors have more authority, so they can create exercises and assignments,
view results of students etc. Based on the university organization, one more
level could be introduced, _course guarantor_. However, in practice all duties
related to teaching labs are already associated with supervisors, so this role
does not seem very useful. In addition, no one requested more than a three-level
privilege scheme.

School labs are lessons for a group of students led by supervisors. The students
have the same homework and supervisors evaluate their solutions. This
organization has to be carried over into the new system. The counterpart to real
labs are virtual groups. This concept was already discussed in the previous
chapter, including the need for a hierarchical structure of groups. Only a
person who is a student of the university and is recorded in the university
information system has the right to attend labs. To allow restricting group
membership in ReCodEx, there are two types of groups -- _public_ and _private_.
Public groups are open to all registered users, but to become a member of a
private group, one of its supervisors has to add the user. This could be done
automatically at the beginning of the term with data from the information
system, but unfortunately there is no such API yet. However, creating this API
is now being considered by the university leadership. Another equally good
solution for restricting membership of a group is to allow anyone to join the
group subject to confirmation by a supervisor. It has no additional benefits, so
the approach with public and private groups is implemented.

Supervisors using CodEx in their labs usually set a minimum number of points
required to get credit. These points can be obtained by solving assigned
exercises. To visually show users whether they already have enough points,
ReCodEx groups support setting this limit. There are two equivalent ways to set
a limit -- as an absolute value or as a value relative to the maximum. The
latter way seems nicer, so it is implemented. The relative value is set in
percent and is called the threshold.
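
As a worked example of the relative limit, the check below compares the points
a student has earned with the threshold percentage of the maximum. The function
name and the integer arithmetic are assumptions made for illustration.

```python
def has_enough_points(earned, maximum, threshold_percent):
    """True if `earned` reaches `threshold_percent` % of `maximum`.

    Integer arithmetic avoids rounding problems at the boundary.
    """
    return earned * 100 >= maximum * threshold_percent
```

With a maximum of 60 points and a threshold of 75 %, a student needs at least
45 points: `has_enough_points(45, 60, 75)` holds, while 44 points are not
enough.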

Our university has a few partner grammar schools. There was an idea that they
could use CodEx for teaching informatics classes. To make the setup simple for
them, all the software and hardware would be provided by the university as a
completely ready-to-use remote service. However, CodEx was not prepared to
support this kind of usage and no one had time to manage a separate instance.
With ReCodEx it is possible to offer a hosted environment as a service to other
organizations. The concept we came up with is based on user and group separation
inside the system. There are multiple _instances_ in the system, each being a
unit of separation. Each instance has its own set of users and groups; exercises
can optionally be shared. The evaluation backend is common to all instances. To
keep track of active instances and paying customers, each instance must have a
valid _licence_ to allow its users to submit their solutions. A licence is
granted for a defined period of time and can be revoked early if the
organization does not comply with the agreed terms and conditions.
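
The licence rule can be expressed as a small check: an instance may accept
submissions only while it holds at least one licence that has neither expired
nor been revoked. The class and field names below are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Licence:
    """A licence of one instance (illustrative fields only)."""
    valid_until: date
    revoked: bool = False

    def is_valid(self, today):
        return not self.revoked and today <= self.valid_until


def can_submit(instance_licences, today):
    """Users of an instance may submit only if some licence is valid."""
    return any(licence.is_valid(today) for licence in instance_licences)
```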

The main job of the system is to evaluate programming exercises. An exercise is
quite similar to a homework assignment in school labs. When homework is
assigned, two things are important for users to know:

- description of the problem
- metadata -- when and to whom to submit solutions, grading scale, penalties,
  etc.

To reflect this idea, which teachers and students are already familiar with, we
decided to keep the separation between the problem itself (_exercise_) and its
_assignment_. An exercise only describes one problem and provides testing data
with a description of how to evaluate it. In fact, it is a template for
assignments. An assignment then contains data from its exercise and additional
metadata, which can be different for every assignment of the same exercise. This
separation is natural for all users, in CodEx it is implemented in a similar way
and no other reasonable solution was found.
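
The exercise/assignment split can be pictured as two record types, where an
assignment refers to its exercise and adds the per-group metadata. This is a
sketch only; the field names are assumptions and not the actual data model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Exercise:
    """The problem itself: a reusable template for assignments."""
    name: str
    description: str
    test_names: tuple


@dataclass
class Assignment:
    """One use of an exercise in a group, with its own metadata."""
    exercise: Exercise
    group: str
    deadline: datetime
    max_points: int
    submission_limit: int


sum_numbers = Exercise("sum", "Sum all numbers from the given file.", ("t1", "t2"))
monday = Assignment(sum_numbers, "Labs Monday", datetime(2017, 3, 6, 23, 59), 10, 20)
thursday = Assignment(sum_numbers, "Labs Thursday", datetime(2017, 3, 9, 23, 59), 8, 50)
```

Both assignments share the same exercise object, yet each carries its own
deadline, maximum number of points and submission limit.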
### Evaluation unit executed by ReCodEx
8 years ago
One of the bigger requests for the new system is to support complex
configuration of the execution pipeline. The idea comes from lecturers of the
Compiler principles class who want to migrate their semi-manual evaluation
process to CodEx. Unfortunately, CodEx is not capable of such a complicated
exercise setup. None of the evaluation systems we found can handle such a task,
so a design from scratch is needed.
There are two main approaches to designing a complex execution configuration. It
can be composed of a small number of relatively big components or of many more
small tasks. Big components are easy to write and the whole configuration is
reasonably small. However, such components are designed for current problems, so
they do not adapt well to future needs. This can be solved by introducing a
small set of single-purpose tasks which can be composed together. The whole
configuration is then considerably bigger, but it adapts much better to new
conditions and the tasks themselves take less work to program. For better user
experience, configuration generators for some common cases can be introduced.
ReCodEx is meant to be continuously developed and used for many years, so the
smaller tasks are the right choice. Observation of the CodEx system shows that
only a few tasks are needed. In the extreme case, only one task is enough --
execute a binary. However, for better portability of configurations across
different systems it is better to implement a reasonable subset of operations
directly without calling system-provided binaries. These operations are copy
file, create new directory, extract archive and so on, altogether called
internal tasks. Another benefit of a custom implementation of these tasks is
guaranteed safety, so no sandbox needs to be used, unlike in the case of
external tasks.
For a job evaluation, the tasks need to be executed sequentially in a specified
order. Running independent tasks in parallel is a bad idea because exact
time measurement needs a controlled environment on the target computer with
as few interruptions by other processes as possible. It would be possible to run
tasks which do not need exact time measurement in parallel, but in this case a
synchronization mechanism would have to be developed to exclude parallelism for
measured tasks. Usually, there are about four times more unmeasured tasks than
tasks with time measurement, but measured tasks tend to be much longer. With
[Amdahl's law](https://en.wikipedia.org/wiki/Amdahl's_law) in mind,
parallelism does not seem to provide a huge benefit in overall execution speed
and brings trouble with synchronization. However, if there are speed issues,
this approach could be reconsidered.
It seems that connecting tasks into a directed acyclic graph (DAG) can handle
all possible problem cases. None of the authors, supervisors and involved
faculty staff can think of a problem that cannot be decomposed into tasks
connected in a DAG. The goal of evaluation is to satisfy as many tasks as
possible. During execution there are sometimes multiple choices for the next
task. To control that, each task can have a priority, which is used as a
secondary ordering criterion. For better understanding, here is a small example.
![Task serialization](https://github.com/ReCodEx/wiki/raw/master/images/Assignment_overview.png)
The _job root_ task is an imaginary single starting point of each job. When the
_CompileA_ task is finished, the _RunAA_ task is started (or _RunAB_, but the
choice should be deterministic, given by position in the configuration file --
tasks stated earlier should be executed earlier). The task priorities guarantee
that after the _CompileA_ task all dependent tasks are executed before the
_CompileB_ task (they have a higher priority number). To sum up, connections
between tasks represent dependencies, and priorities can be used to order
unrelated tasks and thus provide a total ordering of them. For well written jobs
the priorities may not be so useful, but they can help control execution order,
for example to avoid a situation where each test of the job generates a large
temporary file and a valid execution order keeps all the temporary files for
later processing at one time. A better approach is to finish execution of one
test, clean up the big temporary file and proceed with the following test. If
there is an ambiguity in task ordering at this point, the tasks are executed in
the order of the input task configuration.
A total linear ordering of tasks could be achieved more easily by just executing
them in the order of the input configuration. But this structure cannot handle
well the cases when a task fails. There is no easy and nice way to tell which
task should be executed next. However, this issue can be solved with graph
structured dependencies of the tasks. In a graph structure, it is clear that all
dependent tasks have to be skipped and execution continues with an unrelated
task. This is the main reason why the tasks are connected in a DAG.
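
The ordering and skipping rules described above can be sketched in a few lines. This is a minimal illustration under our own assumptions, not the worker implementation: tasks are plain dictionaries with hypothetical keys `id`, `priority` and `deps`, the priority values are made up, and the imaginary _job root_ task is left out. Among the tasks whose dependencies are satisfied, the one with the highest priority goes first and ties are broken by position in the configuration.

```python
import heapq


def linearize(tasks):
    """Order tasks by dependencies, then priority (higher first), then position."""
    position = {t["id"]: i for i, t in enumerate(tasks)}
    by_id = {t["id"]: t for t in tasks}
    waiting_for = {t["id"]: len(t["deps"]) for t in tasks}
    dependents = {t["id"]: [] for t in tasks}
    for t in tasks:
        for dep in t["deps"]:
            dependents[dep].append(t["id"])

    ready = [(-t["priority"], position[t["id"]], t["id"])
             for t in tasks if not t["deps"]]
    heapq.heapify(ready)
    order = []
    while ready:
        _, _, task_id = heapq.heappop(ready)
        order.append(task_id)
        for child in dependents[task_id]:
            waiting_for[child] -= 1
            if waiting_for[child] == 0:
                heapq.heappush(
                    ready, (-by_id[child]["priority"], position[child], child))
    if len(order) != len(tasks):
        raise ValueError("task dependencies contain a cycle")
    return order


def run_job(tasks, execute):
    """Run tasks in order; skip every task that depends on a failed or skipped one."""
    by_id = {t["id"]: t for t in tasks}
    unusable = set()
    results = {}
    for task_id in linearize(tasks):
        if any(dep in unusable for dep in by_id[task_id]["deps"]):
            results[task_id] = "skipped"
            unusable.add(task_id)
        elif execute(task_id):
            results[task_id] = "ok"
        else:
            results[task_id] = "failed"
            unusable.add(task_id)
    return results


# Hypothetical job inspired by the picture above.
job = [
    {"id": "CompileA", "priority": 2, "deps": []},
    {"id": "RunAA", "priority": 3, "deps": ["CompileA"]},
    {"id": "RunAB", "priority": 3, "deps": ["CompileA"]},
    {"id": "CompileB", "priority": 1, "deps": []},
    {"id": "RunBA", "priority": 2, "deps": ["CompileB"]},
]
order = linearize(job)
results = run_job(job, lambda task_id: task_id != "CompileA")
```

When _CompileA_ fails in this sketch, both of its dependent tasks are skipped and the evaluation continues with _CompileB_, which is exactly the behaviour that a plain linear list of tasks cannot express.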
For grading there are several important tasks. First, tasks executing submitted
code need to be checked for time and memory limits. Second, outputs of judging
tasks need to be checked for correctness (represented by return value or by data
on standard output) and should not fail. This division can be transparent for
the backend, since each task is executed the same way. But the frontend must
know which tasks from the whole job are important and what their kind is. It is
reasonable to keep this piece of information alongside the tasks in the job
configuration, so each task can have a label about its purpose. Unlabeled tasks
have an internal type _inner_. There are four categories of tasks:
- _initiation_ -- setting up the environment, compiling code, etc.; for users,
failure means an error in their sources which prevents running them
with examination data
- _execution_ -- running the user code with examination data, must not exceed
time and memory limits; for users, failure means wrong design, slow data
structures, etc.
- _evaluation_ -- comparing user and examination outputs; for users, failure
means that the program does not compute the right results
- _inner_ -- no special meaning for the frontend, technical tasks for fetching
and copying files, creating directories, etc.
Each job is composed of multiple tasks of these types which are semantically
grouped into tests. A test can represent one set of examination data for user
code. To mark the grouping, another task label can be used. Each test must have
exactly one _evaluation_ task (to show success or failure to users) and an
arbitrary number of tasks of other types.
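
The rule about tests can be checked mechanically. The following sketch assumes that each task is a dictionary with optional, hypothetical keys `type` and `test_id`; the real job configuration format may differ.

```python
from collections import Counter

TASK_TYPES = {"initiation", "execution", "evaluation", "inner"}


def invalid_tests(tasks):
    """Return ids of tests that do not have exactly one evaluation task."""
    evaluations = Counter()
    tests = set()
    for task in tasks:
        task_type = task.get("type", "inner")  # unlabeled tasks are inner
        if task_type not in TASK_TYPES:
            raise ValueError(f"unknown task type: {task_type}")
        test_id = task.get("test_id")
        if test_id is None:
            continue  # the task does not belong to any test
        tests.add(test_id)
        if task_type == "evaluation":
            evaluations[test_id] += 1
    return sorted(test for test in tests if evaluations[test] != 1)


example_tasks = [
    {"id": "compile", "type": "initiation"},
    {"id": "fetch_input_1", "test_id": "A"},
    {"id": "run_1", "type": "execution", "test_id": "A"},
    {"id": "judge_1", "type": "evaluation", "test_id": "A"},
    {"id": "run_2", "type": "execution", "test_id": "B"},
]
problems = invalid_tests(example_tasks)
```

In this example test `B` is reported because it has no _evaluation_ task.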
### Evaluation progress state
Users surely want to know the progress of their submitted solution. This kind
of functionality comes in particularly handy for long-running exercises. Thanks
to progress reporting, users know immediately if anything goes wrong, not to
mention the psychological effect of seeing that the whole system and its parts
are working and doing something. That is why this feature was considered from
the beginning, but there are multiple ways to approach it.
The very first idea would be to provide progress state based on completion
messages from compilation, execution and evaluation, which is something that a
lot of evaluation systems provide. This information is high level enough for
users, so they probably know what is going on and what is executing right now.
If compilation fails, users know that their solution cannot be compiled; if
execution fails, there were some problems with their program. This kind of
progress state is nice, clear and understandable. But as we have learnt, ReCodEx
has to have a more advanced execution pipeline in which there can be more
compilations or more executions. In addition, the parts of the system which
execute users' solutions do not have to know precisely what they are executing
at the moment. This kind of information may be meaningless for them.
That is why another approach to progress state was considered. As we know by
now, one of the best ways to ensure generality is to have jobs with
single-purpose tasks. These tasks can be anything, some internal operation or
execution of an external, sandboxed program. Based on this there is one very
simple way to provide a general progress state which is independent of task
types. We know that a job has some number of tasks which have to be executed, so
we can send state info after the execution of every task. That is how we get the
percentage of completion of an execution. It is a rather plain and standard way,
but something else and more appealing to users can be built on top of it.
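
A minimal sketch of this task-based progress reporting follows. The function and parameter names are hypothetical and the callback `notify` stands in for whatever channel delivers the state to the user.

```python
def report_progress(task_ids, execute, notify):
    """Execute tasks one by one and report percentage after each of them."""
    total = len(task_ids)
    for done, task_id in enumerate(task_ids, start=1):
        execute(task_id)
        notify(100 * done // total)


percentages = []
report_progress(
    ["fetch", "compile", "run", "judge"],  # made-up task names
    lambda task_id: None,                  # stand-in for real task execution
    percentages.append,
)
```

The reporting code never needs to know what a task does, only how many tasks there are.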
Displaying progress to users can be done in numerous ways. We have the
percentage of completion, which of course suggests the simple solution of
displaying only the percentage or some kind of standard graphical progress bar.
But that is rather ordinary, so let us consider something else. One good idea is
to have some kind of puzzle image or images which are composed together
according to progress. It is a nice way, but rather challenging without a
designer around. Another original solution is to have a database of random,
mildly funny statements, one of which is displayed every time a task is
completed. It is easy to implement, the messages are easy to make up, and it is
quite new and original. That is why this last solution was chosen for displaying
progress state.
### Results of evaluation
There are a lot of things which deserve discussion concerning the results of
evaluation: how they should be displayed, what should be visible or not and also
what kind of reward for users' solutions should be chosen.
At first let us focus on all kinds of outputs from programs executed within a
job. It is beyond discussion that supervisors should be able to view almost all
outputs from solutions if they choose them to be visible and recorded. This
feature is critical in debugging either whole exercises or users' solutions. A
supervisor should have the choice to turn on preserving the data, while the
default behaviour is to discard them to keep the amount of files stored by the
ReCodEx system within sensible limits.
A more interesting question is whether students should see the logs from the
execution of their solution. The usual approach is to keep this information
private because of the possibility of leaking input data. Such a leak may lead
students to hack their solutions to pass just the ReCodEx test cases instead of
properly solving the assigned problem. Martin Mareš strongly recommended this
strategy of hiding sensitive data too, so ReCodEx follows it. One exception is
compilation output, which can help students a lot during troubleshooting. These
logs shall be visible unless the supervisor decides otherwise. Note that due to
a lack of frontend developers, this feature was not implemented in the very
first release of ReCodEx, but it will definitely be available in the future.
The overall concept of grading solutions was presented earlier. As a brief
reminder, the backend returns only exact measured values (used time and memory,
return code of the judging task, ...) and on top of that one value is computed.
The way this value is computed can differ a lot between supervisors, so it has
to be easily extendable. The best way is to provide an interface which can be
implemented, so that any kind of computation can produce the final value.
We found several possible ways of computing it. There is the basic arithmetic,
weighted arithmetic, geometric and harmonic mean of the results of each test
(the result is a logical value succeeded/failed, optionally with a weight), some
kind of interpolation of the amount of time used for each test, the same with
the amount of memory used, and surely many others. To keep the project simple,
we decided to design an appropriate interface and implement only the weighted
arithmetic mean computation, which is used in about 90% of all assignments. Of
course, a different scheme can be chosen for every assignment and the scheme can
also be configured -- for example by specifying test weights for the implemented
weighted arithmetic mean. Advanced ways of computation can be implemented when
there is a real demand for them.
To avoid assigning points for insufficient solutions (like only printing "File
error" which is the valid answer in two tests), a minimal point threshold can be
specified. If the solution would get fewer points than specified, it gets
zero points instead. This functionality could be embedded into the grading
computation algorithm itself, but it would have to be present in each
implementation separately, which is a bit ugly. So, this feature is separated
from point computation.
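
The weighted arithmetic mean and the separate threshold step can be sketched as two independent functions. This is an illustration under our own assumptions (function names, a score in the range 0 to 1, made-up test names and weights), not the code used by ReCodEx.

```python
def weighted_mean_score(results, weights):
    """Weighted arithmetic mean of boolean test results, in the range 0..1."""
    total = sum(weights[name] for name in results)
    passed = sum(weights[name] for name, ok in results.items() if ok)
    return passed / total


def award_points(score, max_points, threshold):
    """Apply the minimal point threshold independently of the score calculator."""
    points = score * max_points
    return points if points >= threshold else 0


# A solution which only prints "File error" passes the two error-handling tests.
weights = {"missing_file": 1, "unreadable_file": 1, "small_input": 4, "big_input": 4}
lazy = {"missing_file": True, "unreadable_file": True,
        "small_input": False, "big_input": False}
lazy_score = weighted_mean_score(lazy, weights)
lazy_points = award_points(lazy_score, max_points=10, threshold=5)
```

Because the threshold lives outside `weighted_mean_score`, any other score calculator can be plugged in without repeating the threshold logic.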
Automatic grading cannot reflect all aspects of submitted code, for example
the structure of the code, the number and quality of comments and so on. To
allow supervisors to bring these manually checked things into grading, there is
a concept of bonus points. They can be positive or negative. Generally, the
solution with the most assigned points is the one used for grading that
particular assignment. However, if a supervisor is not satisfied with a
student's solution (really bad code, cheating, ...), he or she can assign the
student negative bonus points. To prevent this decision from being overridden by
the system choosing another solution with more points, or even by the student
submitting the same code again which evaluates to more points, the supervisor
can explicitly mark a particular solution to be used for grading instead of the
solution with the most points.
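
The selection rule can be sketched as follows. We assume here that the total of a solution is the sum of its automatically computed points and its bonus points, and that solutions are dictionaries with hypothetical keys `points`, `bonus` and `marked`; none of this is prescribed by the text above.

```python
def solution_for_grading(solutions):
    """Prefer a solution marked by the supervisor, otherwise the best total."""
    for solution in solutions:
        if solution.get("marked"):
            return solution
    return max(solutions, key=lambda s: s["points"] + s["bonus"])


# The student resubmits the same (plagiarized) code after being penalized.
penalized = {"id": 1, "points": 10, "bonus": -10, "marked": True}
resubmitted = {"id": 2, "points": 10, "bonus": 0}
chosen = solution_for_grading([penalized, resubmitted])
```

Without the mark, the resubmitted solution would win and the penalty would be lost.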
### Persistence
Previous parts of the analysis show that the system has to keep some state. This
could be user settings, group membership, evaluated assignments and so on. The
data have to be kept across restarts, so persistence is an important decision
factor. There are several ways to save structured data:
- plain files
- NoSQL database
- relational database
Another important factor is the amount and size of stored data. Our guess is
about 1000 users, 100 exercises, 200 assignments per year and 200000 unique
solutions per year. The data are mostly structured and there are a lot of
records with the same format. For example, there are a thousand users and each
one has the same attributes -- name, email, age, etc. This kind of data is
relatively small: name and email are short strings, age is an integer.
Considering this, relational databases or formatted plain files (CSV for
example) fit best. However, the data often have to support a find operation, so
they have to be sorted and allow random access for resolving cross references.
Also, addition and deletion of entries should take reasonable time (at most
logarithmic time complexity in the number of saved values). This practically
excludes plain files, so a relational database is used instead.
On the other hand, there are some data with much less structure and much
larger size. These can be evaluation logs, sample input files for exercises or
sources submitted by students. Saving this kind of data into a relational
database is not suitable; it is better to keep them as ordinary files or store
them in some kind of NoSQL database. Since they are already files and do not
need to be backed up in multiple copies, it is easier to keep them as ordinary
files in the filesystem. Also, this solution is more lightweight and does not
require additional dependencies on third-party software. A file can be
identified using its filesystem path or a unique index stored as a value in the
relational database. Both approaches are equally good; the final decision
depends on the actual case.
## Structure of the project
There are numerous ways to divide a system into separate services, from one
single component to a great many single-purpose components. Having only one big
service is not feasible: it is not scalable enough and, above all, it would be
one big, very complex blob of code, so this is not the way. The opposite
extreme, having a lot of single-purpose components, is also impractical. It is
scalable by default and all services would have quite simple code, but on the
other hand the communication requirements of such a solution would be enormous.
So an approach somewhere in the middle has to be chosen: services have to
communicate in a manner which does not overload the network, the code base
should be of reasonable size and the whole system has to be scalable enough.
With this in mind, we can discuss the particular division of the ReCodEx system.
The ReCodEx project is divided into two logical parts, the *backend* and the
*frontend*, which interact with each other and which cover the whole area of
code examination. These logical parts are independent of each other in the
sense that they can be installed on separate machines at different locations and
that one of the parts can be replaced with a different implementation; as
long as the communication protocols are preserved, the system will continue
working as expected.
*Backend* is the part which is responsible solely for the process of evaluating
a solution of an exercise. Each evaluation of a solution is referred to as a
*job*. For each job, the system expects a configuration document of the job,
supplementary files for the exercise (e.g., test inputs, expected outputs,
predefined header files), and the solution of the exercise (typically source
codes created by a student). There might be some specific requirements for the
job, such as a specific runtime environment, specific version of a compiler or
the job must be evaluated on a processor with a specific number of cores. The
backend infrastructure decides whether it will accept a job or decline it based
on the specified requirements. In case it accepts the job, it will be placed in
a queue and it will be processed as soon as possible. The backend publishes the
progress of processing of the queued jobs and the results of the evaluations can
be queried after the job processing is finished. The backend produces a log of
the evaluation and scores the solution based on the job configuration document.
From the scalability point of view there are two necessary components: one
which executes jobs and one which distributes jobs to the instances of the first
one. This ensures scalability through parallel execution of numerous jobs, which
is exactly what is needed. The implementations of these services are called
**broker** and **worker**; the first one handles distribution, the latter
execution. These components should be enough to fulfill all of the above, but
for the sake of simplicity and better communication gateways with the frontend
two other components were added, **fileserver** and **monitor**. Fileserver is a
simple component whose purpose is to store files which are exchanged between
frontend and backend. Monitor is also a quite simple service which is able to
pass job progress state from worker to web application. These two additional
services are on the edge of frontend and backend (like gateways) but logically
they are more connected with the backend, so they are considered to belong
there.
*Frontend* on the other hand is responsible for the communication with the users
and provides them a convenient access to the backend infrastructure. The
frontend manages user accounts and gathers them into units called groups. There
is a database of exercises which can be assigned to the groups and the users of
these groups can submit their solutions for these assignments. The frontend will
initiate evaluation of these solutions by the backend and it will store the
results afterwards. The results will be visible to authorized users and the
results will be awarded with points according to the score given by the backend
in the evaluation process. The supervisors of the groups can edit the parameters
of the assignments, review the solutions and the evaluations in detail and award
the solutions with bonus points (both positive and negative) and discuss
the solution with its author. Some of the users can be entitled
to create new exercises and extend the database of exercises which can be
assigned to the groups later on.
There are two main purposes of frontend -- holding the state of whole system
(database of users, exercises, solutions, points, etc.) and presenting the state
to users through some kind of a user interface (e.g., a web application, mobile
application, or a command-line tool). According to contemporary trends in
development of frontend parts of applications, we decided to split the frontend
in two logical parts -- a server side and a client side. The server side is
responsible for managing the state and the client side gives instructions to the
server side based on the inputs from the user. This decoupling gives us the
ability to create multiple client side tools which may address different needs
of the users.
The frontend developed as part of this project is a web application created with
the needs of the Faculty of Mathematics and Physics of the Charles University in
Prague in mind. The users are the students and their teachers, groups correspond
to the different courses, the teachers are the supervisors of these groups. We
believe that this model is applicable to the needs of other universities,
schools, and IT companies, which can use the same system for their needs. It is
also possible to develop their own frontend with their own user management
system for their specific needs and use the possibilities of the backend without
any changes, as was mentioned in the previous paragraphs.
One possible configuration of the ReCodEx system is illustrated in the following
picture, where there is one shared backend with three workers and two separate
instances of the whole frontend. This configuration may be suitable for MFF UK
-- the basic programming course and the KSP competition. However, sharing the
web API and fileserver and having only custom instances of the client (web app
or own implementation) may be even more likely to be used. Note that connections
between components are not fully accurate.
![Overall architecture](https://github.com/ReCodEx/wiki/blob/master/images/Overall_Architecture.png)
In later parts of the documentation, both the backend and frontend parts
will be introduced separately and covered in more detail. The communication
protocol between these two logical parts will be described as well.
## Implementation analysis
When developing a project like ReCodEx there has to be some discussion over
implementation details and how to solve some particular problems properly. This
discussion is a never-ending story which goes on through the whole development
process. Some of the most important implementation problems or interesting
observations will be discussed in this chapter.
### General communication
The overall design of the project is discussed above. There is a bunch of
components, each with its own responsibility. An important thing to design is
the communication between these components. All we can count on is that they are
connected by a network. To choose a suitable protocol, there are some additional
requirements that should be met:
- reliability -- if a message is sent between components, the protocol has to
ensure that it is received by target component
- working over IP protocol
- multi-platform and multi-language usage
TCP/IP meets these conditions; however, it is quite low level and working with
it usually requires working with a platform dependent, non-object API. A common
way to address these drawbacks is to use some framework which provides a better
abstraction and a more suitable API. We decided to go this way, so the following
options were considered:
- CORBA -- CORBA is a well known framework for remote object invocation. There
are multiple implementations for almost every known programming language. It
fits nicely into an object oriented programming environment.
- RabbitMQ -- RabbitMQ is a messaging framework written in Erlang. It has
bindings to a huge number of languages and a large community. Also, it is
capable of routing requests, which could be a handy feature for job load
balancing.
- ZeroMQ -- ZeroMQ is another messaging framework, but instead of being a
separate service it is a small library which can be embedded into one's own
projects. It is written in C++ with a huge number of bindings.
We like CORBA, but our system should be more loosely coupled, so (asynchronous)
messaging is a better approach in our view. RabbitMQ seems nice with the great
advantage of its routing capability, but it is quite a heavy service written in
a language no one from the team knows, so we do not like it much. ZeroMQ is the
best option for us. However, any of the three options could have been used.
Frontend communication follows from the decision that ReCodEx should be
primarily a web application. The communication protocol has to reflect the
client-server architecture. There are several options:
- *TCP sockets* -- TCP sockets give a reliable means of a full-duplex
communication. All major operating systems support this protocol and there are
libraries which simplify the implementation. On the other hand, it is not
possible to open a plain TCP socket from a web browser.
- *WebSockets* -- The WebSocket standard is built on top of TCP. It enables a
web browser to connect to a server over a TCP socket. WebSockets are
implemented in recent versions of all modern web browsers and there are
libraries for several programming languages like Python or JavaScript (running
in Node.js). Encryption of the communication over a WebSocket is supported as
a standard.
- *HTTP protocol* -- The HTTP protocol is a stateless protocol implemented on
top of the TCP protocol. The communication between the client and server
consists of requests sent by the client and responses to these requests sent
back by the server. The client can send as many requests as needed and it may
ignore the responses from the server, but the server must respond only to the
requests of the client and it cannot initiate communication on its own.
End-to-end encryption can be achieved easily using SSL (HTTPS).
We chose the HTTP(S) protocol because of the simple implementation in all sorts
of operating systems and runtime environments on both the client and the server
side.
The API of the server should expose basic CRUD (Create, Read, Update, Delete)
operations. There are some options for what kind of messages to send over HTTP:
- SOAP -- a protocol for exchanging XML messages. It is very robust and complex.
- REST -- a stateless architectural style, not a protocol or a technology. It
usually (but not necessarily) relies on HTTP and its methods (e.g., GET, POST,
PUT, DELETE). It can fully implement the CRUD operations.
Even though there are some other technologies, we chose the REST style over the
HTTP protocol. It is widely used, there are many tools available for development
and testing, and it is well understood by programmers, so it should be easy for
a new developer with some experience in client-side applications to get familiar
with the ReCodEx API and develop a client application.
To sum up, the chosen ways of communication inside the ReCodEx system are captured
in the following image. Red connections are through ZeroMQ sockets, blue are
through WebSockets and green are through HTTP(S).
![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)
### Broker
The broker is responsible for keeping track of available workers and
distributing jobs that it receives from the frontend between them.
#### Worker management
It is intended for the broker to be a fixed part of the backend infrastructure
to which workers connect at will. Thanks to this design, workers can be added
and removed when necessary (and possibly in an automated fashion), without
changing the configuration of the broker. An alternative solution would be
configuring a list of workers before startup, thus making them passive in the
communication (in the sense that they just wait for incoming jobs instead of
connecting to the broker). However, this approach comes with a notable
administration overhead -- in addition to starting a worker, the administrator
would have to update the worker list.
Worker management must also take into account the possibility of worker
disconnection, either because of a network or software failure (or termination).
A common way to detect such events in distributed systems is to periodically
send short messages to other nodes and expect a response. When these messages
stop arriving, we presume that the other node encountered a failure. Both the
broker and workers can be made responsible for initiating these exchanges and it
seems that there are no differences stemming from this choice. We decided that
the workers will be the active party that initiates the exchange.
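
The broker side of such a liveness check can be sketched as a table of last-seen timestamps. The class name, the method names and the timeout value are our assumptions; the time is passed in explicitly so that the sketch stays deterministic.

```python
class WorkerRegistry:
    """Tracks when each worker was last heard from (hypothetical helper)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def ping(self, worker_id, now):
        """Record a ping (or any other message) received from a worker."""
        self.last_seen[worker_id] = now

    def expire(self, now):
        """Remove and return workers that have been silent for too long."""
        dead = [worker_id for worker_id, seen in self.last_seen.items()
                if now - seen > self.timeout]
        for worker_id in dead:
            del self.last_seen[worker_id]
        return dead


registry = WorkerRegistry(timeout=3.0)
registry.ping("worker-1", now=0.0)
registry.ping("worker-2", now=0.0)
registry.ping("worker-1", now=2.5)   # worker-2 stays silent
expired = registry.expire(now=4.0)
```

After the call to `expire`, only `worker-1` remains registered and the broker can reassign whatever `worker-2` was processing.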
#### Scheduling
Jobs should be scheduled in a way that ensures that they will be processed
without unnecessary waiting. This depends on the fairness of the scheduling
algorithm (no worker machine should be overloaded).
The design of such scheduling algorithm is complicated by the requirements on
the diversity of workers -- they can differ in operating systems, available
software, computing power and many other aspects.
We decided to keep the details of connected workers hidden from the frontend,
which should lead to a better separation of responsibilities and flexibility.
Therefore, the frontend needs a way of communicating its requirements on the
machine that processes a job without knowing anything about the available
workers. A key-value structure is suitable for representing such requirements.
With respect to these constraints, and because the analysis and design of a more
sophisticated solution was declared out of scope of our project assignment, a
rather simple scheduling algorithm was chosen. The broker shall maintain a queue
of available workers. When assigning a job, it traverses this queue and chooses
the first machine that matches the requirements of the job. This machine is then
moved to the end of the queue.
The presented algorithm results in a simple round-robin load balancing strategy,
which should be sufficient for small-scale deployments (such as a single
university). However, with a large number of jobs, some workers can easily
become overloaded. The implementation must allow for a simple replacement of the
load balancing strategy so that this problem can be solved in the near future.
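
The queue-based algorithm described above can be sketched as follows. The class name, the worker identifiers and the `env` key are made up for the illustration; requirements and worker properties are both plain key-value dictionaries and a worker matches when it has every required key with the same value.

```python
from collections import deque


class RoundRobinScheduler:
    """Queue of workers; the chosen worker moves to the end of the queue."""

    def __init__(self):
        self.workers = deque()

    def add_worker(self, worker_id, properties):
        self.workers.append((worker_id, properties))

    def assign(self, requirements):
        """Return the first worker matching all requirements, or None."""
        for index, (worker_id, properties) in enumerate(self.workers):
            if all(properties.get(key) == value
                   for key, value in requirements.items()):
                del self.workers[index]
                self.workers.append((worker_id, properties))
                return worker_id
        return None


scheduler = RoundRobinScheduler()
scheduler.add_worker("w1", {"env": "c"})
scheduler.add_worker("w2", {"env": "c"})
scheduler.add_worker("w3", {"env": "java"})
assigned = [scheduler.assign({"env": "c"}) for _ in range(3)]
```

Replacing the load balancing strategy then amounts to providing another class with the same `assign` method.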
#### Forwarding jobs
Information about a job can be divided in two disjoint parts -- what the worker
needs to know to process it and what the broker needs to forward it to the
correct worker. It remains to be decided how this information will be
transferred to its destination.
It is technically possible to transfer all the data required by the worker at
once through the broker. This package could contain submitted files, test
data, requirements on the worker, etc. A drawback of this solution is that
both submitted files and test data can be rather large. Furthermore, it is
likely that test data would be transferred many times.
Because of these facts, we decided to store data required by the worker using a
shared storage space and only send a link to this data through the broker. This
approach leads to a more efficient network and resource utilization (the broker
doesn't have to process data that it doesn't need), but also makes the job
submission flow more complicated.
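
The resulting flow can be sketched like this. Everything here is an assumption made for the illustration: the dictionary stands in for the shared storage, the `storage://` URL scheme and the message fields are invented, and JSON is used only to show that the message travelling through the broker stays small.

```python
import json

shared_storage = {}  # stand-in for the shared storage space


def submit_job(job_id, archive, requirements):
    """Frontend side: store the data and send only a link through the broker."""
    url = f"storage://submissions/{job_id}"  # hypothetical URL scheme
    shared_storage[url] = archive
    return json.dumps(
        {"job_id": job_id, "requirements": requirements, "archive_url": url})


def forward_job(message):
    """Broker side: keep what is needed for routing, pass the rest on."""
    request = json.loads(message)
    routing_info = request["requirements"]
    for_worker = {"job_id": request["job_id"],
                  "archive_url": request["archive_url"]}
    return routing_info, for_worker


def fetch_archive(for_worker):
    """Worker side: download the data directly from the shared storage."""
    return shared_storage[for_worker["archive_url"]]


archive = b"x" * 1_000_000  # a large submission
message = submit_job("job-1", archive, {"env": "c"})
routing_info, for_worker = forward_job(message)
```

The broker handles a message of roughly a hundred bytes, while the megabyte of data is read only by the worker that actually needs it.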
#### Further requirements
The broker can be viewed as a central point of the backend. While it has only
two primary, closely related responsibilities, other requirements have arisen
(forwarding messages about job evaluation progress back to the frontend) and
will arise in the future. To facilitate such requirements, its architecture
should allow new communication flows to be added easily. It should also be as
asynchronous as possible to enable efficient communication with external
services, for example via HTTP.
### Worker
The worker is the component that executes jobs received from the broker. As
such, it should run on a wide range of infrastructures and possibly on several
platforms and operating systems. Support for at least two major operating
systems is desirable and should be implemented. The worker as a service does not
have to be very complicated, but some complex behaviour is needed. This
complexity is almost exclusively concerned with robust communication with the
broker, which has to be checked regularly. A ping mechanism is commonly used for
this purpose, which means that the worker must be able to send ping messages
even while a job is being executed. The worker therefore has to be divided into
two separate parts: one that handles communication with the broker and another
that executes jobs. The easiest solution is to run these parts in separate
threads that communicate closely with each other. Numerous technologies can be
used for this inter-thread communication, from shared memory guarded by
condition variables to some kind of in-process messages. The ZeroMQ library,
which is already used in the project, provides in-process messages that work on
the same principles as network communication, which is convenient and solves
problems with thread synchronization.
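The division into a communicating part and an executing part can be sketched as
follows. This is only an illustration: it uses the standard-library
`queue.Queue` as a stand-in for ZeroMQ in-process sockets, and the function
names, the job names and the timing constants are assumptions made for the
example.

```python
import queue
import threading
import time


def executor(jobs: queue.Queue, results: queue.Queue) -> None:
    """Execution part: takes jobs one by one and reports results."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut down
            break
        time.sleep(0.05)  # stand-in for real job execution
        results.put(f"done:{job}")


def listener(incoming: list, jobs: queue.Queue, results: queue.Queue) -> list:
    """Communication part: forwards jobs and keeps pinging while they run."""
    log = []
    for job in incoming:
        jobs.put(job)
        while True:
            try:
                log.append(results.get(timeout=0.01))
                break
            except queue.Empty:
                log.append("ping")  # would be sent to the broker
    jobs.put(None)
    return log


jobs, results = queue.Queue(), queue.Queue()
thread = threading.Thread(target=executor, args=(jobs, results))
thread.start()
log = listener(["job-1", "job-2"], jobs, results)
thread.join()
```

The listening part never blocks for longer than its timeout, so it can keep
answering the broker while the other thread is busy with a job.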
At this point we have a worker with two internal parts, a listening one and an
executing one. The implementation of the first one is quite straightforward, so
let us discuss what should happen in the execution subsystem. Jobs as work units
can vary a lot and do completely different things, which means that both the
configuration and the worker have to be prepared for this kind of generality.
The configuration and its design were already discussed above, and the
implementation in the worker is then also quite straightforward. The worker has
internal structures into which it loads the metadata given in the configuration
and in which it keeps them. The whole job is mapped to a job metadata structure
and tasks are mapped to either external or internal ones (internal commands have
to be defined within the worker); the two kinds differ in whether they are
executed in a sandbox or as internal worker commands.
Another division of tasks is by the task-type field in the configuration. This
field can have four values: initiation, execution, evaluation and inner. All of
them were described above in the configuration analysis. What is important to
the worker is how to behave if the execution of a task of a particular type
fails. There are two possible situations: the execution fails either because of
a bad user solution or because of some internal error. If the execution fails
on an internal error, the solution cannot be declared as failed; the user
should not be punished for a bad configuration or a network error. This is
where task types are useful. Generally, initiation, execution and evaluation
tasks execute code provided by the user who submitted the solution of an
exercise. If a task of one of these types fails, the failure is probably caused
by a bad user solution and the job can be evaluated. But if an inner task fails,
the solution should be re-executed, in the best case on a different worker.
That is why the job is sent back to the broker when an inner task fails, and
the broker reassigns it to another worker. More on this subject should be
discussed in the section on broker assignment algorithms.
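The decision described above can be summarised in a few lines. The enum and
function names below are hypothetical and serve only to illustrate the rule;
the real worker is not written in Python.

```python
from enum import Enum


class TaskType(Enum):
    INITIATION = "initiation"
    EXECUTION = "execution"
    EVALUATION = "evaluation"
    INNER = "inner"


class Outcome(Enum):
    CONTINUE = "continue"         # the task succeeded
    SOLUTION_FAILED = "failed"    # the submitted solution is to blame
    REASSIGN = "reassign"         # internal error, return the job to the broker


def on_task_finished(task_type: TaskType, succeeded: bool) -> Outcome:
    """Decide how the worker reacts to the result of a single task."""
    if succeeded:
        return Outcome.CONTINUE
    if task_type is TaskType.INNER:
        return Outcome.REASSIGN
    return Outcome.SOLUTION_FAILED
```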
There is also the question of the working directory or directories of a job:
which directories should be used and what for. One simple answer is that every
job has a single directory that contains every file the worker handles during
the whole job execution. This is of course impractical; there has to be some
logical division. The minimum is two folders, one for internal temporary files
and one for evaluation. A single directory for temporary files is enough to
cover all internal work with the filesystem, but a single directory for the
whole evaluation is not. User solutions are downloaded as zip archives, so why
should these be present during execution, and why should the results and files
to be uploaded back to the fileserver be cherry-picked from one big directory?
The answer is a further logical division into subfolders. The solution chosen
in the end is to have folders for the downloaded archive, for the decompressed
solution, for the evaluation directory in which the user solution is executed,
for temporary files, and for results and other files that should be uploaded
back to the fileserver. There also has to be a hierarchy that separates the
folders of different workers on the same machine. That is why paths to
directories have the format
`${DEFAULT}/${FOLDER}/${WORKER_ID}/${JOB_ID}`, where default means the default
working directory of the whole worker and folder is the directory for a
particular purpose (archives, evaluation, ...). This division of job
directories proved to be flexible and detailed enough: everything is in logical
units and where it is supposed to be, so searching through this structure
should be easy. In addition, if user solutions have access only to the
evaluation directory, they do not have access to unnecessary files, which is
better for the overall security of ReCodEx.
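As an illustration of the path format, the following sketch builds the set of
directories for one job. The folder names in `FOLDERS` and the default
directory `/var/recodex-worker` are assumptions made for this example; the text
above only names the purposes of the folders.

```python
from pathlib import PurePosixPath

# Hypothetical folder names, one per purpose listed in the text
FOLDERS = ("archives", "submission", "eval", "temp", "results")


def job_directories(default: str, worker_id: str, job_id: str) -> dict:
    """Build ${DEFAULT}/${FOLDER}/${WORKER_ID}/${JOB_ID} for every folder."""
    root = PurePosixPath(default)
    return {folder: str(root / folder / worker_id / job_id) for folder in FOLDERS}


dirs = job_directories("/var/recodex-worker", "1", "job-42")
```

With these assumed values, `dirs["eval"]` is
`/var/recodex-worker/eval/1/job-42`, and only this directory would be made
visible to the executed solution.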
As described above, the worker has job directories, but users who write and
manage job configurations do not know where these directories are located on a
particular worker or how to refer to them in a configuration. For this purpose
we have to introduce some kind of marks that represent particular folders.
These marks can take the form of special strings, which we call variables.
Variables can be used everywhere where filesystem paths are used within the
configuration file. This solves the problem of a specific worker environment
and a specific hierarchy of directories. The final form of variables is
`${...}`, where the triple dot stands for a textual description. This format
was chosen because the dollar sign is a special character that is unlikely to
appear in a filesystem path; the braces only delimit the textual description of
the variable.
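A minimal sketch of how such variables could be expanded is shown below. The
variable name `EVAL_DIR`, the concrete path and the rule that unknown variables
are an error are assumptions for this example, not a description of the worker
implementation.

```python
import re

# Matches ${NAME}, where NAME is a textual description of the folder
VARIABLE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(path: str, variables: dict) -> str:
    """Replace every ${NAME} in path with its worker-specific value."""
    def replace(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"unknown variable: {name}")
        return variables[name]

    return VARIABLE.sub(replace, path)


variables = {"EVAL_DIR": "/var/recodex-worker/eval/1/job-42"}
expanded = expand("${EVAL_DIR}/input.txt", variables)
```

The author of a job configuration writes only `${EVAL_DIR}/input.txt`; each
worker substitutes its own directory at execution time.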
#### Evaluation
After a job arrives, the worker has to prepare a new execution environment;
then the solution archive has to be downloaded from the fileserver and
extracted. The job configuration is located among these files; it is loaded
into internal structures and executed. After that, the results are uploaded
back to the fileserver. These are the basic steps that are necessary for the
whole execution, and they have to be performed in this exact order.
An interesting problem arises with supplementary files (inputs, sample
outputs). There are two possible approaches: supplementary files can be
downloaded either at the start of the execution or during the execution. If the
files are downloaded at the beginning, the execution has not really started at
that point, so if there are network problems, the worker finds out right away
and can abort the execution without executing a single task. Slight problems
can arise if some of the files need to have the same name (e.g. the solution
assumes that the input is `input.txt`). In this scenario the downloaded files
cannot be renamed at the beginning but only during the execution, which is
impractical and not easy to follow. The second approach, in which files are
downloaded on the fly, has the opposite problem: if there are network problems,
the worker finds out during the execution, possibly when almost the whole
execution is done, which is not ideal if we care about wasted hardware
resources. On the other hand, with this approach users have fine-grained
control over the execution flow and know exactly which files are available
during the execution, which is probably more appealing from the users'
perspective than the first solution. Based on that, downloading supplementary
files using 'fetch' tasks during the execution was chosen and implemented.
#### Caching mechanism
The worker can cache files from the fileserver under one condition: the
provided files have to have unique names. If uniqueness is guaranteed, precious
bandwidth can be saved using a cache. This requires a system that can download
a file, store it in the cache and delete it after some period of inactivity.
Because there can be multiple worker instances on a particular server, it is
not efficient to have this system in every worker on its own, so it is
desirable to share this feature among all workers on the same machine. One
solution would again be a separate service connected to the workers through the
network, but this would mean another component with its own communication for a
purpose where it is not really needed. More importantly, it would be a single
point of failure; if it stopped working, it would be quite a problem. So
another solution was chosen, which assumes that the worker has access to a
specified cache folder into which it can download supplementary files and from
which it can copy them. This means every worker is able to maintain downloads
to the cache, but what a worker cannot properly do is delete unused files after
some time. For that, a single-purpose component called 'cleaner' is introduced.
It is a simple script executed by cron which deletes files that have not been
used for some time. Together with the fetching feature of the worker, the
cleaner completes the machine-specific caching system.
The cleaner, as mentioned, is a simple script which is executed regularly as a
cron job. Given the caching system introduced in the paragraph above, there are
few options for how the cleaner can be implemented. Filesystems usually support
two relevant timestamps, `last access time` and `last modification time`. Files
in the cache are downloaded once and then only copied, which means that the
last modification time is set only once, on creation of the file, while the
last access time should be updated on every copy. This implies that the last
access time is what is needed here. However, while the last modification time
is widely maintained by operating systems, the last access time is often not
updated by default. More on this subject can be found
[here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime).
For the cleaner to work properly, the filesystem used by the worker for caching
has to have last access time updates enabled.
With the cleaner as a separate component and the caching itself handled in the
worker, it is not immediately obvious that the system works without race
conditions. The goal here is not to have a system without races but to have a
system which can recover from them. The implementation of the caching system is
based upon atomic operations of the underlying filesystem. A description of one
possible robust implementation follows, starting with the worker:
- the worker encounters a fetch task which should download a supplementary file
- the worker takes the name of the file and tries to copy it from the cache
folder to its working folder
- if successful, the last access time is updated (by the filesystem itself) and
the whole operation is done
- if not successful, the file has to be downloaded
- the file is downloaded from the fileserver to the working folder
- the downloaded file is then copied to the cache
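The worker-side steps above can be sketched as follows. The `fetch` function
and the `download` callable are hypothetical. Copying into the cache under a
temporary name followed by a rename is an assumption added here so that other
workers never see a partially written cache entry; the text only requires that
the file is copied to the cache after the download.

```python
import shutil
import tempfile
from pathlib import Path


def fetch(name: str, cache_dir: Path, work_dir: Path, download) -> str:
    """Obtain file `name` in work_dir; return 'cache' or 'download'."""
    target = work_dir / name
    try:
        shutil.copyfile(cache_dir / name, target)  # reading the cached file
        return "cache"
    except FileNotFoundError:
        pass  # not cached (or already deleted by the cleaner)
    download(name, target)  # hypothetical callable talking to the fileserver
    # Assumption: publish the cache entry with a rename, which replaces the
    # destination in one step on POSIX filesystems.
    with tempfile.NamedTemporaryFile(dir=cache_dir, delete=False) as tmp:
        tmp_path = Path(tmp.name)
    shutil.copyfile(target, tmp_path)
    tmp_path.replace(cache_dir / name)
    return "download"


base = Path(tempfile.mkdtemp())
cache, work = base / "cache", base / "work"
cache.mkdir()
work.mkdir()


def fake_download(name, target):
    target.write_text("data for " + name)


first = fetch("a1b2c3", cache, work, fake_download)
(work / "a1b2c3").unlink()
second = fetch("a1b2c3", cache, work, fake_download)
```

The first call has to download the file, the second one is served from the
cache folder without contacting the fileserver.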
The procedure above runs only within the worker; the cleaner can intervene at
any time and delete files. The implementation of the cleaner follows:
- on start, the cleaner stores the current timestamp as a reference for
comparison and loads the configuration values of the cache folder and the
maximal file age
- it loops through all files and directories in the specified cache folder
- the last access time of the file or folder is read
- the last access time is subtracted from the reference timestamp, which
yields the difference
- the difference is compared against the specified maximal file age; if the
difference is greater, the file or folder is deleted
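A minimal sketch of the cleaner loop is given below. The `clean` function and
its parameters are hypothetical; in the demo the access time is set explicitly
with `os.utime` so that the example does not depend on how the filesystem is
mounted.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path


def clean(cache_dir: Path, max_age_seconds: float, now: float = None) -> list:
    """Delete cache entries not accessed for more than max_age_seconds."""
    reference = time.time() if now is None else now
    removed = []
    for entry in cache_dir.iterdir():
        try:
            age = reference - entry.stat().st_atime
            if age > max_age_seconds:
                if entry.is_dir():
                    shutil.rmtree(entry)
                else:
                    entry.unlink()
                removed.append(entry.name)
        except FileNotFoundError:
            continue  # the entry disappeared in the meantime
    return sorted(removed)


cache = Path(tempfile.mkdtemp())
(cache / "old").write_text("x")
(cache / "new").write_text("y")
now = time.time()
# Pretend that "old" was last read 1000 seconds ago
os.utime(cache / "old", (now - 1000, now - 1000))
removed = clean(cache, max_age_seconds=600, now=now)
```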
The previous description implies that there is a gap between reading the last
access time and deleting the file within the cleaner. In this gap a worker may
access the file, and the file is deleted anyway, but this is fine: the file is
deleted, but the worker has already copied it. Another potential problem is two
workers downloading the same file, but this is not a problem either, because
the file is first downloaded to the working folder and only then copied to the
cache. And even if something else fails unexpectedly and the fetch task fails
during execution, that should be fine too, because fetch tasks should have the
'inner' task type, which implies that a failure in such a task stops the whole
execution and the job is reassigned to another worker. This serves as the last
resort in case everything else goes wrong.
#### Sandboxing
There are numerous ways to approach sandboxing on different platforms, and
describing all of them is out of the scope of this document. Instead, let us
have a look at the features which are certainly needed for ReCodEx and propose
some particular sandbox implementations for Linux or Windows.
The general purpose of a sandbox is to safely execute software in any form,
from scripts to binaries. Sandboxes differ in how safe they are and what
limiting features they offer. Ideally, a sandbox has numerous options and
corresponding features which allow administrators to set up the environment as
they like and which do not allow user programs to damage the executing machine
in any way.
For ReCodEx and its evaluation, at least these features are needed: execution
time and memory limits, a limit on disk operations, disk access restrictions
and network restrictions. All these features, if combined and implemented well,
yield a reasonably safe sandbox which can be used for all kinds of user
solutions and should be able to restrict and stop any standard kind of attack
or error.
Linux has quite extensive support for sandboxing in the kernel: kernel
namespaces and cgroups, which combined can limit hardware resources (CPU,
memory) and separate the executing program into its own namespace (PID,
network). These two features satisfy the sandbox requirements of ReCodEx, so
there were two options: either find an existing solution or implement a new
one. Luckily, an existing solution was found; its name is **isolate**. Isolate
does not use all available kernel features, but only a subset which is still
sufficient for ReCodEx.
The situation in the Windows world is the opposite: kernel support is limited,
which makes sandboxing a bit trickier. The Windows kernel only offers ways to
restrict the privileges of a process through restriction of internal access
tokens. Monitoring of hardware resources is not possible, but the used
resources can be obtained through newly created job objects. Finding a sandbox
which can do everything needed for ReCodEx seems to be impossible. There are
numerous sandboxes for Windows, but they focus on different things: in many
cases they serve as a safe environment for malicious programs, viruses in
particular, or they are designed as a separate filesystem namespace for
installing many temporarily used programs. Among these we can mention
Sandboxie, Comodo Internet Security, Cuckoo sandbox and many others. None of
them is suitable as a sandbox solution for ReCodEx. With this being said, we
can safely state that designing and implementing a new general sandbox for
Windows is out of the scope of this project.
A new general sandbox for Windows is out of the question, but what about a more
specialized solution, used for instance only for C#? The CLR as a virtual
machine and runtime environment has pretty good security support for
restrictions and separation, which is also available to C#. This makes it quite
easy to implement a simple sandbox within C#, but surprisingly no well-known
general-purpose implementation can be found. As said in the previous paragraph,
implementing our own solution is out of the scope of the project; there is
simply not enough time. But a C# sandbox is quite a good topic for another
project, for example a term project for a C# course, so it might be written and
integrated in the future.
### Fileserver
The fileserver provides access to a shared storage space that contains files
submitted by students, supplementary files such as test inputs and outputs and
results of evaluation. In other words, it acts as an intermediate node for data
passed between the frontend and the backend. This functionality can be easily
separated from the rest of the backend features, which led to designing the
fileserver as a standalone component. Such design helps encapsulate the details
of how the files are stored (e.g. on a file system, in a database or using a
cloud storage service), while also making it possible to share the storage
between multiple ReCodEx frontends.
For early releases of the system, we chose to store all files on the file system
-- it is the least complicated solution (in terms of implementation complexity)
and the storage backend can be rather easily migrated to a different technology.
One of the facts we learned from CodEx is that many exercises share test input
and output files, and also that these files can be rather large (hundreds of
megabytes). A direct consequence of this is that we cannot add these files to
submission archives that are to be downloaded by workers -- the combined size of
the archives would quickly exceed gigabytes, which is impractical. Another
conclusion we made is that a way to deal with duplicate files must be
introduced.
A simple solution to this problem is storing supplementary files under the
hashes of their content. This ensures that every file is stored only once. On
the other hand, it makes it more difficult to understand what the content of a
file is at a glance, which might prove problematic for the administrator.
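Storing files under the hash of their content can be sketched in a few lines.
The choice of SHA-1 and the function name `store_supplementary` are assumptions
for this example; the text does not prescribe a particular hash function.

```python
import hashlib
import tempfile
from pathlib import Path


def store_supplementary(content: bytes, storage_dir: Path) -> str:
    """Store content under the hex digest of its hash; return the digest."""
    digest = hashlib.sha1(content).hexdigest()  # assumed hash function
    target = storage_dir / digest
    if not target.exists():  # identical content is stored only once
        target.write_bytes(content)
    return digest


storage = Path(tempfile.mkdtemp())
key_a = store_supplementary(b"1 2 3\n", storage)
key_b = store_supplementary(b"1 2 3\n", storage)
```

Uploading the same test input twice yields the same key and a single file on
disk. The price is that the stored file names carry no information about their
original names, which therefore have to be kept elsewhere.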
A notable part of the fileserver's work is done by a web server (e.g. listening
to HTTP requests and caching recently accessed files in memory for faster
access). What remains to be implemented is handling requests that upload files
-- student submissions should be stored in archives to facilitate simple
downloading and supplementary exercise files need to be stored under their
hashes.
We decided to use Python and the Flask web framework. This combination makes it
possible to express the logic in ~100 SLOC and also provides means to run the
fileserver as a standalone service (without a web server), which is useful for
development.
### Monitor
Users want to view the real-time evaluation progress of their solution. This is
easily done with an established two-way connection stream, but that is hard to
achieve with plain web technologies. The HTTP protocol works differently, on
the basis of separate requests with no long-term connection. However, there is
a widely used technology that solves this problem: the WebSocket protocol.
Working with the WebSocket protocol from the backend is possible, but not ideal
from the design point of view. The backend should be hidden from the public
internet to minimize the surface for possible attacks. With this in mind, there
are two possible options:
- send progress messages through API
- make separate component for progress messages
Each of the two possibilities has some pros and cons. The first one is good
because there is no additional component and the API is already publicly
visible. On the other hand, working with the WebSocket protocol from PHP is not
very pleasant (but it is possible) and embedding this functionality into the
API is not extensible. The second approach is better for changing the protocol
in the future or implementing extensions such as caching of messages. Also, the
progress feature is considered only optional, because there may be clients for
which it is useless. The major drawback of a separate component is that it is
another part which needs to be publicly exposed.
We decided to make a separate component, mainly because it is a smaller
component with only one role, it is easier to maintain, and the progress
callback is optional.
There are several possibilities for how to write the component. Notably, the
considered options were the languages already used in the project: C++, PHP,
JavaScript and Python. In the end, Python was chosen for its simplicity and
great support for all the technologies used, and also because there were Python
developers available in our team. Next, the responsibility of this component
has to be determined. The concept of the message flow is shown in the following
picture.
![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
The message channel entering the monitor uses ZeroMQ, the main messaging
framework used by the backend. Thanks to this decision, the rest of the backend
does not need to be aware of the communication protocol used towards clients
and the related libraries. The output channel is WebSocket, a protocol for
sending messages to web browsers. In Python, there are several WebSocket
libraries. The most popular one is `websockets` in cooperation with `asyncio`.
This combination is easy to use and well documented, so it is used in the
monitor component too. For ZeroMQ, there is the `zmq` library with bindings to
the framework core written in C++.
Incoming messages are cached for a short period of time. Early testing showed
that the backend can start sending progress messages before the client connects
to the monitor. To solve this, messages for each job are held for 5 minutes
after the reception of the last message. The client receives all previously
received messages at the time of connection, with no message loss.
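The caching behaviour can be illustrated with a small in-memory buffer. The
class `MessageCache` and its methods are hypothetical; only the five-minute
retention after the last message is taken from the text.

```python
import time


class MessageCache:
    """Keeps messages of each job until ttl_seconds after the last one."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._jobs = {}  # job id -> (time of last message, [messages])

    def add(self, job_id: str, message: str, now: float = None) -> None:
        now = time.monotonic() if now is None else now
        _, messages = self._jobs.get(job_id, (now, []))
        messages.append(message)
        self._jobs[job_id] = (now, messages)

    def replay(self, job_id: str) -> list:
        """Messages a newly connected client should receive first."""
        return list(self._jobs.get(job_id, (0.0, []))[1])

    def expire(self, now: float = None) -> None:
        now = time.monotonic() if now is None else now
        stale = [j for j, (last, _) in self._jobs.items() if now - last > self.ttl]
        for job_id in stale:
            del self._jobs[job_id]
```

A client that connects late calls `replay` first and then continues with live
messages, so it sees the whole progress from the beginning.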
### API server
The API server must handle HTTP requests and manage the state of the application
in some kind of a database. It must also be able to communicate with the backend
over ZeroMQ.
We considered several technologies which could be used:
- PHP + Apache -- one of the most widely used technologies for creating web
servers. It is a suitable technology for this kind of a project. It has all
the features we need when some additional extensions are installed (to support
LDAP or ZeroMQ).
- Ruby on Rails, Python (Django), etc. -- popular web technologies that appeared
in the last decade. Both support ZeroMQ and LDAP via extensions and have large
developer communities.
- ASP.NET (C#), JSP (Java) -- these technologies are very robust and are used to
create server technologies in many big enterprises. Both can run on Windows
and Linux servers (ASP.NET using the .NET Core).
- JavaScript (Node.js) -- a relatively new technology that has lately been used
to create REST APIs. Applications running on Node.js are quite performant and
the number of open-source libraries available on the Internet is huge.
We chose PHP and Apache mainly because we were familiar with these technologies
and we were able to develop all the features we needed without learning to use a
new technology. This mattered because the number of features was quite high and
we needed to meet a strict deadline. This does not mean that we find all the
other technologies superior to PHP in all other aspects -- PHP 7 is a mature
language with a huge community and a wide range of tools, libraries, and
frameworks.
We decided to use an ORM framework to manage the database, namely the widely
used PHP ORM Doctrine 2. Using an ORM tool means we do not have to write SQL
queries by hand. Instead, we work with persistent objects, which provides a
higher level of abstraction. Doctrine also has a robust database abstraction
layer so the database engine is not very important and it can be changed without
any need for changing the code. MariaDB was chosen as the storage backend.
To speed up the development process of the PHP server application we decided to
use a web framework. After evaluating and trying several frameworks, such as
Lumen, Laravel, and Symfony, we ended up using Nette. This framework is very
common in the Czech Republic -- its lead developer is a well-known Czech programmer
David Grudl -- and we were already familiar with the patterns used in this
framework, such as dependency injection, authentication, routing. These concepts
are useful even when developing a REST application, which might be a surprise
considering that Nette focuses on "traditional" web applications. There is also
a Nette extension which makes integration of Doctrine 2 very straightforward.
#### Architecture of the system
The Nette framework is an MVP (Model, View, Presenter) framework. It has many
tools for creating complex websites, of which we need only a subset; in some
cases we use different libraries which suit our purposes better:
- **Model** - the model layer is implemented using the Doctrine 2 ORM instead of
Nette Database
- **View** - the whole view layer of the Nette framework (e.g., the Latte engine
used for HTML template rendering) is unnecessary since we will return all the
responses encoded in JSON. JSON is a common format used in APIs and we decided
to prefer it to XML or a custom format.
- **Presenter** - the whole lifecycle of a request processing of the Nette
framework is used. The Presenters are used to group the logic of the individual
API endpoints. The routing mechanism is modified to distinguish the actions by
both the URL and the HTTP method of the request.
#### Request handling
A typical scenario for handling an API request is matching the HTTP request with
a corresponding handler routine. The handler creates a response object, which
is then sent back to the client, encoded as JSON. The `Nette\Application`
package can
be used to achieve this with Nette, although it is meant to be used mainly in
MVP applications.
Matching HTTP requests with handlers can be done using standard Nette URL
routing -- we will create a Nette route for each API endpoint. Using the routing
mechanism from Nette logically leads to implementing handler routines as Nette
Presenter actions. Each presenter should serve logically related endpoints.
The last step is encoding the response as JSON. In `Nette\Application`, HTTP
responses are returned using the `Presenter::sendResponse()` method. We decided
to write a method that calls `sendResponse` internally and takes care of the
encoding. This method has to be called in every presenter action. An alternative
approach would be using the internal payload object of the presenter, which is
more convenient, but provides us with less control.
#### Authentication
To make certain data and actions accessible only to specific users, there must
be a way for these users to prove their identity. We decided to avoid PHP
sessions (where the session ID is stored in the cookies of the HTTP requests
and responses) to make the server stateless. The server issues a specific token
for the user after his/her identity is verified (e.g., by providing email and
password) and sends it to the client in the body of the HTTP response. The
client must remember this token and attach it to every following request in the
*Authorization* header.
The token must be valid only for a certain time period ("log out" the user after
a few hours of inactivity) and it must be protected against abuse (e.g., an
attacker must not be able to issue a token which will be considered valid by the
system and using which the attacker could pretend to be a different user). We
decided to use the JWT standard (the JWS).
The JWT is a string consisting of three base64url-encoded parts separated by
dots -- a header, a payload, and a signature. The header and the payload are
JSON documents. The interesting parts are the payload and the signature: the
payload can contain any data which can identify the user and metadata of the
token (e.g., the time when the token was issued, the time of expiration). The
last part contains a digital signature of the header and payload and it ensures
that nobody can issue their own token and steal someone's identity. Both of
these characteristics give us the opportunity to validate the token without
storing all of the tokens in the database.
To implement JWT in Nette, we have to implement some of its security-related
interfaces such as IAuthenticator and IUserStorage, which is rather easy thanks
to the simple authentication flow. Replacing these services in a Nette
application is also straightforward, thanks to its dependency injection
container implementation. The encoding and decoding of the tokens themselves,
including signature generation and verification, is done through a widely used
third-party library, which lowers the risk of having a bug in the
implementation of this critical security feature.
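To illustrate why tokens can be validated without being stored, the following
sketch signs and verifies a token in the JWS compact form using only the Python
standard library. It assumes the HS256 algorithm (HMAC with SHA-256) and uses
hypothetical function names; as stated above, a real deployment should rely on
an established library instead of code like this.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue(payload: dict, secret: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256 (assumed)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    return signing_input + "." + b64url(signature.digest())


def verify(token: str, secret: bytes, now: float = None):
    """Return the payload if signature and expiration are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        signature = b64url_decode(signature_b64)
        payload = json.loads(b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    now = time.time() if now is None else now
    if not isinstance(payload, dict) or payload.get("exp", 0) <= now:
        return None
    return payload


token = issue({"sub": "user-1", "exp": 2000}, b"server-secret")
```

Only the server knows the secret, so a client cannot change the payload or
extend the expiration time without invalidating the signature.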
#### Forgotten password
Authentication with passwords brings the problem of forgotten credentials,
especially passwords. People easily forget them and there has to be some
mechanism to obtain a new password or change the old one. The problem is that
this cannot be done in a totally secure way, but we can at least come quite
close. First, there are ways which are not secure at all and cannot be
recommended, for example sending the old password through email. A better, but
still not secure solution is to generate a new password and send it through
email. This solution was used in CodEx: users had to write an email to the
administrator, who generated a new password and sent it back to the sender.
This simple solution could also be automated, but in its manual form the
administrator had a lot of control over the whole process. This might come in
handy if some additional checks are needed, but on the other hand it can be
quite time-consuming.
Probably the best solution, which is often used and is fairly secure, is the
following. Let us consider only the case in which all users have to fill in
their email addresses in the system and these addresses are safely in the hands
of the right users. When a user finds out that he/she does not remember the
password, he/she requests a password reset and fills in his/her unique
identifier; it might be an email or a unique nickname. Based on the matched
user account, the system generates a unique access token and sends it to the
user's email address. This token should be time-limited and usable only once,
so that it cannot be misused. The user then takes the token or the URL provided
in the email and goes to the appropriate section of the system, where a new
password can be set. After that the user can sign in with his/her new password.
As previously stated, this solution is quite safe and users can handle it on
their own, so the administrator does not have to worry about it. That is the
main reason why this approach was chosen.
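The two required properties of the reset token, a limited lifetime and
single use, can be sketched as follows. The class `ResetTokens`, the ten-minute
lifetime and the server-side storage of token digests are assumptions made for
this illustration; the text does not specify how the token is implemented.

```python
import hashlib
import secrets
import time


class ResetTokens:
    def __init__(self, ttl_seconds: float = 600.0):
        self.ttl = ttl_seconds  # assumed lifetime of a token
        self._pending = {}  # sha256(token) -> (user id, expiration time)

    def issue(self, user_id: str, now: float = None) -> str:
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._pending[self._digest(token)] = (user_id, now + self.ttl)
        return token  # would be sent to the user's email, e.g. inside a URL

    def redeem(self, token: str, now: float = None):
        """Return the user id for a valid token; every token works only once."""
        now = time.time() if now is None else now
        entry = self._pending.pop(self._digest(token), None)
        if entry is None or entry[1] < now:
            return None
        return entry[0]

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()
```

A successful `redeem` would be followed by the form for setting a new password;
a second attempt with the same token, or an attempt after the lifetime has
passed, is rejected.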
#### Uploading files
There are two cases when users need to upload files using the API -- submitting
solutions to an assignment and creating a new exercise. In both of these cases,
the final destination of the files is the fileserver. However, the fileserver is
not publicly accessible, so the files have to be uploaded through the API.
The files can be either forwarded to the fileserver directly, without any
interference from the API server, or stored and forwarded later. We chose the
second approach, which is harder to implement, but more convenient -- it lets
exercise authors double-check what they upload to the fileserver and solutions
to assignments can be uploaded in a single request, which makes it easy for the
fileserver to create an archive of the solution files.
#### Permissions
A system storing user data has to implement some kind of permission checking.
The previous chapters imply that each user has to have a role, which
corresponds to his/her privileges. Our research showed that three roles are
sufficient -- student, supervisor and administrator. The user role has to be
checked with every request. The good point is that roles match nicely with the
granularity of API endpoints, so the permission checking can be done at the
beginning of each request. That is implemented using PHP annotations, which
allow specifying the allowed user roles for each request with very little code,
while all the business logic stays the same, together in one place.
However, roles cannot cover all cases. For example, if a user is a supervisor,
the role relates only to the groups which he/she supervises. But using only
roles would allow him/her to act as a supervisor in all groups in the system.
Unfortunately, this cannot be easily fixed using annotations, because there are
many different cases in which this problem occurs. To fix that, some additional
checks can be performed at the beginning of request processing. Usually it is
only one or two simple conditions.

With these two concepts together it is possible to cover all cases of
permission checking with quite a small amount of code.
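
The combination of the two concepts can be illustrated with a short sketch. It is written in Python rather than PHP, and the `allowed_roles` decorator, the `Forbidden` exception and the dictionary-based user and group records are hypothetical stand-ins for the real annotations and entities.

```python
from functools import wraps


class Forbidden(Exception):
    """Raised when the current user may not perform the request."""


def allowed_roles(*roles):
    """Hypothetical stand-in for a role annotation on an API endpoint."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(user, *args, **kwargs):
            # Role check performed at the beginning of every request.
            if user["role"] not in roles:
                raise Forbidden("role not allowed: " + user["role"])
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator


@allowed_roles("supervisor", "administrator")
def assign_exercise(user, group, exercise_id):
    # Additional check which the role alone cannot express:
    # a supervisor may act only in the groups he/she supervises.
    if user["role"] == "supervisor" and user["id"] not in group["supervisors"]:
        raise Forbidden("user does not supervise this group")
    group["assignments"].append(exercise_id)
    return group["assignments"]
```

The decorator handles the coarse role check, while the extra condition inside the handler covers the group-specific case.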
#### Solution loading
When the evaluation of a solution on the backend is finished, the results are
saved to the fileserver and the API is notified by the broker. Some further
steps need to be done at that moment before the results can be presented to the
users. These steps include parsing the results, calculating the final score and
saving the structured data into the database. There are two main options for
when to process the results:
- immediately after the API server is notified by the backend
- when a user requests the results for the first time
The two options are almost equivalent; neither of them provides a major
advantage. Loading the results immediately makes the first fetch by the client
a bit faster, because the results are already processed. On the other hand,
processing the results on demand can save some resources when the solution
results are not important (e.g., the student finds a bug in his/her solution
before the submission has been evaluated).
We decided on lazy loading at the time when the results are requested for the
first time. However, the concept of asynchronous jobs was then introduced. This
type of job is useful for batch submission of jobs, for example re-running jobs
which failed because of a worker hardware issue. These jobs are typically
submitted by a different user than the author (an administrator, for example),
so the original authors should be notified. In this case it is more reasonable
to load the results immediately and optionally send the authors a notification
by email. This is exactly what we do.

With the benefit of hindsight, it seems that immediate loading of all jobs could
simplify the code and has no major drawbacks. We will re-evaluate this decision
in the next version of ReCodEx.
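
The difference between the two loading strategies can be shown with a small sketch. The `Submission` class, the `fetch_raw_results` callable and the scoring by fraction of passed tests are assumptions made for the example, not the real ReCodEx code.

```python
class Submission:
    """A submission whose results are parsed only when they are needed."""

    def __init__(self, job_id, fetch_raw_results):
        self.job_id = job_id
        self._fetch = fetch_raw_results  # hypothetical reader of fileserver data
        self._results = None

    def results(self):
        """Parse and cache the results on first access (lazy loading)."""
        if self._results is None:
            raw = self._fetch(self.job_id)
            passed = sum(1 for test in raw["tests"] if test["passed"])
            # Assumed scoring for the example: fraction of passed tests.
            self._results = {"score": passed / len(raw["tests"])}
        return self._results


def on_job_finished(submission, asynchronous, notify_author):
    """Called when the backend reports that a job has been evaluated."""
    if asynchronous:
        # Asynchronous jobs are loaded immediately and the author is notified.
        notify_author(submission.job_id, submission.results()["score"])
    # Ordinary jobs are left alone until somebody asks for the results.
```

With this structure, switching to immediate loading of all jobs would only mean calling `results()` unconditionally in `on_job_finished`.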
#### Communication with the backend
##### Backend failure reporting
The backend is a separate component which does not communicate with the
administrators directly. When it encounters an error, it stores it in a log
file. It would be handy to inform the administrator directly at that moment so
that he/she can fix the cause of the error as soon as possible. The backend does
not have any mechanism for notifying users, for example by email. The API
server, on the other hand, has email sending implemented and can easily forward
any messages to the administrator. A secured communication protocol between the
backend and the frontend already exists (it is used for reporting that the
processing of a job has finished) and it is easy to add another endpoint for
error reporting.
When a request to send a report arrives from the backend, the type of the
report is inferred. If it is an error which deserves the attention of the
administrator, an email is sent to him/her. There can also be errors which are
not that important (e.g., errors which were somehow resolved by the backend
itself or which are only informative). These do not have to be reported by
email and can simply be stored in the persistent database for further
consideration.
On top of that the separate backend component does not have to be exposed to the
outside network at all.
If the processing of a job fails, the backend informs the API server which
initiated the processing of the job. If an error which is not related to job
processing occurs, the backend must communicate with a single API server which
is configured by the administrator, while the other API servers which use the
same backend are not informed.
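
The handling of incoming reports on the API side could look roughly as follows. This is a sketch under assumptions: the `severity`, `resolved` and `message` fields of a report and the `send_email` and `store` callables are hypothetical and only illustrate the decision described above.

```python
def handle_backend_report(report, send_email, store):
    """Email reports that need the administrator; store the rest."""
    # Assumed rule: an unresolved error deserves the administrator's attention.
    needs_attention = (
        report.get("severity") == "error" and not report.get("resolved", False)
    )
    if needs_attention:
        send_email(report["message"])
        return "emailed"
    # Informative or already resolved reports are only kept for later review.
    store(report)
    return "stored"
```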
##### Backend state monitoring
The next thing related to communication with the backend is monitoring its
current state. This concerns mainly which workers are available for processing
jobs in different hardware groups and which languages can therefore be used in
exercises.

Another step would be monitoring the overall backend state, such as how many
jobs were processed by a particular worker, the workload of the broker and the
workers, etc. The easiest solution is to manage this information by hand: every
instance of the API server has to have an administrator who fills it in. This of
course covers only the currently available workers and runtime environments,
which do not change very often. Real-time statistics of the backend cannot
reasonably be made accessible this way.
A better solution is to update this information automatically. This can be
done in two ways:
- It can be provided by the backend on demand if the API needs it.
- The backend will send this information to the API periodically.

Things like the currently available workers or runtime environments should be
really up-to-date, so they could be provided on demand if needed. Backend
statistics are not that critical and could be updated periodically.

However, due to lack of time, automatic monitoring of the backend state will
not be implemented in the early versions of this project, but it might be
implemented in one of the next releases.
### Web-app
The web application is one of the possible client applications of the ReCodEx
system. Creating a web application as a client has several advantages:
- no installation or setup is required on the user's device
- works on all platforms including mobile platforms
- when a new version is rolled out all the clients will use this version without
any need for installing an update manually
One of the downsides is the large number of different web browsers (including
older versions of a specific browser) and their different interpretations of
the code (HTML, CSS, JS). Some features of the latest HTML5 specifications are
implemented only in some browsers, which are used by only a subset of Internet
users. This has to be taken into account when choosing appropriate tools for
the implementation of a website.

There are three basic ways to create a website these days:
- **server-side approach** - user's actions are processed on the server and the
  HTML code with the results of the action is generated on the server and sent
  back to the user's Internet browser. The client does not handle any logic
  (apart from rendering of the user interface and some basic user interaction)
  and is therefore very simple. The server can use the API server for processing
  of the actions so the business logic of the server can be very simple as well.
  A disadvantage of this approach is that a lot of redundant data is transferred
  across the requests although some parts of the content can be cached (e.g.,
  CSS files). This results in longer loading times of the website.
- **server-side rendering with asynchronous updates (AJAX)** - a slightly
  different approach is to render the page on the server as in the previous case
  but then execute the user's actions asynchronously using the `XMLHttpRequest`
  JavaScript functionality, which creates an HTTP request and transfers only the
  part of the website which will be updated.
- **client-side approach** - the opposite approach is to move the
  communication with the API server and the rendering of the HTML completely
  from the server to the client. The client runs the code (usually JavaScript)
  in his/her web browser and the content of the website is generated based on
  the data received from the API server. The script file is usually quite large
  but it can be cached and does not have to be downloaded from the server again
  (until the cached file expires). Only the data from the API server needs to be
  transferred over the Internet, which reduces the payload of each request and
  leads to a much more responsive user experience, especially on slower
  networks. Since the client-side code has full control over the UI, more
  sophisticated user interactions with the UI can be achieved.
All of these approaches are used in production by the web developers and all
of them are well documented and there are mature tools for creating websites
using any of these approaches.
We decided to use the third approach -- to create a fully client-side
application which would be familiar and intuitive for a user who is used to
modern web applications.
@todo: please think about more stuff about api and web-app... thanks ;-)
# User documentation
Users interact with ReCodEx through the web application. A modern web browser
with good HTML5 and CSS3 support is required. Among others, cookies and local
storage are used. The browser must also provide a decent JavaScript runtime.
Supported and tested browsers are: Firefox 50+, Chrome 55+, Opera 42+ and Edge
13+. Mobile devices often have problems with internationalization and may lack
support for some common features of desktop browsers. At this stage of
development it is not possible for us to fine-tune the interface for the major
mobile browsers on all mobile platforms. However, it is confirmed to work with
the latest Google Chrome and the Gello browser on Android 7.1+. Issues have been
reported with Firefox; these will be fixed in the future. It is also confirmed
to work with the Safari browser on iOS 10.
Usage of the web application is divided into sections concerning particular user
roles. Under these sections all possible use cases can be found. These sections
are inclusive, so more privileged users need to read instructions for all less
privileged users. Described roles are:
- Student
- Group supervisor
- Group administrator
- Instance administrator
- Superadministrator
## Terminology
**Instance** -- Represents a university, company or some other organizational
unit. Multiple instances can exist in a single ReCodEx installation.

**Group** -- A group of students to which exercises are assigned by a
supervisor. It should typically correspond to a real-world lab group.

**User** -- A person who interacts with the system using the web interface (or
an alternative client).

**Student** -- A user with the least privileges who is subscribed to some groups
and submits solutions to exercise assignments.

**Supervisor** -- A person responsible for assigning exercises to a group and
reviewing submissions.

**Admin** -- A person responsible for the maintenance of the system and fixing
problems supervisors cannot solve.

**Exercise** -- An algorithmic problem that can be assigned to a group.
Exercises can be shared by the teachers using an exercise database in ReCodEx.

**Assignment** -- An exercise assigned to a group, possibly with modifications.

**Runtime environment** -- A unique combination of a platform (OS) and a
programming language runtime/compiler in a specific version. Runtime
environments are managed by the administrators to reflect the abilities of the
whole system.

**Hardware group** -- A set of workers with similar hardware. Its purpose is to
group workers that are likely to run a program using the same amount of
resources. Hardware groups are managed by the system administrators, who have to
keep them up-to-date.
## General basics
This section describes the basics which are the same for all users of the
ReCodEx web application.
### First steps in ReCodEx
You can create an account by clicking the "*Create account*" menu item in the
left sidebar. You can choose between two registration methods -- by creating a
local account with a specific password, or by pairing your new account with an
existing CAS UK account.

If you decide to create a new "*local*" account using the "*Create ReCodEx
account*" form, you will have to provide your details and choose a password for
your account. Although ReCodEx allows using quite weak passwords, it is wise to
use a stronger one. The actual strength is shown in a progress bar near the
password field during registration. You will later sign in using your email
address as your username and the password you selected.

If you decide to use the CAS UK service, then ReCodEx will verify your CAS
credentials and create a new account based on the information stored there (name
and email address). You can change your personal information later on the
"*Settings*" page.

Regardless of the account type, you must select the instance the account will
belong to. The instance will most likely be your university or another
organization you are a member of.
To log in, go to the homepage of ReCodEx and choose the menu item "*Sign in*" in
the left sidebar. Then enter your credentials into one of the two forms -- if
you selected a password during registration, sign in with your email and
password in the first form called "*Sign into ReCodEx*". If you registered using
the Charles University Authentication Service (CAS), put your student number and
your CAS password into the second form called "*Sign into ReCodEx using CAS
UK*".
There are several options you can edit in your user account:
- changing your personal information (i.e., name)
- changing your credentials (email and password)
- updating your preferences (source code viewer/editor settings, default
language)
You can access the settings page through the "*Settings*" button right under
your name in the left sidebar.
If you are not active in ReCodEx for a whole day, you will be logged out
automatically. However, we recommend that you sign out of the application after
you finish working with it. The logout button is placed in the top section of
the left sidebar right under your name. Depending on your screen size, you may
need to expand the sidebar with the button next to the "*ReCodEx*" title
(informally known as the _hamburger button_).
### Forgotten password
If you cannot remember your password and you do not use CAS UK authentication,
you can reset your password. You will find a link saying "Cannot remember what
your password was? Reset your password." under the sign-in form. After you click
this link, you will be asked to submit your registration email address. A
message with a link containing a special token will be sent to you by email --
this way we make sure that the person who requested the password reset is really
you. When you visit the link, you will be able to enter a new password for your
account. The token is valid only for a couple of minutes, so do not forget to
reset the password as soon as possible, or you will have to request a new link
with a valid token.
If you sign in through CAS UK, then please follow the instructions
provided by the administrators of the service described on their
website.
### Dashboard
When you log into the system you should be redirected to your "*Dashboard*". On
this page you can see some brief information about the groups you are a member
of. The information presented there varies with your role in the system -- the
dashboard is described further in the sections on the individual roles.
## Student
Student is the default role for every newly registered user. This role has quite
limited capabilities in ReCodEx. Generally, a student can only submit solutions
to exercises in particular groups. These groups should correspond to the courses
he/she attends.
On the "*Dashboard*" page there is a "Groups you are student of" section where
you can find the list of your student groups. The first column of every row
contains a brief panel describing the group in question: the name of the group
and the percentage of points gained in the course. If you have enough points to
successfully complete the course, this panel has a green background with a tick
sign. The second column contains a list of assigned exercises with their
deadlines. To get to the group's page quickly, use the "Show group's detail"
button.
### Join group and start solving assignments
To be able to submit solutions you have to be a member of the right group. Each
instance has its own group hierarchy, so you can choose only from the groups
within your instance. That is why the list of groups is available under the
instance link located in the sidebar. This link brings you to the instance
detail page.

There you can see a description of the instance and, most importantly, the
"Groups hierarchy" box with a hierarchical list of all public groups in the
instance. Please note that groups with a plus sign are collapsible and can be
expanded further. When you find a group you would like to join, continue by
clicking the "See group's page" link followed by the "Join group" link.
**Note:** Some groups can be marked as private. These groups are not visible in
the hierarchy and students cannot join them by themselves. Management of
students in this type of group is in the hands of the supervisors.
The group detail page contains several interesting things. The first one is a
brief overview with information describing the group, a list of supervisors and
also the hierarchy of subgroups. Most importantly, there is the "Student's
dashboard" section. This section contains a list of assignments and a list of
fellow students. If the supervisors of the group allowed students to see each
other's statistics, the number of points each student has gained is shown as
well.
In the "Assignments" box on the group detail page there is a list of assigned
exercises which students are supposed to solve. The assignments are displayed
with their names and deadlines. There can be two deadlines. A successful
solution submitted before the first deadline receives the full number of points.
The second deadline does not have to be set, but if it is, the maximum number of
points for a successful solution submitted between the two deadlines can be
different.
An assignment link will lead you to the assignment detail page, which presents
all known details about the assignment. There are of course both deadlines, the
limit on the number of submissions you can make and also the full description of
the assignment, which can be localized. You can switch between all language
variants on demand using the tabs of the description box.

Further down the page you can find the "Submitted solutions" box with a list of
submissions and links to the result details. Most importantly, there is a
"Submit new solution" button on the assignment page, which provides an interface
for submitting a solution to the assignment.

After clicking the submit button, a dialog window will show up. Here you can
upload the files representing your solution and also add a note to mark the
solution. Your supervisor can also see this note. After you successfully upload
all the files necessary for your solution, click the "Submit your solution"
button and let ReCodEx evaluate the solution.
During the execution, the ReCodEx backend might send the evaluation progress to
your browser, where it will be displayed in another dialog window. When the
whole execution is finished, a "See the results" button will appear and you can
look at the results of your solution.
The results detail page contains a lot of information. Apart from the
assignment description, which is not connected to your results, there is the
name of the solution submitter (a supervisor can submit a solution on your
behalf), the files which were uploaded with the submission and, most
importantly, the "Evaluation details" and "Test results" boxes.

The evaluation details contain the overall results of your solution. They show
whether the solution was submitted before the deadlines, whether the evaluation
process finished successfully and whether the compilation succeeded. After that
you can find a number of values; the most important one is the last, "Total
score", consisting of your score, a slash and the maximum number of points for
this assignment. Interestingly, your score can be higher than the maximum,
which is caused by the "Bonus points" item above. If your solution is nice and
the supervisor notices it, he/she can award you additional points for the
effort. On the other hand, points can also be subtracted for bad coding habits
or even cheating.
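
As a small illustration of how the "Total score" value relates to the bonus points, consider the following sketch. The function name and the assumption that the displayed score is simply the sum of the evaluated points and the bonus points are ours; the text above only says that bonus points can push the score above the maximum or lower it.

```python
def total_score_label(points, bonus_points, max_points):
    """Format the "Total score" value as "<your score>/<maximum>"."""
    # Assumption: the displayed score is the evaluated points plus the bonus,
    # where the bonus may also be negative.
    return f"{points + bonus_points}/{max_points}"
```

For example, a solution worth 8 out of 10 points with a bonus of 3 points would be displayed as `11/10` under this assumption.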
The test results box contains a table with the results of all tests of the
exercise. The columns contain the following information:
- overall result of the test case (a yes/no symbol)
- test case name
- percentage of correctness of this particular test
- evaluation status (whether the test was executed successfully or failed)
- memory limit; if the supervisor allowed it, the percentage of memory used is
  displayed
- time limit; if the supervisor allowed it, the percentage of time used is
  displayed
A new feature of the web application is the "Comments and notes" box, where you
can communicate with your supervisors or just write private notes on your
submission. Adding a note is quite simple: write it into the text field at the
bottom of the box and click the "Send" button. The button with the lock image
underneath switches the visibility of newly created comments.
In case you think the ReCodEx evaluation of your solution is wrong, please use
the comments system described above, or even better notify your supervisor by
another channel (email). Unfortunately there is currently no notification
mechanism for new comment messages.
## Group supervisor
A group supervisor is typically the lecturer of a course. A user in this role
can modify the group description and properties, assign exercises or manage the
list of students. Further permissions, like managing subgroups or supervisors,
are available only to group administrators.

On the "Dashboard" page you can find the "Groups you supervise" section. It
contains boxes representing your groups with the list of students attending the
course and their points. Student names are clickable and lead to the user's
profile, where further information about his/her assignments and solutions can
be found. To quickly jump to the group's page, use the "Show group's detail"
button at the bottom of the matching group box.
### Manage group
Locate the group you supervise and want to manage. All your supervised groups
are available in the sidebar under the "Groups -- supervisor" collapsible menu.
If you click on one of them, you will be redirected to the group detail page. In
addition to the basic group information you can also see the "Supervisor's
controls" section. This section contains lists of the current students and
assignments.
As a supervisor of a group you can see the "Edit group settings" button at the
top of the page. Following this link will take you to the group editing page
with a form containing these fields:
- group name which is visible to other users
- external identification which may be used for pairing with entries in an
  information system
- description of the group which will be available to users in the instance (in
  Markdown)
- switch setting whether the group is publicly visible (and joinable by
  students) or private
- option setting whether students should be able to see each other's statistics
- minimal points threshold which students have to gain to successfully complete
  the course

After filling in all the necessary fields, the form can be submitted by clicking
the "Edit group" button and all changes will be applied.
For student management there are the "Students" and "Add student" boxes. The
first one is a simple list of all students attending the course with the
possibility of removing them from the group. That can be done by clicking the
"Leave group" button next to the particular user. The second box is for adding
students to the group. There is a text field for typing the name of the student;
after clicking the magnifier image or pressing the Enter key, a list of matching
users will appear. Then just click the "Join group" button and the student will
be added to your group.
### Assigning exercises
Before assigning an exercise, you have to know what exercises are available. A
list of all exercises in the system can be found under the "Exercises" link in
the sidebar. This page contains a table with the names of the exercises, their
difficulties and the names of their authors. Further information about an
exercise is available by clicking on its name.

The exercise detail page contains a lot of information about the exercise.
There is a box with all available localized descriptions and also a box with
some additional information: the exercise author, its difficulty, version, etc.
There is also a description for supervisors written by the exercise author under
the "Exercise overview" option, where some important information can be found.
Most notably, the "Supported runtime environments" section lists the programming
languages available for this exercise.

If you decide that the exercise is suitable for one of your groups, look for the
"Groups" box at the bottom of the page. It lists all groups you supervise, each
with an "Assign" button which will assign the exercise to the selected group.
After clicking the "Assign" button you should be redirected to the assignment
editing page. There you can find two forms, one for editing the assignment meta
information and the second one for setting the time and memory limits of the
exercise. In the meta information form you can fill in these options:
- name of the assignment which will be visible in a group
- visibility (if an assignment is under construction, you can mark it as not
  visible and students will not see it)
- subform for localized descriptions (a new localization can be added by
  clicking the "Add language variant" button, the current one can be deleted
  with the "Remove this language" button)
  - language of the description from a dropdown field (English, Czech, German)
  - description in the selected language
- score configuration which will be used when evaluating students' solutions; a
  very simple one is already filled in, and score configuration is described
  further in the "Writing score configuration" chapter
- first submission deadline
- maximum points that can be gained before the first deadline; if you want to
  manage all points manually, set it to 0 and then use bonus points, which are
  described in the next subchapter
- second submission deadline (must be after the first deadline); after it
  students can still submit solutions, but they are given no points
- maximum points that can be gained between the first deadline and the second
  deadline
- submission count limit for students' solutions -- limits the number of
  attempts a student has at solving the problem
- visibility of memory and time ratios; if true, students can see the percentage
  of used memory and time (with respect to the limit) for each test
- minimum percentage of points which each submission must gain to be considered
  correct (if it gets less, it will gain no points)
- whether the assignment is marked as a bonus one, so that points from solving
  it are not included in the group threshold limit (that means solving it can
  get you additional points over the limit)

The form has to be submitted with the "Edit settings" button, otherwise the
changes will not be saved.
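
How the deadlines, the maximum points and the minimum percentage interact can be illustrated with a short sketch. It is not the real ReCodEx scoring code: the function and parameter names are hypothetical, and the proportional scaling of points, the rounding down, the inclusive deadlines and the zero points after the first deadline when no second deadline is set are all assumptions made for the example.

```python
from datetime import datetime


def awarded_points(correctness, submitted_at, first_deadline, first_max,
                   second_deadline=None, second_max=0, min_percent=0.0):
    """Return the points for a submission with correctness between 0 and 1."""
    if submitted_at <= first_deadline:
        maximum = first_max
    elif second_deadline is not None and submitted_at <= second_deadline:
        maximum = second_max
    else:
        return 0  # too late: the solution is accepted but gains no points
    if correctness * 100 < min_percent:
        return 0  # below the minimum percentage, so not considered correct
    # Assumed scaling: points proportional to correctness, rounded down.
    return int(correctness * maximum)
```

Setting `first_max` to 0 corresponds to the case in which all points are managed manually through bonus points.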
The same editing page is used not only for creating an assignment but also for
editing it. That is why the "Delete the assignment" box can be found at the
bottom of the page. The "Delete" button there can be used to unassign the
exercise from the group.

The last unexplored area is the time and memory limits form. The whole form is
situated in a box with tabs, one for each runtime environment. If you do not
wish to use one of them, locate the "Remove" button at the bottom of the tab,
which will delete this environment from the assignment. Please note that this
action is irreversible.

In general, every tab in the environments box contains some basic information
about the runtime environment and another nested tabbed box. There you can find
all hardware groups which are available for the exercise and set the limits for
all test cases. The time limits have to be given in seconds (float), the memory
limits in bytes (int). If you are interested in reference values for a
particular test case, you can take a peek at the collapsible "Reference
solutions' evaluations" items. If you are satisfied with the changes you made
to the limits, save the form with the "Change limits" button right under the
environments box.
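
The expected format of the limit values can be illustrated by a small validation sketch. The function name, the returned dictionary and the requirement that both limits are positive are assumptions made for the example; the text above only fixes the units and the types.

```python
def parse_limits(time_text, memory_text):
    """Parse the limits of one test case from their textual form."""
    time_limit = float(time_text)    # seconds, fractional values are allowed
    memory_limit = int(memory_text)  # bytes, rejects values such as "1.5"
    if time_limit <= 0 or memory_limit <= 0:
        raise ValueError("limits must be positive")
    return {"time": time_limit, "memory": memory_limit}
```

For instance, a time limit of half a second and a memory limit of 64 MiB would be entered as `0.5` and `67108864`.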
### Students' solutions management
One of the most important tasks of a group supervisor is checking student
solutions. As automatic evaluation cannot catch all problems in the source
code, it is advisable to do a brief manual review of the student's coding style
and reflect it in the assignment bonus points.

On the "Assignment detail" page there is a "View student results" button near
the top of the page (next to the "Edit assignment settings" button). It will
redirect you to a page with a list of boxes, one box per student. Each student
box contains a list of submissions for this assignment. The row structure of the
submission list is the same as in the student's "Submitted solutions" box. More
information about each solution can be shown by clicking the "Show details" link
at the end of the solution row.

The details page is the same as for students with one exception -- there is an
additional collapsed box "Set bonus points". When unfolded, it contains an input
field for one number (a positive or negative integer) and the confirmation
button "Set bonus points". After filling in the intended number of points and
submitting the form, the data in the "Evaluation details" box is updated
immediately. To remove assigned bonus points, submit zero. The bonus points are
not additive; a newer value overrides the older one.
It is useful to give feedback about the solution to the student. For this you
can use the "Comments and notes" box. Make sure that the messages are not
private, so that the student can see them. A more detailed description of this
box is available in the student part of the user documentation.
One of the discussed concept was marking one solution as accepted. However, due
to lack of frontend developers it is not yet prepared in user interface. We
hope, it will be ready as soon as possible. The button for accepting a solution
8 years ago
will be most probably also on this page.
### Creating exercises
A link to exercise creation can be found in the exercises list, which is
accessible through the "Exercises" link in the sidebar. At the bottom of the
exercises list page you can find the "Add exercise" button, which redirects you
to the exercise editing page. At this moment the exercise is already created, so
if you just leave this page, the exercise stays in the database. This is also
the reason why the exercise creation form is the same as the exercise editing
form.

The exercise editing page is divided into three separate forms. The first one
contains meta information about the exercise, the second one is used for
uploading and managing supplementary files, and the third one manages the
runtime configurations in which the exercise can be executed.

The first form is located in the "Edit exercise settings" box and contains meta
information needed by the frontend, which is shown in various places of the user
interface. Here you can define:

- exercise name which will be visible to other supervisors
- difficulty of the exercise (easy, medium, hard)
- description which will be available only for visitors; it may be used for
  further description of the exercise (for example information about test cases
  and how they could be scored)
- private/public switch; if the exercise is private, then only you as the author
  can see it, assign it or modify it
- subform containing localized descriptions of the exercise; a new one can be
  added with the "Add language variant" button and the current one deleted with
  "Remove this language"
    - language in which this particular description is written (Czech, English,
      German)
    - actual localized description of the exercise

After all information is properly set, the form has to be submitted with the
"Edit settings" button.

Management of supplementary files can be found in the "Supplementary files" box.
Supplementary files are files which you can use further in job configurations,
which have to be provided in all runtime configurations. These files are
uploaded directly to the fileserver, from where a worker can download them and
use them during execution according to the job configuration.

Files can be uploaded either by drag and drop or with the standard "Add a file"
button. In the opened dialog window choose the file which should be uploaded.
All chosen files are immediately uploaded to the server, but to save the list of
supplementary files you have to hit the "Save supplementary files" button. All
previously uploaded files are visible right under the drag and drop area. Please
note that files are stored on the fileserver and cannot be deleted after upload.

The last form on the exercise editing page is the runtime configurations form.
An exercise can have multiple runtime configurations, one for each programming
language in which it can be run. Every runtime configuration corresponds to one
programming language, because each of them needs a slightly different job
configuration.

A new runtime configuration can be added with the "Add new runtime
configuration" button, which spawns a new tab in the runtime configurations box.
Here you can fill in the following:

- human readable identifier of the runtime configuration
- runtime environment which corresponds to the programming language
- job configuration in YAML; a detailed description of the job configuration can
  be found further in this chapter in the "Writing job configuration" section

When you are done with changes to the runtime configurations, save the form with
the "Change runtime configurations" button. If you want to delete a particular
runtime configuration, just hit the "Remove" button in the right tab. Please
note that after this operation the runtime configurations form has to be saved
again to apply the changes.

All runtime configurations which were added to the exercise will be visible to
supervisors and all of them can be used in an assignment, so please make sure
that all of the languages and job configurations are working.

If you choose to delete the exercise, you can find the "Delete the exercise" box
with the "Delete" button at the bottom of the exercise editing page. After
clicking on it, the exercise is deleted from the exercises list and is no longer
available.
### Exercise's reference solutions
Each exercise should have a set of reference solutions, which are used to tune
time and memory limits of assignments. Values of used time and memory for each
solution are displayed in yellow boxes under the forms for setting assignment
limits, as described earlier.

However, there is currently no user interface to upload and evaluate reference
solutions. It is possible to use direct REST API calls, but it is not very user
friendly. If you are interested, please look at the [API
documentation](https://recodex.github.io/api/), notably the sections
_Uploaded-Files_ and _Reference-Exercise-Solutions_. You need to upload the
reference solution files, create a new reference solution and then evaluate the
solution. After that, the measured data will be available in the box on the
assignment editing page (setting limits section).

We are now working on a better user interface, which will be available soon.
Its description will then be added here.
## Group administrator
A group administrator is a group supervisor with some additional permissions in
a particular group. Namely, the group administrator is capable of creating
subgroups in the managed group and also of adding and removing supervisors. Only
one person can be the administrator of a particular group.
### Creating subgroups and managing supervisors
There is no special link which gets you to the groups in which you are an
administrator, so you have to get there through the "Groups - supervisor" link
in the sidebar and choose the right group detail page. There you can see the
"Administrator controls" section, where you can either add a supervisor to the
group or create a new subgroup.

The form for creating a subgroup is present right on the group detail page in
the "Add subgroup" box. A group can be created with the following options:

- name which will be visible in the group hierarchy
- external identification, which can be for instance the ID of the group in the
  school system
- brief description of the group
- allow or deny users to see each other's statistics from assignments

After filling in all the information, the group can be created by clicking on
the "Create new group" button. If the creation is successful, the group is
visible in the "Groups hierarchy" box at the top of the page. All information
filled in during creation can be modified later.

Adding a supervisor to a group is rather easy. On the group detail page there is
an "Add supervisor" box which contains a text field. There you can type the name
or username of any user in the system. After filling in the user name, click on
the magnifier image or press the enter key and all matching users are searched
for. If your chosen supervisor is in the updated list, just click on the "Make
supervisor" button and the new supervisor should be successfully set.

An existing supervisor can also be removed from the group. On the group detail
page there is a "Supervisors" box in which all supervisors of the group are
visible. If you are the group administrator, you can see "Remove supervisor"
buttons right next to the supervisors' names. After clicking on the button, the
particular supervisor is no longer a supervisor of the group.
## Instance administrator
There can be only one instance administrator per instance. In addition to the
previous roles, this administrator should be able to modify the instance
details, manage licences and take care of the top level groups which belong to
the instance.
### Instance management
The list of all instances in the system can be found under the "Instances" link
in the sidebar. On that page there is a table of instances with their respective
admins. If you are one of them, you can visit the instance page by clicking on
the instance name. On the instance details page you can find a description of
the instance, the current groups hierarchy and a form for creating a new group.

If you want to change some of the instance settings, follow the "Edit instance"
link on the instance details page. It takes you to the instance editing page
with the corresponding form. There you can fill in the following information:

- name of the instance which will be visible to every other user
- brief description of the instance and for whom it is intended
- checkbox determining whether the instance is open (public) or not (private,
  hidden from potential users)

When you are done editing, save the filled in information by clicking on the
"Update instance" button.

If you go back to the instance details page, you can find there a "Create new
group" box which is used to add a group to the instance. This form is the same
as the one for creating a subgroup in an already existing group, so we skip the
description of the form fields. After successful creation, the group appears in
the "Groups hierarchy" box at the top of the page.
### Licences
On the instance details page, there is a "Licences" box. The first line shows
whether this instance currently has a valid licence. Then there are multiple
lines listing all licences assigned to this instance. Each line consists of a
note, the validity status (valid or revoked by the superadministrator) and the
last date of licence validity.

The "Add new licence" box is used for creating new licences. The required fields
are the note and the last day of validity. It is not possible to extend the
lifetime of a licence; a new one should be generated instead. It is possible to
have more than one valid licence at a time. Currently there is no user interface
for revoking licences; this is done manually by the superadministrator. If an
instance is to be disabled, all valid licences have to be revoked.
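
The licence rules described here can be summarised in a few lines of Python.
This is only an illustrative sketch, not the API implementation: the `Licence`
class, its field names and both helper functions are assumptions made for this
example, as are the notes and dates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Licence:
    note: str              # free-form note shown in the "Licences" box
    valid_until: date      # last day on which the licence is valid
    revoked: bool = False  # set manually by the superadministrator


def is_licence_valid(licence: Licence, today: date) -> bool:
    """A licence counts as valid if it is not revoked and has not expired."""
    return not licence.revoked and today <= licence.valid_until


def instance_has_valid_licence(licences: list[Licence], today: date) -> bool:
    """One valid licence is enough; several may be valid at the same time."""
    return any(is_licence_valid(lic, today) for lic in licences)


licences = [
    Licence("school year 2016/17", date(2017, 8, 31)),
    Licence("school year 2017/18", date(2018, 8, 31)),
]
today = date(2017, 10, 1)
print(instance_has_valid_licence(licences, today))  # True, the second one is valid

# Disabling an instance means revoking every licence that is still valid.
for lic in licences:
    lic.revoked = True
print(instance_has_valid_licence(licences, today))  # False
```

Because a lifetime cannot be extended, prolonging access in this model means
appending a new `Licence` with a later `valid_until` date.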
## Superadministrator
The superadministrator is the user with the most privileges, and as such the
superadmin should be quite a unique role. Ideally, there should be only one user
of this kind, used with special caution and adequate security. It follows that
the superadmin can perform any action the API is capable of.
### Users management
There are only a few user roles in ReCodEx. Basically there are only three:
_student_, _supervisor_, and _superadmin_. The base role is student, which is
assigned to every registered user. Roles are stored in the database alongside
other information about the user. A user always has exactly one role at a time.
At the first startup of ReCodEx, the administrator has to change the role of
his/her account manually in the database. After that, manual intervention in the
database should never be needed.

There is a little catch in groups and instances management. Groups can have
admins and supervisors. This setting is valid only for one particular group and
has to be kept separate from the basic role system. This implies that a
supervisor in one group can be a student in another and simultaneously have the
global supervisor role. Changing the role from student to supervisor and back is
done automatically when the new privileges are granted to the user, so managing
roles by hand in the database is not needed. The same applies to instances as
well, but instances can only have admins.

Roles description:

- Student -- The default role which is used for newly created accounts. A
  student can join or leave public groups and submit solutions of assigned
  exercises.
- Supervisor -- Inherits all permissions from the student role. Can manage the
  groups to which he/she belongs. A supervisor can also view and change group
  details, manage assigned exercises, and view students in the group and their
  solutions for assigned exercises. On top of that, a supervisor can create and
  delete groups too, but only as subgroups of groups he/she belongs to.
- Superadmin -- Inherits all permissions from the supervisor role. The most
  powerful user in ReCodEx, who should be able to access any functionality
  provided by the application.

## Writing score configuration
An important thing about an assignment is how points are assigned to particular
solutions. As mentioned previously, the whole job is composed of logical tests.
Each of these tests has to contain one essential "evaluation" task. The
evaluation task should output one float number which can be further used for
scoring of the particular test.

The total resulting score of the student's solution is then calculated according
to a supplied score config (described below) using the specified calculator. The
total score is also a float between 0 and 1. This number is then multiplied by
the maximum number of points awarded for the assignment by the teacher assigning
the exercise -- not the exercise author.

For now, there is only one way to write a score configuration, using the simple
score calculator. But the implementation in the API is flexible enough to handle
upcoming score calculators which might use more complex scoring algorithms. This
also means that future calculators do not have to use the YAML format for
configuration. In fact, the configuration can be a string in any format.
### Simple score calculation
The first implemented calculator is the simple score calculator with test
weights. This calculator just looks at the score of each test and puts them
together according to the test weights specified in the assignment
configuration. The resulting score is calculated as the sum of products of the
score and weight of each test, divided by the sum of all weights. The algorithm
in Python would look something like this:
```
# `tests` is any iterable of objects with `score` and `weight` attributes
total = 0
weight_sum = 0
for t in tests:
    total += t.score * t.weight
    weight_sum += t.weight
score = total / weight_sum
```
Sample score config in YAML format:
```{.yml}
testWeights:
    a: 300 # test with id 'a' has a weight of 300
    b: 200
    c: 100
    d: 100
```
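
To make the arithmetic concrete, the following sketch applies the weighted
average to the sample weights above and then scales the result by the maximum
number of points of an assignment. The per-test scores and the maximum of 10
points are made-up values, and `simple_score` is a name chosen for this example
only.

```python
def simple_score(test_scores, test_weights):
    """Weighted average of per-test scores; both arguments are dicts keyed by test id."""
    weighted = sum(test_scores[test_id] * w for test_id, w in test_weights.items())
    return weighted / sum(test_weights.values())


# Weights taken from the sample score config.
test_weights = {"a": 300, "b": 200, "c": 100, "d": 100}
# Hypothetical outputs of the evaluation tasks: tests a and c passed, b and d failed.
test_scores = {"a": 1.0, "b": 0.0, "c": 1.0, "d": 0.0}

score = simple_score(test_scores, test_weights)  # (300 + 100) / 700
max_points = 10                                  # hypothetical maximum set by the supervisor
points = score * max_points

print(round(score, 4), round(points, 2))  # 0.5714 5.71
```

Note that only the ratio of the weights matters: weights 3, 2, 1, 1 would give
the same score.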
## Writing job configuration
To run and evaluate an exercise, the backend needs to know the steps to do that.
This is different for each environment (operating system, programming language,
etc.), so each environment needs to have a separate configuration.

The backend works with a powerful, but quite low level description of simple
connected tasks written in YAML syntax. More about the syntax and a general task
overview can be found on a [separate
page](https://github.com/ReCodEx/wiki/wiki/Assignments). One of the planned
features was a user friendly configuration editor, but due to a tight deadline
and the team composition it did not make it into the first release. However,
writing the configuration in the basic format will always be available and
allows users to use the full expressive power of the system.

This section walks through the creation of a job configuration for a _hello
world_ exercise. The goal is to compile the file _source.c_ and check whether it
prints `Hello World!` to the standard output. This is the only test case; let's
call it **A**.

The problem can be split into several tasks:

- compile _source.c_ into _helloworld_ with `/usr/bin/gcc`
- run _helloworld_ and save the standard output into _out.txt_
- fetch the predefined output (suppose it is already uploaded to the fileserver)
  with hash `a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b` to _reference.txt_
- compare _out.txt_ and _reference.txt_ with `/usr/bin/diff`

The absolute paths of the tools can be obtained from the system administrator.
However, `/usr/bin/gcc` is the location where the GCC binary is available almost
everywhere, so the location of some tools can be (professionally) guessed.

First, write the header of the job to the configuration file.
```{.yml}
submission:
    job-id: hello-world-job
    hw-groups:
        - group1
```
Basically it means that the job _hello-world-job_ needs to be run on workers
that belong to the `group1` hardware group. Reference files are downloaded from
the default location configured in the API (such as
`http://localhost:9999/exercises`) if not stated explicitly otherwise. The job
execution log will not be saved to the result archive.

Next, the tasks have to be constructed under the _tasks_ section. In this demo
job, every task depends only on the previous one. The first task has the input
file _source.c_ (if submitted by the user) already available in the working
directory, so it just calls GCC. Compilation is run in a sandbox like any other
external program and should have relaxed time and memory limits. In this
scenario, worker defaults are used. If the compilation fails, the whole job is
immediately terminated (because the _fatal-failure_ bit is set). Because the
_bound-directories_ option in the sandbox limits section is mostly shared
between all tasks, it can be set in the worker configuration instead of the job
configuration (suppose this for the following tasks). For the configuration of
workers please contact your administrator.
```{.yml}
- task-id: "compilation"
  type: "initiation"
  fatal-failure: true
  cmd:
      bin: "/usr/bin/gcc"
      args:
          - "source.c"
          - "-o"
          - "helloworld"
  sandbox:
      name: "isolate"
      limits:
          - hw-group-id: group1
            chdir: ${EVAL_DIR}
            bound-directories:
                - src: ${SOURCE_DIR}
                  dst: ${EVAL_DIR}
                  mode: RW
```
The compiled program is executed with time and memory limits set and the
standard output is redirected to a file. This task depends on the _compilation_
task, because the program cannot be executed without being compiled first. It is
important to mark this task with the _execution_ type, so that exceeded limits
are reported in the frontend.

Time and memory limits set directly for a task have higher priority than worker
defaults. One important constraint is that these limits cannot exceed the limits
set by workers. Worker defaults are present as a safety measure so that a
malformed job configuration cannot block the worker forever. Worker default
limits should be reasonably high, like a gigabyte of memory and several hours of
execution time. For exact numbers please contact your administrator.

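
The precedence rule can be illustrated with a short Python sketch. It is not the
worker's actual code: the default values, the units (seconds and kilobytes) and
the decision to reject an oversized limit with an error are all assumptions made
for this example.

```python
# Hypothetical worker defaults: three hours of time, one gigabyte of memory.
WORKER_DEFAULTS = {"time": 3 * 3600.0, "memory": 1024 * 1024}


def effective_limits(task_limits, worker_defaults=WORKER_DEFAULTS):
    """Task limits override worker defaults but must not exceed them.

    Only keys present in the worker defaults ("time", "memory") are handled.
    """
    limits = dict(worker_defaults)
    for key, value in task_limits.items():
        if value > worker_defaults[key]:
            raise ValueError(
                f"{key} limit {value} exceeds worker maximum {worker_defaults[key]}"
            )
        limits[key] = value
    return limits


# A task with its own limits uses them ...
print(effective_limits({"time": 0.5, "memory": 8192}))
# ... while a task without limits falls back to the worker defaults.
print(effective_limits({}))
```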
It is important to know that if the output of a program (both standard and
error) is redirected to a file, the sandbox disk quotas apply to that file, as
well as to the files created directly by the program. In case the outputs are
ignored, they are redirected to `/dev/null`, which means there is no limit on
the output length (as long as the printing fits in the time limit).
```{.yml}
- task-id: "execution_1"
  test-id: "A"
  type: "execution"
  dependencies:
      - compilation
  cmd:
      bin: "helloworld"
  sandbox:
      name: "isolate"
      stdout: ${EVAL_DIR}/out.txt
      limits:
          - hw-group-id: group1
            chdir: ${EVAL_DIR}
            bound-directories:
                - src: ${SOURCE_DIR}
                  dst: ${EVAL_DIR}
                  mode: RW
            time: 0.5
            memory: 8192
```
Fetch the sample solution from the file server. The base URL of the file server
is already known (from the job header or, as in this example, from the default
location configured in the API), so only the name of the required file (its
`sha1sum` in our case) is necessary.
```{.yml}
- task-id: "fetch_solution_1"
  test-id: "A"
  dependencies:
      - execution_1
  cmd:
      bin: "fetch"
      args:
          - "a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"
          - "${SOURCE_DIR}/reference.txt"
```
The comparison of results is quite straightforward. It is important to set the
task type to _evaluation_, so that the return code is set to 0 if the program is
correct and 1 otherwise. We do not set our own limits, so the default limits are
used.
```{.yml}
- task-id: "judge_1"
  test-id: "A"
  type: "evaluation"
  dependencies:
      - fetch_solution_1
  cmd:
      bin: "/usr/bin/diff"
      args:
          - "out.txt"
          - "reference.txt"
  sandbox:
      name: "isolate"
      limits:
          - hw-group-id: group1
            chdir: ${EVAL_DIR}
            bound-directories:
                - src: ${SOURCE_DIR}
                  dst: ${EVAL_DIR}
                  mode: RW
```
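
Because every task names its prerequisites by task id, a mistyped id is easy to
miss. The following is a minimal sketch of a sanity check one could run before
uploading a configuration. It is not part of ReCodEx and says nothing about how
the worker itself orders tasks; the helper names are made up for this
illustration, and the dependency of the fetch task is taken to be
`execution_1`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def missing_dependencies(tasks):
    """Return (task, dependency) pairs whose dependency is not a defined task id."""
    return [(t, d) for t, deps in tasks.items() for d in deps if d not in tasks]


def execution_order(tasks):
    """Return one valid order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(tasks).static_order())


# Task ids of the hello world job mapped to the ids they depend on.
tasks = {
    "compilation": [],
    "execution_1": ["compilation"],
    "fetch_solution_1": ["execution_1"],
    "judge_1": ["fetch_solution_1"],
}
assert not missing_dependencies(tasks)
print(execution_order(tasks))
# ['compilation', 'execution_1', 'fetch_solution_1', 'judge_1']

# A mistyped dependency name is reported instead of being silently accepted.
broken = dict(tasks, fetch_solution_1=["execution"])
print(missing_dependencies(broken))
# [('fetch_solution_1', 'execution')]
```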
# The Backend
The backend is the part which is hidden from the user and which has only
one purpose: to evaluate users' solutions of their assignments.
@todo: describe the configuration inputs of the Backend
@todo: describe the outputs of the Backend
@todo: describe how the backend receives the inputs and how it
communicates the results
## Components
The whole backend is not just one service or component; it is quite a complex system on its own.
@todo: describe the inner parts of the Backend (and refer to the Wiki
for the technical description of the components)
### Broker
@todo: gets stuff done, single point of failure and center point of ReCodEx universe
### Fileserver
@todo: stores particular data from frontend and backend, hashing, HTTP API
### Worker
@todo: describe a bit of internal structure in general
@todo: describe how jobs are generally executed
### Monitor
@todo: not necessary component which can be omitted, proxy-like service
## Backend internal communication
@todo: internal backend communication, what communicates with what and why
The Frontend
============
The frontend is the part which is visible to the user of ReCodEx and
which holds the state of the system: the user accounts, their roles in
the system, the database of exercises, the assignments of these
exercises to groups of users (i.e., students), and the solutions and
their evaluations.

The frontend is split into three parts:

- the server-side REST API (“API”) which holds the business logic and
  keeps the state of the system consistent
- the relational database (“DB”) which persists the state of the
  system
- the client side application (“client”) which simplifies access to
  the API for the common users

The centerpiece of this architecture is the API. This component receives
requests from the users and from the Backend, validates them, modifies
the state of the system and persists this modified state in the DB.

We have created a web application which can communicate with the API
server and present the information received from the server to the user
in a convenient way. The client can, however, be any application which
can send HTTP requests and receive HTTP responses. Users can use general
applications like [cURL](https://github.com/curl/curl/) or
[Postman](https://www.getpostman.com/), or create their own specific
client for the ReCodEx API.
Frontend capabilities
---------------------
@todo: describe what the frontend is capable of and how it really works,
what are the limitations and how it can be extended
Terminology
-----------
This project was created for the needs of a university and this fact is
reflected in the terminology used throughout the Frontend. Definitions of
the important terms follow to make their meaning unambiguous.
### User and user roles
*User* is a person who uses the application. A user is granted access to
the application once he or she creates an account directly through the
API or the web application. There are several types of user accounts
depending on the set of permissions -- a so called “role” -- they have
been granted. Each user receives only the most basic set of permissions
after he or she creates an account and this role can be changed only by
the administrators of the service:

- *Student* is the most basic role. A student can become a member of a
  group and submit his solutions to his assignments.
- *Supervisor* can be entitled to manage a group of students.
  A supervisor can assign exercises to the students who are members of
  his groups and review their solutions submitted to
  these assignments.
- *Super-admin* is a user with unlimited rights. This user can perform
  any action in the system.

There are two implicit changes of roles:

- Once a *student* is added to a group as its supervisor, his role is
  upgraded to the *supervisor* role.
- Once a *supervisor* is removed from the last group where he is a
  supervisor, his role is downgraded to the *student* role.

These mechanisms do not prevent a single user from being a supervisor of
one group and a student of a different group, as supervisor's permissions
are a superset of student's permissions.
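
The two implicit role changes can be sketched as follows. The `User` class, its
fields, the group ids and the function names are hypothetical and serve only to
illustrate the rule; the real API stores roles and group memberships in the
database.

```python
from dataclasses import dataclass, field

STUDENT, SUPERVISOR, SUPERADMIN = "student", "supervisor", "superadmin"


@dataclass
class User:
    name: str
    role: str = STUDENT                                  # every new account starts as a student
    supervised_groups: set = field(default_factory=set)  # ids of groups the user supervises


def add_supervisor(user: User, group_id: str) -> None:
    """Make the user a supervisor of the group; a student is upgraded implicitly."""
    user.supervised_groups.add(group_id)
    if user.role == STUDENT:
        user.role = SUPERVISOR


def remove_supervisor(user: User, group_id: str) -> None:
    """Remove the user from the group's supervisors; downgrade after the last group."""
    user.supervised_groups.discard(group_id)
    if user.role == SUPERVISOR and not user.supervised_groups:
        user.role = STUDENT


alice = User("alice")
add_supervisor(alice, "prg1-2016")
add_supervisor(alice, "prg1-2017")
remove_supervisor(alice, "prg1-2016")
print(alice.role)  # supervisor, one supervised group is left
remove_supervisor(alice, "prg1-2017")
print(alice.role)  # student
```

A super-admin is never touched by either function, so that role can only be
changed explicitly.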
### Login
*Login* is a set of user's credentials which must be submitted to verify
that the user is allowed to access the system as a specific user. We
distinguish two types of logins: local and external.

- *Local login* is the user's email address and a password chosen
  during registration.
- *External login* is a mapping of a user profile to an account of
  some authentication service (e.g., [CAS](https://ldap1.cuni.cz/)).
### Instance
*An instance* of ReCodEx is in fact just a set of groups and user
accounts. An instance should correspond to a real entity such as a
university, a high school, an IT company or an HR agency. This approach
enables the system to be shared by multiple independent organizations
without interfering with each other.

Usage of the system by the users of an instance can be conditioned on
possessing a valid licence. It is up to the administrators of the system
to determine the conditions under which they will assign licences to the
instances.
### Group
*Group* corresponds to a school class or some other unit which gathers
users who will be assigned the same set of exercises. Each group can have
multiple supervisors who can manage the students and the list of
assignments.

Groups can form a tree hierarchy of arbitrary depth. This is inspired by the
hierarchy of school classes belonging to the same subject over several school
years. For example, there can be a top level group for a programming class that
contains subgroups for every school year. These groups can then be divided into
actual student groups with respect to lab attendance. Supervisors can create
subgroups of their groups and further manage these subgroups.
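
The example hierarchy from the previous paragraph can be modelled as a simple
tree. The `Group` class and the group names below are invented for illustration
and do not mirror the actual database entities.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    subgroups: list = field(default_factory=list)

    def add_subgroup(self, name: str) -> "Group":
        """Create a child group and return it so that it can be nested further."""
        child = Group(name)
        self.subgroups.append(child)
        return child

    def walk(self, depth: int = 0):
        """Yield (depth, name) pairs in depth-first order."""
        yield depth, self.name
        for child in self.subgroups:
            yield from child.walk(depth + 1)


course = Group("Programming I")             # top level group for a class
year = course.add_subgroup("2016/2017")     # one subgroup per school year
year.add_subgroup("Lab Monday 10:40")       # actual student groups
year.add_subgroup("Lab Tuesday 14:00")
course.add_subgroup("2017/2018")

for depth, name in course.walk():
    print("  " * depth + name)
```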
### Exercise
*An exercise* consists of a textual assignment of a task and a definition
of how a solution to this exercise should be processed and evaluated in
a specific runtime environment (i.e., how to compile a submitted source
code and how to test the correctness of the program). It is a template
which can be instantiated as an *assignment* by a supervisor of a group.
### Assignment
An assignment is an instance of an *exercise* assigned to a specific
*group*. An assignment can modify the text of the task assignment and it
has some additional information which is specific to the group (e.g., a
deadline, the number of points gained for a correct solution, additional
hints for the students in the assignment). The text of the assignment
can be edited and supervisors can translate the assignment into another
language.
### Solution
*A solution* is a set of files which a user submits to a given
*assignment*.
### Submission
*A submission* corresponds to a *solution* being evaluated by the
Backend. A single *solution* can be submitted repeatedly (e.g., when the
Backend encounters an error or when the supervisor changes the assignment).
### Evaluation
*An evaluation* is the processed report received from the Backend after
a *submission* is processed. Evaluation contains points given to the
user based on the quality of his solution measured by the Backend and
the settings of the assignment. Supervisors can review the evaluation
and add bonus points (both positive and negative) if the student
deserves some.
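
A small sketch may help to see how the automatically computed points and the
bonus points fit together. The `Evaluation` class below is hypothetical, and it
assumes that the bonus is simply added to the computed points; the source only
states that bonus points can be positive or negative.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    score: float           # value between 0 and 1 derived from the Backend report
    max_points: int        # maximum points set for the assignment
    bonus_points: int = 0  # set manually by a supervisor, may be negative

    def set_bonus_points(self, value: int) -> None:
        """Store the bonus; the new value replaces any previously set one."""
        self.bonus_points = value

    @property
    def total_points(self) -> float:
        # Assumption: the bonus is added to the automatically computed points.
        return self.score * self.max_points + self.bonus_points


evaluation = Evaluation(score=0.5, max_points=10)
evaluation.set_bonus_points(2)
evaluation.set_bonus_points(-1)  # replaces the previous bonus of 2
print(evaluation.total_points)   # 4.0
```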
### Runtime environment
*A runtime environment* defines the used programming language or tools
which are needed to process and evaluate a solution. Examples of a
runtime environment can be:
- *Linux + GCC*
- *Linux + Mono*
- *Windows + .NET 4*
- *Bison + Yacc*
### Limits
A correct *solution* of an *assignment* has to pass all specified tests (mostly
checks that it yields the correct output for various inputs) and typically must
also be efficient in some sense. The Backend measures the time and memory
consumption of the solution while running. This consumption of resources can be
*limited* and the solution will receive fewer points if it exceeds the given
limits in some test cases defined by the *exercise*.
User management
---------------
@todo: roles and their rights, adding/removing different users, how the
role of a specific user changes
Instances and hierarchy of groups
---------------------------------
@todo: What is an instance, how to create one, what are the licences and
how do they work. Why can the groups form hierarchies and what are the
benefits -- what it means to be an admin of a group, hierarchy of roles
in the group hierarchy.
Exercises database
------------------
@todo: How the exercises are stored, accessed, who can edit what
### Creating a new exercise
@todo Localized assignments, default settings
### Runtime environments and hardware groups
@todo read this later and see if it still makes sense
ReCodEx is designed to utilize a rather diverse set of workers -- there can be
differences in many aspects, such as the actual hardware running the worker
(which impacts the results of measuring) or installed compilers, interpreters
and other tools needed for evaluation. To address these two examples in
particular, we assign runtime environments and hardware groups to exercises.
The purpose of runtime environments is to specify which tools (and often also
operating system) are required to evaluate a solution of the exercise -- for
example, a C# programming exercise can be evaluated on a Linux worker running
Mono or a Windows worker with the .NET runtime. Such exercise would be assigned
two runtime environments, `Linux+Mono` and `Windows+.NET` (the environment names
are arbitrary strings configured by the administrator).
A hardware group is a set of workers that run on similar hardware (e.g. a
particular quad-core processor model and an SSD drive). Workers are assigned
to these groups by the administrator. If this is done correctly, performance
measurements of a submission should yield the same results on every worker in
the group. Thanks to this fact, we can use the same resource limits on every
worker in a hardware group.
However, limits can differ between runtime environments -- formally speaking,
limits are a function of three arguments: an assignment, a hardware group and a
runtime environment.
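
Viewing limits as a function of three arguments maps naturally to a lookup table
keyed by a triple. The sketch below is illustrative only: all identifiers and
numbers are invented, and the real system stores limits in its own data model.

```python
# Keys are (assignment, hardware group, runtime environment); all values are made up.
LIMITS = {
    ("hello-world", "group1", "Linux+GCC"): {"time": 0.5, "memory": 8192},
    ("hello-world", "group1", "Linux+Mono"): {"time": 1.0, "memory": 65536},
    ("hello-world", "group2", "Linux+GCC"): {"time": 0.8, "memory": 8192},
}


def get_limits(assignment, hw_group, runtime_env):
    """Look up limits; a missing combination is an error, not a silent default."""
    try:
        return LIMITS[(assignment, hw_group, runtime_env)]
    except KeyError:
        raise LookupError(
            f"no limits for {assignment} on {hw_group} with {runtime_env}"
        ) from None


# Same assignment and hardware group, different runtime environment:
print(get_limits("hello-world", "group1", "Linux+GCC"))
print(get_limits("hello-world", "group1", "Linux+Mono"))
```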
### Reference solutions
@todo: how to add one, how to evaluate it
The task of determining appropriate resource limits for exercises is difficult
to do correctly. To aid exercise authors and group supervisors, ReCodEx supports
assigning reference solutions to exercises. Those are example programs that
should cover the main approaches to the implementation. For example, searching
for an integer in an ordered array can be done with a linear search, or better,
using a binary search.
Reference solutions can be evaluated on demand, using a selected hardware group.
The evaluation results are stored and can be used later to determine limits. In
our example problem, we could configure the limits so that the linear
search-based program doesn't finish in time on larger inputs, but a binary
search does.
Note that separate reference solutions should be supplied for all supported
runtime environments.
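
For the search example, a pair of reference solutions could look like the
following Python sketch. Both functions return the index of the target or -1;
the function names and the test data are chosen for this illustration only.

```python
from bisect import bisect_left


def linear_search(items, target):
    """O(n) reference solution: scan the array until the target is found."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1


def binary_search(items, target):
    """O(log n) reference solution for sorted input, built on bisect_left."""
    i = bisect_left(items, target)
    return i if i < len(items) and items[i] == target else -1


data = list(range(0, 2_000_000, 2))  # one million sorted even numbers

# Both solutions agree on the answers; they differ only in running time.
assert linear_search(data, 1_999_998) == binary_search(data, 1_999_998) == 999_999
assert linear_search(data, 7) == binary_search(data, 7) == -1
```

Evaluating both programs on the same hardware group gives the two measured
times between which a limit can be placed.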
### Exercise assignments
@todo: Creating instances of an exercise for a specific group of users,
capabilities of settings. Editing limits according to the reference
solution.
Evaluation process
------------------
@todo: How the evaluation process works on the Frontend side.
### Uploading files and file storage
@todo: One by one upload endpoint. Explain different types of the
Uploaded files.
### Automatic detection of the runtime environment
@todo: Users must submit correctly named files assuming the RTE from
the extensions.
REST API implementation
-----------------------
@todo: What is the REST API, what are the basic principles -- GET, POST,
Headers, JSON.
### Authentication and authorization scopes
@todo: How authentication works -- signed JWT, headers, expiration,
refreshing. Token scopes usage.
### HTTP requests handling
@todo: Router and routes with specific HTTP methods, preflight, required
headers
### HTTP responses format
@todo: Describe the JSON structure convention of success and error
responses
### Used technologies
@todo: PHP7: how it is used for typehints. Nette framework: how it is
used for routing, Presenters, actions, endpoints, exceptions and
ErrorPresenter. Doctrine 2: database abstraction, entities and
repositories + conventions. Communication over ZMQ: describe the
problem with the extension, how we reported it and how to treat it in
the future when the bug is solved. Relational database: we use MariaDB;
Doctrine enables us to switch to a different engine if needed.
### Data model
@todo: Describe the code-first approach using the Doctrine entities, how
the entities map onto the database schema (refer to the attached schemas
of entities and relational database models), describe the logical
grouping of entities and how they are related:
- user + settings + logins + ACL
- instance + licences + groups + group membership
- exercise + assignments + localized assignments + runtime
environments + hardware groups
- submission + solution + reference solution + solution evaluation
- comment threads + comments
### API endpoints
@todo: Tell the user about the generated API reference and how the
Swagger UI can be used to access the API directly.
Web Application
---------------
@todo: What is the purpose of the web application and how it interacts
with the REST API.
### Used technologies
@todo: Briefly introduce the used technologies like React, Redux and the
build process. For further details refer to the GitHub wiki
### How to use the application
@todo: Describe the user documentation and the FAQ page.
Backend-Frontend communication protocol
=======================================
@todo: describe the exact methods and respective commands for the
communication
Initiation of a job evaluation
------------------------------
@todo: How does the Frontend initiate the evaluation and how the Backend
can accept it or decline it
Job processing progress monitoring
----------------------------------
While evaluating a job, the worker sends progress messages at predefined points
of the evaluation chain. A message can be sent at the very beginning of the
job, when the submission archive is downloaded, or at the end of each simple
task, together with the task's state (completed, failed, skipped). These
messages are sent to the broker through the existing ZeroMQ connection. The
detailed message format can be found on the [communication
page](https://github.com/ReCodEx/wiki/wiki/Overall-architecture#commands-from-worker-to-broker).
The broker only forwards the received progress messages to the monitor
component via a ZeroMQ socket. The output message format is the same as the
input format.
The monitor converts the received messages to JSON, which is easy to work with
in JavaScript inside the web application. All messages are cached (one queue
per job) and can be obtained multiple times through the WebSocket communication
channel. The cache is cleared 5 minutes after the last message is received.
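
The caching behaviour described above (one queue per job, repeated reads, expiry 5 minutes after the last message) can be sketched as follows. This is not the monitor's actual code: the class `ProgressCache`, its methods and the injectable `clock` parameter are assumptions made for the example.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List

CACHE_TTL_SECONDS = 5 * 60  # cleared 5 minutes after the last message


class ProgressCache:
    """Hypothetical per-job message cache with time-based expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._messages: Dict[str, List[dict]] = defaultdict(list)
        self._last_seen: Dict[str, float] = {}

    def add(self, job_id: str, message: dict) -> None:
        """Append a message to the job's queue and refresh its timestamp."""
        self._messages[job_id].append(message)
        self._last_seen[job_id] = self._clock()

    def get(self, job_id: str) -> List[dict]:
        """Return a copy of all cached messages; may be called repeatedly."""
        self.purge()
        return list(self._messages.get(job_id, []))

    def purge(self) -> None:
        """Drop every queue whose last message is older than the TTL."""
        now = self._clock()
        expired = [job for job, seen in self._last_seen.items()
                   if now - seen >= CACHE_TTL_SECONDS]
        for job in expired:
            del self._messages[job]
            del self._last_seen[job]
```

Passing the clock in as a parameter is a design choice that keeps the expiry logic testable without waiting for real time to pass.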
Publishing of the results
-------------------------
After the job finishes, the worker packs the results directory into a single
archive and uploads it to the fileserver over HTTP. The target URL is obtained
from the API in the headers on job initiation. Then a "job done" notification
is sent to the API via the broker. Results of special submissions (reference or
asynchronous submissions) are loaded immediately; results of other submissions
are loaded on demand, on the first results request.
Loading the results means fetching the archive from the fileserver, parsing the
main YAML file generated by the worker and saving the data to the database. The
score calculator also assigns points during this step.
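
The difference between immediate and on-demand loading can be sketched as below. The names `Submission`, `on_job_done`, `get_results` and the `kind` values are hypothetical, and the `load` callable stands in for fetching the archive, parsing the YAML file and storing the data.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed labels for the "special" submissions that are loaded immediately.
EAGER_KINDS = {"reference", "asynchronous"}


@dataclass
class Submission:
    kind: str
    results: Optional[dict] = None  # None until the results are loaded


def on_job_done(submission: Submission, load: Callable[[], dict]) -> None:
    """Handle the "job done" notification: load only special submissions."""
    if submission.kind in EAGER_KINDS:
        submission.results = load()


def get_results(submission: Submission, load: Callable[[], dict]) -> dict:
    """Return the results, loading them on the first request if needed."""
    if submission.results is None:
        submission.results = load()
    return submission.results
```

With this arrangement the loading work runs at most once per submission, either at notification time or on the first results request.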
<!---
// vim: set formatoptions=tqn flp+=\\\|^\\*\\s* textwidth=80 colorcolumn=+1:
-->