<!---
Notes:

* Two-page introduction - what the system should be able to do
* Analysis - what we decide to do, how it could be done, assign importance
  - then we can refer to it to explain why we did not manage something, include
  advanced features too - for features, note that they are planned for future
  versions - what is important and what is not!! Use this to justify which
  subset of features to keep, the architecture will then be easier to describe
* Explain the architecture in the analysis
* Keep related work as a separate chapter
* Order - requirements -> related work -> analysis
* The administrator and the exercise author must understand how the components
  are interconnected - a general chapter in the analysis - the original analysis
  chapter was good, only the list of messages or whatever is mixed into it - that
  does not interest everyone
* After the general introduction - split by potential reader - teacher user,
  then admin user
* Installation documentation aside, as the last part
* User documentation - admin: description of permissions, exercise author: the
  most extensive part, script format - but phrase it as a description of what to
  click where, describe the language separately - it will be irrelevant in the
  future, much more depth is needed - describe in detail what they do, including
  for example relative/absolute paths, macros, where the compiler sees libraries
  and headers... - chapter at the end
* User documentation for the student: explanation
* How an exercise is scored - hard to say where it belongs - somewhere at the
  beginning? But it concerns all roles, the teacher must know how to configure
  it - mention also how to score by time and memory (in the analysis or in the
  introduction) - multiple outputs from the judge, interpolation of points by
  memory usage... it is rather outside the user documentation
* Do not describe which button to click where
* Tutorials - scenarios, what to do when I want something, sample walkthroughs
* For forms it is best when no documentation is needed, add labels to the form
  fields
* Describe configs somewhere separately in the documentation - score, YAML -
  reference documentation
* Definitely not a FAQ, more structured
* Installation together at the end
* Programmer documentation - "fewest readers" - we already have something there,
  no need to put it in the printed documentation - put a link to the wiki in the
  printed documentation, but something must be in print - which language, design
  decisions - do not put the justification into the introductory analysis -
  write an introduction to the reference documentation - "we approached the REST
  API this way, it is divided into these groups, ..."
* What the chosen architecture means, it should also give something to a user
  who does not know the architecture, where the state is held
* The documentation must make clear what a library does and what has to be done
  by hand - how much work it is - write more for a user who knows the
  technologies but not the libraries
* Have sympathy for those who do not know that much - both the technologies and
  the architecture and the CodEx system
* Page numbers do not match
* Downloading a ZIP with the backend outputs - sort into public and secret,
  public ones also for the student
-->

# Introduction

Generally, there are many different ways and opinions on how to teach people
something new. However, most people agree that hands-on experience is one of
the best ways to make the human brain remember a new skill. Learning must be
entertaining and interactive, with fast and frequent feedback. Some kinds of
knowledge are more suitable for this practical type of learning than others, and
fortunately, programming is one of them.

The university education system is one of the areas where this knowledge can be
applied. In computer programming, there are several requirements on the code,
such as being syntactically correct, efficient, and easy to read, maintain and
extend. Correctness and efficiency can be tested automatically to help teachers
save time for their research, but reviewing bad design, bad coding habits and
logical mistakes is really hard to automate and requires manpower.

Checking programs written by students takes a lot of time and requires a lot of
mechanical, repetitive work. The first idea of an automatic evaluation system
came from Stanford University professors in 1965. They implemented a system
which evaluated code in Algol submitted on punch cards. In the following years,
many similar products were written.

There are two basic ways of automatically evaluating code -- statically
(checking the code without running it; safe, but not very precise) or
dynamically (running the code on test inputs and checking the outputs against
reference ones; needs sandboxing, but provides good real-world experience).

This project focuses on the machine-controlled part of source code evaluation.
First, general concepts of grading systems are observed, new requirements are
specified and projects with similar functionality are examined. Also, problems
of the software previously used at Charles University in Prague are briefly
discussed. With the knowledge acquired from such projects in production, we set
up goals for the new evaluation system, designed the architecture and
implemented a fully operational solution. The system is now ready for production
testing at the university.

## Assignment

The major goal of this project is to create a grading application that will be
used for programming classes at the Faculty of Mathematics and Physics of the
Charles University in Prague. However, the application should be designed in a
modular fashion so that it can be easily extended or even modified to make other
ways of usage possible.

The system should be capable of dynamic analysis of programming code. This means
that the following four basic steps have to be supported:

1. compile the code and check for compilation errors
2. run the compiled binary in a sandbox with predefined inputs
3. check constraints on the amount of memory and time used
4. compare program outputs with predefined values

The project has a great starting point -- there is an old grading system
currently used at the university (CodEx), so its flaws and weaknesses can be
addressed. Furthermore, many teachers want to use and test the new system and
they are willing to discuss ideas or problems with us during development.

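The four steps of dynamic analysis listed in this section can be illustrated
with a minimal Python sketch. It is not the ReCodEx implementation: the function
name `evaluate` and the verdict strings are hypothetical, the "compilation" step
is only a syntax check of a Python solution, memory limits are left out, and a
plain subprocess is used instead of a real sandbox.

```python
import subprocess
import sys


def evaluate(source: str, stdin_data: str, expected: str, time_limit: float) -> str:
    """Return a verdict for one test case of a Python solution (illustrative only)."""
    # Step 1: "compile" the code and report compilation (syntax) errors.
    try:
        compile(source, "<solution>", "exec")
    except SyntaxError:
        return "compilation error"

    # Step 2: run the program on a predefined input.
    # WARNING: a plain subprocess is NOT a sandbox; a real system needs isolation.
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            input=stdin_data,
            capture_output=True,
            text=True,
            timeout=time_limit,  # Step 3: only the time limit is enforced here
        )
    except subprocess.TimeoutExpired:
        return "time limit exceeded"
    if result.returncode != 0:
        return "runtime error"

    # Step 4: compare the output with the reference, ignoring whitespace layout.
    if result.stdout.split() == expected.split():
        return "ok"
    return "wrong answer"
```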
### Intended usage

The whole system is intended to help both teachers (supervisors) and students.
To achieve this, it is crucial to keep in mind typical usage scenarios of the
system and to make these tasks as simple as possible.

The system has a database of users. Each user is assigned a role, which
corresponds to his/her privileges. There are user groups reflecting the
structure of lectured courses. Groups can be hierarchically ordered to reflect
additional metadata such as the academic year. For example, a reasonable group
hierarchy can look like this:

```
Summer term 2016
|-- Language C# and .NET platform
|   |-- Labs Monday 10:30
|   `-- Labs Thursday 9:00
|-- Programming I
|   |-- Labs Monday 14:00
...
```

In this example, students are members of the leaf groups; the higher-level
entities are just for keeping the related groups together. The hierarchy
structure can be modified to fit the specific needs of the university or any
other organization; even a flat structure (i.e., no hierarchy) is possible. One
user can be part of multiple groups, and one group can have multiple users. Each
user can have a specific role in every group of which he/she is a member,
overriding his/her default role in this context.

A database of exercises (algorithmic problems) is another part of the project.
Each exercise consists of a text in multiple language variants, an evaluation
configuration and a set of inputs and reference outputs. Exercises are created
by instructed privileged users. Assigning an exercise to a group means choosing
one of the available exercises and specifying additional properties. An
assignment has a deadline (optionally a second deadline), a maximum amount of
points, a configuration for calculating the final score, a maximum number of
submissions, and a list of supported runtime environments (e.g., programming
languages) including specific time and memory limits for each one.

Typical use cases for the supported user roles are illustrated in the following
UML diagram:

![System use case diagram](https://github.com/ReCodEx/wiki/raw/master/images/System_use_case.png)

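The group hierarchy and the per-group role override described in this section
could be modelled roughly as follows. This is only a sketch under our own
assumptions: the class names `Group` and `User`, the role strings and the idea
of keying memberships by the group path are hypothetical and do not describe the
actual database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(eq=False)
class Group:
    name: str
    parent: Optional["Group"] = None  # None marks a top-level group

    def path(self) -> str:
        """Return the names from the root down to this group."""
        parts = []
        node: Optional[Group] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " / ".join(reversed(parts))


@dataclass
class User:
    name: str
    default_role: str = "student"
    # Role held in each joined group, keyed by the group path (an assumption).
    group_roles: Dict[str, str] = field(default_factory=dict)

    def join(self, group: Group, role: Optional[str] = None) -> None:
        """Join a group; an explicit role overrides the default one."""
        self.group_roles[group.path()] = role or self.default_role

    def role_in(self, group: Group) -> Optional[str]:
        """Return the role in the group, or None for non-members."""
        return self.group_roles.get(group.path())
```

With the example hierarchy, a user who joins `Labs Monday 10:30` without an
explicit role keeps the default role there, while joining `Labs Thursday 9:00`
with `role="supervisor"` overrides it only in that group.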
#### Exercise evaluation chain

The most important part of the system is the evaluation of solutions submitted
by students. The consecutive steps from source code to final results are
described in more detail below to give readers a solid overview of what has to
happen during the evaluation process.

The first thing users have to do is to submit their solutions through some user
interface. Then, the system checks assignment invariants (deadlines, count of
submissions, ...) and stores the submitted files. The runtime environment is
automatically detected based on the input files and a suitable exercise
configuration variant is chosen (one exercise can have multiple variants, for
example for the C and Java languages). The matching exercise configuration is
then used to drive the evaluation process.

There is a pool of worker computers dedicated to processing jobs. Some of them
may have a different environment to allow testing programs under more
conditions. Incoming jobs are scheduled to a particular worker depending on its
capabilities and the job requirements.

Job processing itself starts with obtaining the source files and the job
configuration. The configuration is parsed into small tasks, each with a simple
piece of work. Evaluation then proceeds in the order of the tasks. It is crucial
to keep the executing computer secure and stable, so an isolated sandboxed
environment is used when dealing with unknown source code. When the execution is
finished, the results are saved.

Results from a worker contain only output data from the processed tasks (this
could be the return value, consumed time, ...). On top of that, one value is
calculated to express the overall quality of the tested job. It is used as
points for the final student grading. The calculation method of this value may
be different for each assignment. Data presented back to users include an
overview of job parts (which succeeded and which failed, optionally with a
reason like "memory limit exceeded") and the achieved score (amount of awarded
points).

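Two parts of the chain above lend themselves to a short illustration: detecting
the runtime environment from the submitted files and reducing the task results
to a single score. The sketch below is an assumption-laden example, not the
actual ReCodEx logic: the extension table `ENVIRONMENTS`, both function names
and the proportional scoring rule are hypothetical.

```python
from pathlib import PurePath
from typing import Dict, Iterable, List, Optional

# Hypothetical mapping from file extension to runtime environment.
ENVIRONMENTS: Dict[str, str] = {".c": "c", ".java": "java", ".py": "python"}


def detect_environment(filenames: Iterable[str]) -> Optional[str]:
    """Return the environment matching the files, or None if it is ambiguous."""
    found = set()
    for name in filenames:
        suffix = PurePath(name).suffix.lower()
        if suffix in ENVIRONMENTS:
            found.add(ENVIRONMENTS[suffix])
    # Exactly one recognized environment is required for automatic detection.
    return found.pop() if len(found) == 1 else None


def score(test_passed: List[bool], max_points: int) -> int:
    """Award points proportionally to the fraction of passed tests."""
    if not test_passed:
        return 0
    return max_points * sum(test_passed) // len(test_passed)
```

The proportional rule is only one possible calculation method; since the method
may differ per assignment, a real system would make it configurable.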
## Requirements

There are a number of different formal requirements for the system. Some of them
are necessary for any system for source code evaluation, some of them are
specific to university deployment and some of them arose during the ten-year
lifetime of the old system. There are not many ways to improve the CodEx
experience from the perspective of a student, but a lot of feature requests come
from administrators and supervisors. The ideas were gathered mostly from our
personal experience with the system and from meetings with faculty staff
involved with the current system.

For clarity, all the requirements and wishes are presented grouped by
categories.

### System features

System features represent functionality directly accessible to users of the
system. They describe the evaluation system in general and also university
add-ons (mostly administrative features).

#### Pure user requirements

- users have their own accounts in the system
- system users can be members of multiple groups (reflecting courses or labs)
- there is a database of exercises; teachers can create exercises including a
  textual description, sample inputs and correct reference outputs (for example
  "sum all numbers from a given file and write the result to the standard
  output")
- there is a list of assigned exercises in each group and an interface to submit
  a solution; teachers can assign an existing exercise to their class with some
  specific properties set (deadlines, etc.)
- users can see a list of submitted solutions for each assignment with the
  corresponding results
- teachers can specify, separately for each assignment, how to compute the
  grading points which will be awarded to students depending on the quality of
  their solutions
- teachers can view detailed data about their students (users of their groups)
  including all submitted solutions; also, each solution can be manually
  reviewed, commented on and assigned additional points (positive or negative)
- one particular solution can be marked as accepted (used for grading this
  assignment); otherwise, the solution with the most points is used
- teachers can edit a student's solution and privately resubmit it, optionally
  saving all results (including temporary ones)
- localization of all texts (UI and exercises)
- Markdown support for creating exercise texts
- tagging exercises in the database and searching by these tags
- comments, comments, comments (exercises, tests, solutions, ...)
- plagiarism detection

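The rule for choosing which solution is graded (an explicitly accepted solution
wins, otherwise the one with the most points) can be sketched as below. The
`Solution` class and the function name are hypothetical, and counting the
manually assigned bonus points towards "most points" is our assumption rather
than something the requirements state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Solution:
    points: int             # points from the automatic evaluation
    bonus: int = 0          # manual bonus points, positive or negative
    accepted: bool = False  # explicitly marked by the teacher

    @property
    def total(self) -> int:
        return self.points + self.bonus


def graded_solution(solutions: List[Solution]) -> Optional[Solution]:
    """Return the solution used for grading, or None if nothing was submitted."""
    for solution in solutions:
        if solution.accepted:
            return solution
    # No accepted solution: fall back to the highest total (assumed to include bonus).
    return max(solutions, key=lambda s: s.total, default=None)
```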
#### Administrative requirements

- users can use an intuitive user interface to interact with the system, mainly
  for viewing assigned exercises, uploading their own solutions to the
  assignments, and viewing the results of the solutions after the automatic
  evaluation is finished; the two wanted interfaces are web-based and
  command-line based
- user privilege separation (at least two roles -- _student_ and _supervisor_)
- logging in through a university authentication system (e.g. LDAP)
- SIS (university information system) integration for fetching personal user
  data
- safe environment in which the students' solutions are executed
- support for multiple programming environments at once to avoid an unacceptable
  workload for the administrator (maintaining a separate installation for every
  course) and high hardware occupation
- advanced low-level evaluation flow configuration with a high-level abstraction
  layer for ordinary configuration cases
- use of modern technologies with state-of-the-art compilers

### Technical details

Technical details are requirements of a technical character with no direct
mapping to visible parts of the system. In an ideal world, users should not know
about them if they work properly, but they would be at least annoyed if these
requirements were not met. Most notably, they are:

- user interface of the system accessible on users' computers without
  installation of any additional software
- easy implementation of different user interfaces
- readiness for a workload of hundreds of students and tens of supervisors
- automated installation of all components
- source code under a permissive license allowing further development; this also
  applies to the libraries and frameworks used
- multi-platform worker supporting at least two major operating systems

### Conclusion

The survey shows that there are a lot of different requirements and wishes for
the new system. When the system is ready, it is likely that there will be new
ideas of how to use it, and thus the system must be designed to be easily
extensible, so that everyone can develop their own feature. This also means that
widely used programming languages and techniques should be used, so users can
quickly understand the code and make changes.

To find out the current state of the field of automatic grading systems, we did
a short market survey of such systems at universities, programming contests, and
possibly other places where similar tools are available.

## Related work

This is not a complete list of available evaluators, but only a few projects
which are used these days and can be an inspiration for our project. Each
project on the list has a brief description and some key features mentioned.

### CodEx

The grading solution currently used at the Faculty of Mathematics and Physics of
the Charles University in Prague was implemented in 2006 by a group of students.
It is called [CodEx -- The Code Examiner](http://codex.ms.mff.cuni.cz/project/)
and it has been used with some improvements since then. The original plan was
to use the system only for basic programming courses, but there was a demand
for adapting it for many different subjects.

CodEx is based on dynamic analysis. It features a web-based interface, where
supervisors can assign exercises to their students and the students have a time
window to submit their solutions. Each solution is compiled and run in a sandbox
(MO-Eval). The metrics which are checked are correctness of the output and time
and memory limits. It supports programs written in C, C++, C#, Java, Pascal,
Python and Haskell.

The current system is old, but robust. There were no major security incidents
during its production usage. However, from today's perspective there are
several drawbacks. The main ones are:

- **web interface** -- The web interface is simple and fully functional. But
  rapid development in web technologies opens new horizons of how a web
  interface can be made.
- **web API** -- CodEx offers a very limited XML API based on outdated
  technologies that is not sufficient for users who would like to create custom
  interfaces such as a command-line tool or a mobile application.
- **sandboxing** -- The MO-Eval sandbox is based on the principle of monitoring
  system calls and blocking the bad ones. This can be easily done for
  single-threaded applications, but proves difficult with multi-threaded ones.
  Nowadays, parallelism is a very important area of computing, so there is a
  requirement to test multi-threaded applications too.
- **instances** -- Different CodEx usage scenarios require separate instances
  (Programming I and II, Java, C#, etc.). This configuration is not user
  friendly (students have to register in each instance separately) and burdens
  administrators with unnecessary work. The CodEx architecture does not allow
  sharing hardware between instances, which results in an inefficient use of
  hardware for evaluation.
- **task extensibility** -- There is a need to test and evaluate complicated
  programs for classes such as Parallel programming or Compiler principles,
  which have a more difficult evaluation chain than the simple
  compilation/execution/evaluation provided by CodEx.

### Progtest

[Progtest](https://progtest.fit.cvut.cz/) is a private project of [FIT
ČVUT](https://fit.cvut.cz) in Prague. As far as we know, it is used for C/C++,
Bash programming and knowledge-based quizzes. There are several bonus points
and penalties and also a few hints about what is failing in the submitted
solution. It is very strict on source code quality, using for example the
`-pedantic` option of GCC, Valgrind for memory leaks, or array boundary checks
via the `mudflap` library.

### Codility

[Codility](https://codility.com/) is a web-based solution primarily targeted at
company recruiters. It is a commercial product available as SaaS and it
supports 16 programming languages. The
[UI](http://1.bp.blogspot.com/-_isqWtuEvvY/U8_SbkUMP-I/AAAAAAAAAL0/Hup_amNYU2s/s1600/cui.png)
of Codility is [open source](https://github.com/Codility/cui); the rest of the
source code is not available. One interesting feature is the 'task timeline' --
the captured progress of writing code for each user.

### CMS

[CMS](http://cms-dev.github.io/index.html) is an open source distributed system
for running and organizing programming contests. It is written in Python and
contains several modules. CMS supports the C/C++, Pascal, Python, PHP, and Java
programming languages. PostgreSQL is a single point of failure, as all modules
heavily depend on the database connection. Task evaluation can only be a
three-step pipeline -- compilation, execution, evaluation. Execution is
performed in [Isolate](https://github.com/ioi/isolate), a sandbox written by the
consultant of our project, Mgr. Martin Mareš, Ph.D.

### MOE

[MOE](http://www.ucw.cz/moe/) is a grading system written in Shell scripts, C
and Python. It does not provide a default GUI; all actions have to be performed
from the command line. The system does not evaluate submissions in real time;
results are computed in batch mode after the exercise deadline, using Isolate
for sandboxing. Parts of MOE are used in other systems like CodEx or CMS, but
the system is generally obsolete.

### Kattis

[Kattis](http://www.kattis.com/) is another SaaS solution. It provides a clean
and functional web UI, but the rest of the application is too simple. A nice
feature is the usage of a [standardized
format](http://www.problemarchive.org/wiki/index.php/Problem_Format) for
exercises. Kattis is primarily used by programming contest organizers, company
recruiters and also some universities.

# Analysis

## ReCodEx goals

@todo: improve and extend this chapter - analysis of user requirements and the
way we solve them; exercise is a template for assignment, users are in groups,
what is a group, how points are assigned for solutions, ...

@todo: merge with next chapter (Solution concept analysis)

None of the existing systems we came across has all the features required of
the new system. There is no grading system which is designed to support a
complicated evaluation pipeline, so this part is an unexplored field and has to
be designed with caution. Also, no project is modern and extensible enough to be
used as a base for ReCodEx. After considering all these facts, it was clear that
a new system had to be written from scratch. This implies that only a subset of
all the features will be implemented in the first version, and the others in the
following ones.

The gathered features are categorized based on their priority for the whole
system. The highest priority is the main functionality similar to the current
CodEx. It is a baseline for being useful in a production environment, and the
new design allows easy further development. On top of that, most of the ideas
from faculty staff belong to the second priority bucket, which will be
implemented as part of the project. The most advanced tasks from this category
are an advanced low-level evaluation configuration format, use of modern tools,
connection to university systems and merging separate system instances into a
single one. Other tasks are scheduled for the next releases after a successful
project defense. Namely, these are a high-level exercise evaluation
configuration with a user-friendly interface for common exercise types, SIS
integration (when an API becomes available on their side) and a command-line
submit tool. Plagiarism detection is not likely to be part of any release in the
near future unless someone else makes the engine. The detection problem is too
hard to be solved as part of this project.

We named the project **ReCodEx -- ReCodEx Code Examiner**. The name should
point to the old CodEx, but also reflect the new approach to solving issues.
**Re** as part of the name means redesigned, rewritten, renewed, or restarted.

At this point there is a clear idea of how the new system will be used and what
the major enhancements for future releases are. With this in mind, the overall
architecture can be sketched. From the previous research, we set up several
goals which the new system should meet. They mostly reflect drawbacks of the
current version of CodEx and some reasonable wishes of university users. The
most notable features are the following:

- modern HTML5 web frontend written in JavaScript using a suitable framework
- REST API implemented in PHP, communicating with a database, an evaluation
  backend and a file server
- evaluation backend implemented as a distributed system on top of a message
  queue framework (ZeroMQ) with master-worker architecture <!-- @todo: what is a
  worker? The concept has not been introduced yet! -->
- worker with basic support of the Windows environment (without sandbox, as no
  suitable general-purpose tool is available yet)
- evaluation procedure configured in a YAML file, composed of small tasks
  connected into an arbitrary directed acyclic graph

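The last goal, small tasks connected into a directed acyclic graph, implies
that a worker needs an execution order in which every task runs after the tasks
it depends on. The following sketch shows one way to obtain such an order with
the standard `graphlib` module (Python 3.9 or newer). The task names and the
dictionary layout are hypothetical; a real job would be loaded from the YAML
configuration, whose format is not shown here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List

# Hypothetical job: each task maps to the list of tasks it depends on.
job: Dict[str, List[str]] = {
    "compile": [],
    "run_test_1": ["compile"],
    "run_test_2": ["compile"],
    "judge_test_1": ["run_test_1"],
    "judge_test_2": ["run_test_2"],
}


def execution_order(tasks: Dict[str, List[str]]) -> List[str]:
    """Return task names so that every task follows all of its dependencies.

    Raises graphlib.CycleError (a ValueError) if the graph is not acyclic.
    """
    return list(TopologicalSorter(tasks).static_order())
```

For the job above, `compile` always comes first, and each `judge_test_N` comes
after its `run_test_N`; the relative order of independent tasks is not fixed.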
## Solution concepts analysis

@todo: what problems were solved on abstract and high levels, how they can be solved and what was the final solution

- which problems are they? ... these ones below:
- what types of users there should be and why they are needed
- explain why there is an exercise and assignment division, what each means and how they are used
- explain instances, why they are useful and what they solve, and also discuss the licenses concept
- groups: they can be public and private and why is that, what it solves;
  explain and discuss threshold and other group features
- extended execution pipeline (not just compilation/execution/evaluation) and why it is needed
- progress state, how it can be done and displayed to the user, why random messages
- how to display all outputs of executed programs to the user (supervisor, student) in general, what students can or cannot see and why
- judges: discuss what they can possibly do and what it can be used for (returning for instance 2 numbers instead of 1, and why we return just one)
- discuss points assigned to a solution, why there are bonus points, explain the minimal point threshold
- discuss several ways how points can be assigned to a solution; propose basic systems but also general systems which can use outputs from judges or other executed programs; there is a need for variables or another concept, explain why
- and many many more general concepts which can be discussed and solved... please append more of them if something comes to your mind... thanks

## Structure of the project
|
|
|
|
|
|
There are numerous ways how to divide some sort of system into separated
|
|
|
services, from one single component to many and many single-purpose components.
|
|
|
Having only one big service is not feasible, not scalable enough and mainly it
|
|
|
would be one big blob of code which somehow works and is very complex, so this
|
|
|
is not the way. The quite opposite, having a lot of single-purpose components is
|
|
|
also somehow impractical. It is scalable by default and all services would have
|
|
|
quite simple code but on the other hand communication requirements for such
|
|
|
solution would be insane. So there has to be chosen approach which is somehow in
|
|
|
the middle, that means services have to communicate in manner which will not
|
|
|
bring network down, code basis should be reasonable and the whole system has to
|
|
|
be scalable enough. With this being said there can be discussion over particular
|
|
|
division for ReCodEx system.
|
|
|
|
|
|
The ReCodEx project is divided into two logical parts – the *backend* and the
|
|
|
*frontend* – which interact which each other and which cover the whole area of
|
|
|
code examination. Both of these logical parts are independent of each other in
|
|
|
the sense of being installed on separate machines at different locations and
|
|
|
that one of the parts can be replaced with a different implementation and as
|
|
|
long as the communication protocols are preserved, the system will continue
|
|
|
working as expected.
|
|
|
|
|
|
*Backend* is the part which is responsible solely for the process of evaluation
|
|
|
a solution of an exercise. Each evaluation of a solution is referred to as a
|
|
|
*job*. For each job, the system expects a configuration document of the job,
|
|
|
supplementary files for the exercise (e.g., test inputs, expected outputs,
|
|
|
predefined header files), and the solution of the exercise (typically source
|
|
|
codes created by a student). There might be some specific requirements for the
|
|
|
job, such as a specific runtime environment, specific version of a compiler or
|
|
|
the job must be evaluated on a processor with a specific number of cores. The
|
|
|
backend infrastructure decides whether it will accept a job or decline it based
|
|
|
on the specified requirements. In case it accepts the job, it will be placed in
|
|
|
a queue and it will be processed as soon as possible. The backend publishes the
|
|
|
progress of processing of the queued jobs and the results of the evaluations can
|
|
|
be queried after the job processing is finished. The backend produces a log of
|
|
|
the evaluation and scores the solution based on the job configuration document.
|
|
|
|
|
|
From the scalable point of view there are two necessary components, the one
|
|
|
which will execute jobs and component which will distribute jobs to the
|
|
|
instances of the first one. This ensures scalability in manner of parallel
|
|
|
execution of numerous jobs which is exactly what is needed. Implementation of
|
|
|
these services are called **broker** and **worker**, first one handles
|
|
|
distribution, latter execution. These components should be enough to fulfill all
|
|
|
above said, but for the sake of simplicity and better communication gateways
|
|
|
with frontend two other components were added, **fileserver** and **monitor**.
|
|
|
Fileserver is simple component whose purpose is to store files which are
|
|
|
exchanged between frontend and backend. Monitor is also quite simple service
|
|
|
which is able to serve job progress state from worker to web application. These
|
|
|
two additional services are on the edge of frontend and backend (like gateways)
|
|
|
but logically they are more connected with backend, so it is considered they
|
|
|
belong there.
|
|
|
|
|
|
*Frontend* on the other hand is responsible for the communication with the users
|
|
|
and provides them a convenient access to the backend infrastructure. The
|
|
|
frontend manages user accounts and gathers them into units called groups. There
|
|
|
is a database of exercises which can be assigned to the groups and the users of
|
|
|
these groups can submit their solutions for these assignments. The frontend will
|
|
|
initiate evaluation of these solutions by the backend and it will store the
|
|
|
results afterwards. The results will be visible to authorized users and the
|
|
|
results will be awarded with points according to the score given by the Backend
|
|
|
in the evaluation process. The supervisors of the groups can edit the parameters
|
|
|
of the assignments, review the solutions and the evaluations in detail and award
|
|
|
the solutions with bonus points (both positive and negative) and discuss
|
|
|
the solution with the author of the solution. Some of the users can be entitled
|
|
|
to create new exercises and extend the database of exercises which can be
|
|
|
assigned to the groups later on.
|
|
|
|
|
|
There are two main purposes of the frontend -- holding the state of the whole system
|
|
|
(database of users, exercises, solutions, points, etc.) and presenting the state
|
|
|
to users through some kind of a user interface (e.g., a web application, mobile
|
|
|
application, or a command-line tool). According to contemporary trends in
|
|
|
development of frontend parts of applications, we decided to split the frontend
|
|
|
in two logical parts -- a server side and a client side. The server side is
|
|
|
responsible for managing the state and the client side gives instructions to the
|
|
|
server side based on the inputs from the user. This decoupling gives us the
|
|
|
ability to create multiple client side tools which may address different needs
|
|
|
of the users.
|
|
|
|
|
|
The frontend developed as part of this project is a web application created with
|
|
|
the needs of the Faculty of Mathematics and Physics of the Charles University in
|
|
|
Prague in mind. The users are the students and their teachers, groups
|
|
|
correspond to the different courses, the teachers are the supervisors of these
|
|
|
groups. We believe that this model is applicable to the needs of other
|
|
|
universities, schools, and IT companies, which can use the same system for their
|
|
|
needs. It is also possible to develop their own frontend with their own user
|
|
|
management system for their specific needs and use the possibilities of the
|
|
|
backend without any changes, as was mentioned in the previous paragraphs.
|
|
|
|
|
|
In the later parts of the documentation, both the backend and frontend parts
|
|
|
will be introduced separately and covered in more detail. The communication
|
|
|
protocol between these two logical parts will be described as well.
|
|
|
|
|
|
|
|
|
### Evaluation unit executed on backend
|
|
|
|
|
|
One of the bigger requests for the new system is to support a complex
|
|
|
configuration of the execution pipeline. The idea comes from lecturers of the Compiler
|
|
|
principles class who want to migrate their semi-manual evaluation process to
|
|
|
CodEx. Unfortunately, CodEx is not capable of such complicated exercise setup.
|
|
|
None of the evaluation systems we found can handle such a task, so a design from
|
|
|
scratch is needed.
|
|
|
|
|
|
There are two main approaches to designing a complex execution configuration. It
|
|
|
can be composed of a small number of relatively big components or of many more small
|
|
|
tasks. Big components are easy to write and the whole configuration is reasonably
|
|
|
small. However, the components are designed for current problems, so this approach is not flexible
|
|
|
enough for comfortable future use. This can be solved by introducing a small set of
|
|
|
single-purpose tasks which can be composed together. The whole configuration is
|
|
|
then considerably bigger, but it adapts well to new conditions and also requires
|
|
|
less work to program the tasks. For better user experience, configuration
|
|
|
generators for some common cases can be introduced.
|
|
|
|
|
|
ReCodEx is meant to be continuously developed and used for many years, so the
|
|
|
smaller tasks are the right choice. Observation of the CodEx system shows that
|
|
|
only a few tasks are needed. In the extreme case, only one task is enough -- execute
|
|
|
a binary. However, for better portability of configurations across different
|
|
|
systems, it is better to implement a reasonable subset of operations directly
|
|
|
without calling system provided binaries. These operations are copy file, create
|
|
|
new directory, extract archive and so on, altogether called internal tasks.
|
|
|
Another benefit of a custom implementation of these tasks is guaranteed safety,
|
|
|
so no sandbox needs to be used, unlike in the case of external tasks.
|
|
|
|
|
|
For a job evaluation, the tasks need to be executed sequentially in a specified
|
|
|
order. The idea of running independent tasks in parallel is bad because exact
|
|
|
time measurement needs a controlled environment on the target computer with
|
|
|
minimal interruption by other processes. It seems that connecting tasks
|
|
|
into a directed acyclic graph (DAG) can handle all possible problem cases. None of
|
|
|
the authors, supervisors and involved faculty staff can think of a problem that
|
|
|
cannot be decomposed into tasks connected in a DAG. The goal of evaluation is
|
|
|
to satisfy as many tasks as possible. During execution there are sometimes
|
|
|
multiple choices of the next task. To control that, each task can have a priority,
|
|
|
which is used as a secondary ordering criterion. For better understanding, here
|
|
|
is a small example.
|
|
|
|
|
|
![Task serialization](https://github.com/ReCodEx/wiki/raw/master/images/Assignment_overview.png)
|
|
|
|
|
|
The _job root_ task is an imaginary single starting point of each job. When the
|
|
|
_CompileA_ task is finished, the _RunAA_ task is started (or _RunAB_, but the choice should
|
|
|
be deterministic by position in configuration file -- tasks stated earlier
|
|
|
should be executed earlier). The task priorities guarantee that after
|
|
|
_CompileA_ task all dependent tasks are executed before _CompileB_ task (they
|
|
|
have higher priority number). For example, this is useful to control which files
|
|
|
are present in a working directory at every moment. To sum up, there are 3
|
|
|
ordering criteria: dependencies, then priorities and finally position of task in
|
|
|
configuration. Together, they define an unambiguous linear ordering of all tasks.
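
The three ordering criteria can be illustrated with a minimal Python sketch. It is not the worker's actual implementation; the dictionary keys `id`, `priority` and `deps`, as well as the concrete priority numbers in the example, are assumptions made for illustration only. Among all tasks whose dependencies are already satisfied, the sketch picks the one with the highest priority number and breaks ties by the position in the configuration.

```python
import heapq


def linearize(tasks):
    """Order tasks by dependencies, then priority, then configuration position.

    `tasks` is a list of dicts in configuration order; each dict has the
    hypothetical keys 'id', 'priority' and 'deps' (ids of prerequisite tasks).
    """
    position = {t["id"]: i for i, t in enumerate(tasks)}
    priority = {t["id"]: t["priority"] for t in tasks}
    waiting_for = {t["id"]: len(t["deps"]) for t in tasks}
    dependents = {t["id"]: [] for t in tasks}
    for t in tasks:
        for dep in t["deps"]:
            dependents[dep].append(t["id"])

    # Heap key: higher priority number first, then earlier position.
    ready = [(-priority[i], position[i], i) for i, n in waiting_for.items() if n == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, _, task_id = heapq.heappop(ready)
        order.append(task_id)
        for child in dependents[task_id]:
            waiting_for[child] -= 1
            if waiting_for[child] == 0:
                heapq.heappush(ready, (-priority[child], position[child], child))

    if len(order) != len(tasks):
        raise ValueError("tasks do not form a directed acyclic graph")
    return order


example = [
    {"id": "CompileA", "priority": 2, "deps": []},
    {"id": "RunAA", "priority": 3, "deps": ["CompileA"]},
    {"id": "RunAB", "priority": 3, "deps": ["CompileA"]},
    {"id": "CompileB", "priority": 1, "deps": []},
    {"id": "RunB", "priority": 3, "deps": ["CompileB"]},
]
print(linearize(example))
```

With these assumed priorities, both tasks depending on _CompileA_ are scheduled before _CompileB_, and _RunAA_ precedes _RunAB_ only because it is stated earlier.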
|
|
|
|
|
|
For grading there are several important tasks. First, tasks executing submitted
|
|
|
code need to be checked for time and memory limits. Second, outputs of judging
|
|
|
tasks need to be checked for correctness (represented by return value or by data
|
|
|
on standard output) and should not fail on time or memory limits. This division
|
|
|
is transparent for the backend; each task is executed the same way. But the frontend
|
|
|
must know which tasks from the whole job are important and what is their kind. It is
|
|
|
reasonable to keep this piece of information alongside the tasks in job
|
|
|
configuration, so each task can have a label about its purpose. Unlabeled tasks
|
|
|
have an internal type _inner_. There are four categories of tasks:
|
|
|
|
|
|
- _initiation_ -- setting up the environment, compiling code, etc.; for users
|
|
|
failure means an error in their sources which makes it impossible to run them
|
|
|
with examination data
|
|
|
- _execution_ -- running the user code with examination data, must not exceed
|
|
|
time and memory limits; for users failure means wrong design, slow data
|
|
|
structures, etc.
|
|
|
- _evaluation_ -- comparing user and examination outputs; for user failure means
|
|
|
that the program does not compute the right results
|
|
|
- _inner_ -- no special meaning for frontend, technical tasks for fetching and
|
|
|
copying files, creating directories, etc.
|
|
|
|
|
|
Each job is composed of multiple tasks of these types which are semantically
|
|
|
grouped into tests. A test can represent one set of examination data for user
|
|
|
code. To mark the grouping, another task label can be used. Each test must have
|
|
|
exactly one _evaluation_ task (to show success or failure to users) and
|
|
|
an arbitrary number of tasks of other types.
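
The rule about tests can be checked mechanically. The following Python sketch assumes that each task is a dictionary with the hypothetical keys `type` (the purpose label) and `test-id` (the grouping label); these names are illustrative and are not taken from the configuration format itself. Tasks without a type label are treated as _inner_.

```python
from collections import Counter

TASK_TYPES = {"initiation", "execution", "evaluation", "inner"}


def check_tests(tasks):
    """Return a list of problems found in the task labels (empty if none)."""
    problems = []
    tests = set()
    evaluations = Counter()
    for task in tasks:
        task_type = task.get("type", "inner")  # unlabeled tasks are inner
        if task_type not in TASK_TYPES:
            problems.append(f"task '{task['id']}' has unknown type '{task_type}'")
        test_id = task.get("test-id")
        if test_id is None:
            continue  # task does not belong to any test
        tests.add(test_id)
        if task_type == "evaluation":
            evaluations[test_id] += 1
    for test_id in sorted(tests):
        if evaluations[test_id] != 1:
            problems.append(
                f"test '{test_id}' has {evaluations[test_id]} evaluation tasks, "
                "expected exactly 1"
            )
    return problems


job = [
    {"id": "fetch-input", "test-id": "test1"},
    {"id": "run", "type": "execution", "test-id": "test1"},
    {"id": "judge", "type": "evaluation", "test-id": "test1"},
    {"id": "run2", "type": "execution", "test-id": "test2"},
]
print(check_tests(job))
```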
|
|
|
|
|
|
|
|
|
## Implementation analysis
|
|
|
|
|
|
When developing a project like ReCodEx there has to be some discussion over
|
|
|
implementation details and how to solve some particular problems properly.
|
|
|
This discussion is a never-ending story which continues through the whole
|
|
|
development process. Some of the most important implementation problems or
|
|
|
interesting observations will be discussed in this chapter.
|
|
|
|
|
|
### Backend communication
|
|
|
|
|
|
@todo: what type of communication within backend could be used, mention some frameworks, queue managers, protocols, which was considered
|
|
|
|
|
|
### Broker
|
|
|
|
|
|
@todo: assigning of jobs to workers, which are possible algorithms, queues, which one was chosen
|
|
|
|
|
|
@todo: how can jobs be sent over zeromq, mainly mention that files can be transported, but it is not feasible
|
|
|
|
|
|
@todo: making action and reaction over zeromq more general and easily extensible, mention reactor and why is needed and what it solves
|
|
|
|
|
|
### Worker
|
|
|
|
|
|
The worker is a component which is supposed to execute incoming jobs from the broker. As
|
|
|
such, the worker should work on and support a wide range of different infrastructures and
|
|
|
maybe even platforms/operating systems. Support of at least two main operating
|
|
|
systems is desirable and should be implemented. The worker as a service does not
|
|
|
have to be very complicated, but some complex behaviour is needed. The mentioned
|
|
|
complexity almost exclusively concerns robust communication with
|
|
|
the broker, which has to be checked regularly. A ping mechanism is usually used for
|
|
|
this in all kinds of projects. This means that the worker should be able to send ping
|
|
|
messages even during execution. So the worker has to be divided into two separate
|
|
|
parts, one which handles communication with the broker and another which
|
|
|
will execute jobs. The easiest solution is to have these parts in separate
|
|
|
threads which communicate tightly with each other. For in-process
|
|
|
communication, numerous technologies can be used, from shared memory to
|
|
|
condition variables or some kind of in-process messages. The already used library
|
|
|
ZeroMQ can provide in-process messages working on the same principles
|
|
|
as network communication, which is quite handy and solves problems with thread
|
|
|
synchronization and such.
|
|
|
|
|
|
At this point we have a worker with two internal parts, a listening one and an executing one. The implementation of the first one is quite straightforward and clear, so let us discuss what should happen in the execution subsystem. Jobs as work units can vary a lot and do completely different things, which means that both the configuration and the worker have to be prepared for this kind of generality. The configuration and its design were already discussed above; the implementation in the worker is then also quite straightforward. The worker has internal structures into which it loads and in which it stores the metadata given in the configuration. The whole job is mapped to a job metadata structure and tasks are mapped to either external or internal ones (internal commands have to be defined within the worker); the two kinds differ in whether they are executed in a sandbox or as internal worker commands.
|
|
|
|
|
|
Another division of tasks is by the task-type field in the configuration. This field can have four values: initiation, execution, evaluation and inner. All of them were discussed and described above in the configuration analysis. What is important to the worker is how to behave if the execution of a task of some particular type fails. There are two possible situations: the execution fails due to a bad user solution or due to some internal error. If the execution fails on an internal error, the solution cannot be declared as failed overall. Users should not be punished for a bad configuration or a network error. This is where task types are useful. Generally, initiation, execution and evaluation are tasks which in some way execute code submitted by the users as their solution of an exercise. If tasks of these kinds fail, it is probably connected with a bad user solution and the job can be evaluated. But if some inner task fails, the solution should be re-executed, in the best case on a different worker. That is why, if an inner task fails, the job is sent back to the broker, which will reassign it to another worker. More on this subject should be discussed in the section on broker assigning algorithms.
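
The decision described above fits into a few lines. This Python sketch only illustrates the rule; the function name and the returned strings are hypothetical and do not correspond to actual worker or broker messages.

```python
# Task types which run code derived from the user's submission.
USER_CODE_TYPES = {"initiation", "execution", "evaluation"}


def on_task_failure(task_type="inner"):
    """Decide what a failed task means for the whole job."""
    if task_type in USER_CODE_TYPES:
        # Most likely caused by the submitted solution: the job can be
        # evaluated and the failure reported to the user.
        return "evaluate-as-failed"
    if task_type == "inner":
        # Internal error (configuration, network, ...): the user must not
        # be punished, so the job goes back to the broker.
        return "reassign-job"
    raise ValueError(f"unknown task type: {task_type}")


print(on_task_failure("execution"))
print(on_task_failure())
```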
|
|
|
|
|
|
There is also the question of the working directory or directories of a job: which directories should be used and what for. One simple answer is that every job has only one specified directory which contains every file the worker works with in the scope of the whole job execution. This is of course impractical; there has to be some logical division. The least that must be done is to have two folders, one for internal temporary files and a second one for evaluation. The directory for temporary files is enough to cover all kinds of internal work with the filesystem, but only one directory for the whole evaluation is not enough. Users' solutions are downloaded in the form of zip archives, so why should these be present during execution, and why should the results and files which are to be uploaded back to the fileserver be cherry-picked from one big directory? The answer is of course another logical division into subfolders. The solution chosen in the end is to have folders for the downloaded archive, the decompressed solution, the evaluation directory in which the user solution is executed, and then folders for temporary files and for results, that is, generally for files which should be uploaded back to the fileserver with the solution results. Of course, there has to be a hierarchy which separates the folders of different workers on the same machine. That is why paths to directories have the format `${DEFAULT}/${FOLDER}/${WORKER_ID}/${JOB_ID}`, where default means the default working directory of the whole worker and folder is a particular directory for some purpose (archives, evaluation, ...). The mentioned division of job directories proved to be flexible and detailed enough; everything is in logical units and where it is supposed to be, which means that searching through this system should be easy. In addition, if users' solutions have access only to the evaluation directory, then they do not have access to unnecessary files, which is better for the overall security of the whole ReCodEx.
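
As an illustration of the path format, the following Python sketch builds the set of job directories for one worker and one job. The folder names in `FOLDERS` and the example default directory are assumptions chosen for readability; the real worker may name them differently.

```python
from pathlib import PurePosixPath

# Hypothetical folder names, one per purpose named in the text.
FOLDERS = ("archives", "submission", "eval", "temp", "results")


def job_directories(default, worker_id, job_id):
    """Return {folder: path} following DEFAULT/FOLDER/WORKER_ID/JOB_ID."""
    base = PurePosixPath(default)
    return {
        folder: base / folder / str(worker_id) / str(job_id)
        for folder in FOLDERS
    }


dirs = job_directories("/var/recodex-worker", 1, "job42")
print(dirs["eval"])
```

Because the worker ID precedes the job ID, several workers on one machine can share the same default directory without their files colliding.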
|
|
|
|
|
|
As we discovered above, the worker has job directories, but users who are writing and
|
|
|
managing job configurations do not know where they are (on some particular
|
|
|
worker) and how they can be accessed and written into the configuration. For this
|
|
|
kind of task we have to introduce some kind of marks or signs which will
|
|
|
represent particular folders. Marks or signs can have the form of some kind of
|
|
|
special strings which can be called variables. These variables then can be used
|
|
|
wherever filesystem paths are used within the configuration file. This will
|
|
|
solve the problem with a specific worker environment and a specific hierarchy of
|
|
|
directories. The final form of variables is `${...}`, where the triple dot is a textual
|
|
|
description. This format was chosen because the dollar sign is a special character which
|
|
|
rarely appears in filesystem paths; the braces only delimit the textual
|
|
|
description of the variable.
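
Expanding such variables is a simple textual substitution. The sketch below is a minimal Python illustration; the variable name `EVAL_DIR` and the directory it maps to are hypothetical, and unknown variables are treated as errors here, which is a design choice of this example rather than documented behaviour.

```python
import re

# Matches ${NAME}, where NAME is any text without a closing brace.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def expand(path, variables):
    """Replace every ${NAME} in `path` by variables[NAME]."""
    def replace(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"unknown variable: {name}")
        return variables[name]

    return _VARIABLE.sub(replace, path)


worker_variables = {"EVAL_DIR": "/var/recodex-worker/eval/1/job42"}
print(expand("${EVAL_DIR}/input.txt", worker_variables))
```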
|
|
|
|
|
|
#### Evaluation
|
|
|
|
|
|
After a job arrives successfully, the worker has to prepare a new execution environment; then the solution archive has to be downloaded from the fileserver and extracted. The job configuration is located within these files; it is loaded into internal structures and executed. After that, the results are uploaded back to the fileserver. These basic steps are really necessary for the whole execution and have to be executed in this precise order.
|
|
|
|
|
|
An interesting problem is posed by supplementary files (inputs, sample outputs). There are two possible approaches: supplementary files can be downloaded either at the start of the execution or during the execution. If the files are downloaded at the beginning, the execution has not really started at this point, and if there are problems with the network, the worker finds out right away and can abort the execution without executing a single task. Slight problems can arise if some of the files need to have the same name (e.g. the solution assumes that the input is `input.txt`); in this scenario, the downloaded files cannot be renamed at the beginning but only during execution, which is impractical and not easily observed. The second solution, in which files are downloaded on the fly, has quite the opposite problem: if there are problems with the network, the worker will find out during execution, when for instance almost the whole execution is done. This is also not an ideal solution if we care about wasted hardware resources. On the other hand, with this approach users have quite advanced control of the execution flow and know exactly which files are available during execution, which is probably more appealing from the users' perspective than the first solution. Based on that, downloading of supplementary files using 'fetch' tasks during execution was chosen and implemented.
|
|
|
|
|
|
#### Caching mechanism
|
|
|
|
|
|
As described in the fileserver section, stored supplementary files have special
|
|
|
filenames which reflect hashes of their content. As such, there are no
|
|
|
duplicates stored in the fileserver. The worker can use this feature too and cache these
|
|
|
files for a while, which saves precious bandwidth. This means there has to be
|
|
|
a system which can download a file, store it in the cache and after some time of
|
|
|
inactivity delete it. Because there can be multiple worker instances on some
|
|
|
particular server it is not efficient to have this system in every worker on its
|
|
|
own. So it is preferable to have this feature shared among all workers on
|
|
|
the same machine. One solution may again be a separate service connected
|
|
|
with the workers through the network, which would provide such functionality, but this
|
|
|
would mean another component with its own communication for a purpose where it is not
|
|
|
exactly needed. More importantly, it would be a single point of failure; if it stopped
|
|
|
working, it would be quite a problem. So another solution was chosen, which assumes
|
|
|
the worker has access to a specified cache folder. To this folder the worker can download
|
|
|
supplementary files and copy them from here. This means every worker has the
|
|
|
possibility to maintain downloads to the cache, but what the worker is not able to
|
|
|
do properly is delete unused files after some time. For that, a single-purpose
|
|
|
component is introduced which is called 'cleaner'. It is a simple script executed
|
|
|
by cron which is able to delete files which were unused for some time.
|
|
|
Together with the worker fetching feature, the cleaner completes the machine-specific caching
|
|
|
system.
|
|
|
|
|
|
As mentioned, the cleaner is a simple script which is executed regularly as a cron job. Given the caching system introduced in the paragraph above, there are few possibilities for how the cleaner can be implemented. Filesystems usually support two particular timestamps, `last access time` and `last modification time`. Files in the cache are downloaded once and then just copied; this means that the last modification time is set only once, on creation of the file, while the last access time should be updated on every copy. This implies that the last access time is what is needed here. However, while the last modification time is widely used by operating systems, the last access time is often not maintained by default. More on this subject can be found [here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime). For the cleaner to work properly, the filesystem used by the worker for caching has to have last access time updates enabled.
|
|
|
|
|
|
Having the cleaner as a separate component while the caching itself is handled in the worker is
|
|
|
somewhat unclear, and it is not obvious that it works without any race
|
|
|
conditions. The goal here is not to have a system without races but to have a system
|
|
|
which can recover from them. The implementation of the caching system is based upon
|
|
|
atomic operations of the underlying filesystem. A description of one possible
|
|
|
robust implementation follows. We start with the worker implementation:
|
|
|
|
|
|
- worker discovers fetch task which should download supplementary file
|
|
|
- worker takes name of file and tries to copy it from cache folder to its
|
|
|
working folder
|
|
|
- if successful then last access time should be rewritten (by filesystem
|
|
|
itself) and whole operation is done
|
|
|
- if not successful then file has to be downloaded
|
|
|
- file is downloaded from fileserver to working folder
|
|
|
- downloaded file is then copied to cache
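
The worker side of the steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the worker's real code: `download` is a hypothetical callable standing in for the fileserver client, and copying into the cache through a temporary file followed by `os.replace` is one way to rely on an atomic filesystem operation, chosen here for the example.

```python
import os
import shutil
import tempfile


def fetch(name, cache_dir, work_dir, download):
    """Get file `name` into `work_dir`, preferring the shared cache.

    Returns "cache" or "download" depending on where the file came from.
    """
    target = os.path.join(work_dir, name)
    cached = os.path.join(cache_dir, name)
    try:
        # Reading the cached file lets the filesystem update its access time.
        shutil.copyfile(cached, target)
        return "cache"
    except OSError:
        pass  # not cached (or deleted by the cleaner in the meantime)

    download(name, target)  # hypothetical fileserver client

    try:
        # Copy under a temporary name and rename, so that other workers
        # never observe a partially written cache entry.
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
        os.close(fd)
        shutil.copyfile(target, tmp_path)
        os.replace(tmp_path, cached)
    except OSError:
        pass  # the cache is best effort; the job already has its file
    return "download"
```

If two workers download the same file concurrently, both renames succeed and the later one simply overwrites an identical file, since cache names reflect content hashes.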
|
|
|
|
|
|
The previous procedure runs only within the worker; the cleaner can intervene at any time and
|
|
|
delete files. Implementation in cleaner follows:
|
|
|
|
|
|
- cleaner on its start stores current reference timestamp which will be used for
|
|
|
comparison and load configuration values of caching folder and maximal file
|
|
|
age
|
|
|
- there is a loop going through all files and even directories in specified
|
|
|
cache folder
|
|
|
- last access time of file or folder is detected
|
|
|
- last access time is subtracted from reference timestamp into difference
|
|
|
- difference is compared against specified maximal file age, if difference
|
|
|
is greater, file or folder is deleted
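
The cleaner loop can be sketched in a similar way. Again, this Python code is only an illustration of the listed steps; the function signature is an assumption, and the optional `now` parameter exists only to make the example testable.

```python
import os
import shutil
import time


def clean(cache_dir, max_age_seconds, now=None):
    """Delete cache entries whose last access is older than the maximal age.

    Returns the names of the removed entries.
    """
    # Reference timestamp taken once, at the start of the run.
    reference = time.time() if now is None else now
    with os.scandir(cache_dir) as iterator:
        entries = list(iterator)

    removed = []
    for entry in entries:
        try:
            last_access = entry.stat(follow_symlinks=False).st_atime
            if reference - last_access > max_age_seconds:
                if entry.is_dir(follow_symlinks=False):
                    shutil.rmtree(entry.path)
                else:
                    os.remove(entry.path)
                removed.append(entry.name)
        except OSError:
            # The entry vanished or cannot be removed; skip it and let
            # the next cron run try again.
            continue
    return removed
```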
|
|
|
|
|
|
The previous description implies that there is a gap between the detection of the last access time and the deletion of the file within the cleaner. In this gap, a worker can access the file, which is then deleted anyway, but this is fine: the file is deleted, but the worker already has its copy. Another problem could arise when two workers download the same file, but this is also not an issue, because the file is first downloaded to the working folder and only after that copied to the cache. And even if something else unexpectedly fails and the fetch task fails during execution because of that, it should still be fine, because fetch tasks should have the 'inner' task type, which implies that a failure in this task stops the whole execution and the job is reassigned to another worker. This should serve as the last resort in case everything else goes wrong.
|
|
|
|
|
|
#### Sandboxing
|
|
|
|
|
|
There are numerous ways to approach sandboxing on different platforms;
|
|
|
describing all possible approaches is out of the scope of this document. Instead,
|
|
|
let us have a look at some of the features which are certainly needed for ReCodEx
|
|
|
and propose some particular sandbox implementations on Linux or Windows.
|
|
|
|
|
|
The general purpose of a sandbox is to safely execute software in any form, from scripts to binaries. Various sandboxes differ in how safe they are and what limiting features they have. Ideally, a sandbox has numerous options and corresponding features which allow administrators to set up the environment as they like and which do not allow user programs to damage the executing machine in any way.
|
|
|
|
|
|
For ReCodEx and its evaluation, at least these features are needed: execution time and memory limits, a limit on disk operations, disk accessibility restrictions and network restrictions. All these features, if combined and implemented well, give a pretty safe sandbox which can be used for all kinds of user solutions and should be able to restrict and stop any standard kind of attack or error.
|
|
|
|
|
|
Linux systems have quite extensive support for sandboxing in the kernel: kernel namespaces and cgroups were introduced and implemented, which combined can limit hardware resources (CPU, memory) and separate the executing program into its own namespace (PID, network). These two features satisfy the sandbox requirements of ReCodEx, so there were two options: either find an existing solution or implement a new one. Luckily, an existing solution was found and its name is **isolate**. Isolate does not use all possible kernel features but only a subset, which is still enough for ReCodEx.
|
|
|
|
|
|
The situation in the Windows world is the opposite: there is limited support in its kernel, which makes sandboxing a bit trickier. The Windows kernel only has ways to restrict the privileges of a process through restriction of internal access tokens. Monitoring of hardware resources is not possible, but used resources can be obtained through newly created job objects. However, finding a sandbox which can do everything needed for ReCodEx seems to be impossible. There are numerous sandboxes for Windows, but they all focus on different things; in many cases they serve as a safe environment for malicious programs, viruses in particular, or they are designed as a separate filesystem namespace for installing many temporarily used programs. Among these we can mention Sandboxie, Comodo Internet Security, Cuckoo sandbox and many others. None of them is suitable as a sandbox solution for ReCodEx. With this being said, we can safely state that designing and implementing a new general sandbox for Windows is out of the scope of this project.
|
|
|
|
|
|
A new general sandbox for Windows is out of the question, but what about a more specialized solution used, for instance, only for C#? The CLR as a virtual machine and runtime environment has pretty good security support for restrictions and separation, which is also available to C# programs. This makes it quite easy to implement a simple sandbox within C#, but surprisingly no well-known general-purpose implementations can be found. As said in the previous paragraph, implementing our own solution is out of the scope of this project; there is simply not enough time. But a C# sandbox is quite a good topic for another project, for example a term project for a C# course, so it might be written and integrated in the future.
|
|
|
|
|
|
### Fileserver
|
|
|
|
|
|
@todo: fileserver and why is separated
|
|
|
|
|
|
@todo: mention hashing on fileserver and why this approach was chosen
|
|
|
|
|
|
@todo: what can be stored on fileserver
|
|
|
|
|
|
@todo: how can jobs be stored on fileserver, mainly mention that it is nonsense to store inputs and outputs within job archive
|
|
|
|
|
|
### Monitor
|
|
|
|
|
|
Users want to view real-time evaluation progress of their solution. It can be
|
|
|
easily done with an established bidirectional connection stream, but it is hard to
|
|
|
achieve with web technologies. The HTTP protocol works differently, on a per-request
|
|
|
basis with no long-term connection. However, there is a widely used
|
|
|
technology which solves this problem, the WebSocket protocol.
|
|
|
|
|
|
Working with the WebSocket protocol from the backend is possible, but not ideal from
|
|
|
a design point of view. The backend should be hidden from the public internet to minimize
|
|
|
the surface for possible attacks. With this in mind, there are two possible options:
|
|
|
|
|
|
- send progress messages through the API
|
|
|
- make a separate component for progress messages
|
|
|
|
|
|
Each of the two possibilities has some pros and cons. The first one is good
|
|
|
because there is no additional component and the API is already publicly visible. On
|
|
|
the other hand, working with the WebSocket protocol from PHP is not very pleasant
|
|
|
(but it is possible) and embedding this functionality into the API is not
|
|
|
extensible. The second approach is better for future changes of the protocol or
|
|
|
implementing extensions like caching of messages. Also, the progress feature is
|
|
|
considered only optional, because there may be clients for which this feature is
|
|
|
useless. A major drawback of a separate component is that it is another part which needs to
|
|
|
be publicly exposed.
|
|
|
|
|
|
We decided to make a separate component, mainly because it is a smaller component
|
|
|
with only one role, better maintainability and optional demands for progress
|
|
|
callback.
|
|
|
|
|
|
There are several options for how to write the component. Notably, the considered
|
|
|
options were the already used languages C++, PHP, JavaScript and Python. In the end,
|
|
|
the Python language was chosen for its simplicity, great support for all used
|
|
|
technologies, and also because there are available Python developers in our team. Next,
|
|
|
the responsibility of this component had to be determined. The concept of message flow is shown in
|
|
|
the following picture.
|
|
|
|
|
|
![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
|
|
|
|
|
|
The message channel entering the monitor uses ZeroMQ, the main messaging framework
|
|
|
used by the backend. This decision means the rest of the backend deals only with the already used
|
|
|
communication protocol and related libraries. The output channel is WebSocket, a
|
|
|
protocol for sending messages to web browsers. In Python, there are several
|
|
|
WebSocket libraries. The most popular one is `websockets` in cooperation with
|
|
|
`asyncio`. This combination is easy to use and well documented, so it is used in
|
|
|
monitor component too. For ZeroMQ, there is the `zmq` library with bindings to the
|
|
|
framework core in C++.
|
|
|
|
|
|
Incoming messages are cached for a short period of time. Early testing showed
|
|
|
that the backend can start sending progress messages before the client connects to
|
|
|
the monitor. To solve this, messages for each job are held for 5 minutes after
|
|
|
reception of the last message. The client gets all already received messages at the time
|
|
|
of connection with no message loss.
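
The caching behaviour can be sketched as a small Python class. This is an illustrative model only and not the monitor's actual code; the class name, method names and the injectable `clock` are assumptions made for this example.

```python
import time


class MessageCache:
    """Keep progress messages per job until `ttl` seconds after the last one."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._messages = {}      # job id -> list of messages in arrival order
        self._last_seen = {}     # job id -> time of the last received message

    def add(self, job_id, message):
        """Store a message received from the backend."""
        self._messages.setdefault(job_id, []).append(message)
        self._last_seen[job_id] = self._clock()

    def replay(self, job_id):
        """Return all messages received so far for a newly connected client."""
        return list(self._messages.get(job_id, []))

    def purge(self):
        """Drop jobs whose last message is older than the time to live."""
        now = self._clock()
        expired = [j for j, t in self._last_seen.items() if now - t > self._ttl]
        for job_id in expired:
            del self._messages[job_id]
            del self._last_seen[job_id]
```

A client which connects late calls `replay` once and then continues with live messages, so nothing sent before the connection is lost as long as it arrives within the five-minute window.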
|
|
|
|
|
|
|
|
|
### Frontend communication
|
|
|
|
|
|
The first thing we need to address is the communication protocol of this
|
|
|
client-server architecture. There are several options:
|
|
|
|
|
|
- *TCP sockets* -- TCP sockets give a reliable means of a full-duplex
|
|
|
communication. All major operating systems support this protocol and there are
|
|
|
libraries which simplify the implementation. On the other hand, it is not
|
|
|
possible to initiate a TCP socket from a web browser.
|
|
|
- *WebSockets* -- The WebSocket standard is built on top of TCP. It enables a
|
|
|
web browser to connect to a server over a TCP socket. WebSockets are
|
|
|
implemented in recent versions of all modern web browsers and there are
|
|
|
libraries for several programming languages like Python or JavaScript (running
|
|
|
in Node.js). Encryption of the communication over a WebSocket is supported as
|
|
|
a standard.
|
|
|
- *HTTP protocol* -- The HTTP protocol is a state-less protocol implemented on
|
|
|
top of the TCP protocol. The communication between the client and server
|
|
|
consists of requests sent by the client and responses to these requests sent
|
|
|
back by the server. The client can send as many requests as needed and it may
|
|
|
ignore the responses from the server, but the server must respond only to the
|
|
|
requests of the client and it cannot initiate communication on its own.
|
|
|
End-to-end encryption can be achieved easily using SSL (HTTPS).
|
|
|
|
|
|
We chose the HTTP(S) protocol because of the simple implementation in all sorts
|
|
|
of operating systems and runtime environments on both the client and the server
|
|
|
side.
|
|
|
|
|
|
The API of the server should expose basic CRUD (Create, Read, Update, Delete)
|
|
|
operations. There are some options on what kind of messages to send over the
|
|
|
HTTP:
|
|
|
|
|
|
- SOAP -- a protocol for exchanging XML messages. It is very robust and complex.
|
|
|
- REST -- is a stateless architectural style, not a protocol or a technology. It
|
|
|
relies on HTTP (but not necessarily) and its method verbs (e.g., GET, POST,
|
|
|
PUT, DELETE). It can fully implement the CRUD operations.
|
|
|
|
|
|
Even though there are some other technologies we chose the REST style over the
|
|
|
HTTP protocol. It is widely used, there are many tools available for
|
|
|
development and testing, and it is understood by programmers so it should be
|
|
|
easy for a new developer with some experience in client-side applications to get
|
|
|
to know the ReCodEx API and develop a client application.
|
|
|
|
|
|
### API server
|
|
|
|
|
|
The API server must handle HTTP requests and manage the state of the application
|
|
|
in some kind of a database. It must also be able to communicate with the
|
|
|
backend over ZeroMQ.
|
|
|
|
|
|
We considered several technologies which could be used:
|
|
|
|
|
|
- PHP + Apache -- one of the most widely used technologies for creating web
|
|
|
servers. It is a suitable technology for this kind of a project. It has all
|
|
|
the features we need when some additional extensions are installed (to support
|
|
|
LDAP or ZeroMQ).
|
|
|
- ASP.NET (C#), JSP (Java) -- these technologies are very robust and are used to
|
|
|
create server applications in many big enterprises. Both can run on Windows and
|
|
|
Linux servers (ASP.NET using the .NET Core).
|
|
|
- JavaScript (Node.js) -- it is a fairly new technology and it is being used to
|
|
|
create REST APIs lately. Applications running on Node.js are quite performant
|
|
|
and the number of open-source libraries available on the Internet is
|
|
|
huge.
|
|
|
|
|
|
We chose PHP and Apache mainly because we were familiar with these technologies
|
|
|
and we were able to develop all the features we needed without learning to use a
|
|
|
new technology. This mattered because the number of features was quite high and we had to meet a
|
|
|
strict deadline. This does not mean that we would find all the other
|
|
|
technologies superior to PHP in all other aspects - PHP 7 is a mature language
|
|
|
with a huge community and a wide range of tools, libraries, and frameworks.
|
|
|
|
|
|
We decided to use an ORM framework to manage the database, namely the widely
|
|
|
used PHP ORM Doctrine 2. This framework has a robust abstraction layer DBAL so
|
|
|
the database engine is not very important and it can be changed without any need
|
|
|
for changing the code. We chose the open-source database MariaDB.
|
|
|
|
|
|
To speed up the development process of the PHP server application we decided to
|
|
|
use an MVC framework. After evaluating and trying several frameworks, such as
|
|
|
Lumen, Laravel, and Symfony, we ended up using the framework Nette. This
|
|
|
framework is very common in the Czech Republic -- its main developer is a
|
|
|
well-known Czech programmer David Grudl -- and we were already familiar with the
|
|
|
patterns used in this framework (e.g., dependency injection, authentication,
|
|
|
routing). There is a good extension for the Nette framework which makes usage of
|
|
|
Doctrine 2 very straightforward.
|
|
|
|
|
|
@todo: what database can be used, how it is mapped and used within code
|
|
|
|
|
|
@todo: authentication, some possibilities and describe used jwt
|
|
|
|
|
|
@todo: solution of forgotten password, why this in particular
|
|
|
|
|
|
@todo: rest api is used for report of backend state and errors, describe why and other possibilities (separate component)
|
|
|
|
|
|
@todo: what files are stored in api, why there are duplicates among api and fileserver
|
|
|
|
|
|
@todo: why are there instances and for which they can be used for, describe licences and its implementation
|
|
|
|
|
|
@todo: groups and hierarchy, describe arbitrary nesting which should be possible within instance and how it is implemented and how it could be implemented
|
|
|
|
|
|
@todo: where is stored which workers can be used by supervisors and which runtimes are available, describe possibilities and why is not implemented automatic solution
|
|
|
|
|
|
@todo: on demand loading of students submission, in-time loading of every other submission, why
|
|
|
|
|
|
### Web-app
|
|
|
|
|
|
@todo: what technologies can be used on client frontend side, why react was used
|
|
|
|
|
|
@todo: please think about more stuff about api and web-app... thanks ;-)
|
|
|
|
|
|
|
|
|
|
|
|
# The Backend
|
|
|
|
|
|
The backend is the part which is hidden from the user and which has only
|
|
|
one purpose: to evaluate users’ solutions to their assignments.
|
|
|
|
|
|
@todo: describe the configuration inputs of the Backend
|
|
|
|
|
|
@todo: describe the outputs of the Backend
|
|
|
|
|
|
@todo: describe how the backend receives the inputs and how it
|
|
|
communicates the results
|
|
|
|
|
|
## Components
|
|
|
|
|
|
The whole backend is not just one service or component; it is quite a complex system on its own.
|
|
|
|
|
|
@todo: describe the inner parts of the Backend (and refer to the Wiki
|
|
|
for the technical description of the components)
|
|
|
|
|
|
### Broker
|
|
|
|
|
|
@todo: gets stuff done, single point of failure and center point of ReCodEx universe
|
|
|
|
|
|
### Fileserver
|
|
|
|
|
|
@todo: stores particular data from frontend and backend, hashing, HTTP API
|
|
|
|
|
|
### Worker
|
|
|
|
|
|
@todo: describe a bit of internal structure in general
|
|
|
|
|
|
@todo: describe how jobs are generally executed
|
|
|
|
|
|
### Monitor
|
|
|
|
|
|
@todo: optional component which can be omitted, proxy-like service
|
|
|
|
|
|
## Backend internal communication
|
|
|
|
|
|
@todo: internal backend communication, what communicates with what and why
|
|
|
|
|
|
The Frontend
|
|
|
============
|
|
|
|
|
|
The frontend is the part which is visible to the user of ReCodEx and
|
|
|
which holds the state of the system – the user accounts, their roles in
|
|
|
the system, the database of exercises, the assignments of these
|
|
|
exercises to groups of users (i.e., students), and the solutions and
|
|
|
evaluations of them.
|
|
|
|
|
|
The frontend is split into three parts:
|
|
|
|
|
|
- the server-side REST API (“API”) which holds the business logic and
|
|
|
keeps the state of the system consistent
|
|
|
|
|
|
- the relational database (“DB”) which persists the state of the
|
|
|
system
|
|
|
|
|
|
- the client-side application (“client”) which simplifies access to
|
|
|
the API for the common users
|
|
|
|
|
|
The centerpiece of this architecture is the API. This component receives
|
|
|
requests from the users and from the Backend, validates them and
|
|
|
modifies the state of the system and persists this modified state in the
|
|
|
DB.
|
|
|
|
|
|
We have created a web application which can communicate with the API
|
|
|
server and present the information received from the server to the user
|
|
|
in a convenient way. The client can, however, be any application that can
|
|
|
send HTTP requests and receive the HTTP responses. Users can use general
|
|
|
applications like [cURL](https://github.com/curl/curl/),
|
|
|
[Postman](https://www.getpostman.com/), or create their own specific
|
|
|
client for ReCodEx API.
|
|
|
|
|
|
Frontend capabilities
|
|
|
---------------------
|
|
|
|
|
|
@todo: describe what the frontend is capable of and how it really works,
|
|
|
what are the limitations and how it can be extended
|
|
|
|
|
|
Terminology
|
|
|
-----------
|
|
|
|
|
|
This project was created for the needs of a university and this fact is
|
|
|
reflected in the terminology used throughout the Frontend. A list of
|
|
|
important terms’ definitions follows to make the meaning unambiguous.
|
|
|
|
|
|
### User and user roles
|
|
|
|
|
|
*User* is a person who uses the application. User is granted access to
|
|
|
the application once he or she creates an account directly through the
|
|
|
API or the web application. There are several types of user accounts
|
|
|
depending on the set of permissions – a so-called “role” – they have
|
|
|
been granted. Each user receives only the most basic set of permissions
|
|
|
after he or she creates an account and this role can be changed only by
|
|
|
the administrators of the service:
|
|
|
|
|
|
- *Student* is the most basic role. A student can become a member of a
|
|
|
group and submit his solutions to his assignments.
|
|
|
|
|
|
- *Supervisor* can be entitled to manage a group of students.
|
|
|
Supervisor can assign exercises to the students who are members of
|
|
|
his groups and review their solutions submitted to
|
|
|
these assignments.
|
|
|
|
|
|
- *Super-admin* is a user with unlimited rights. This user can perform
|
|
|
any action in the system.
|
|
|
|
|
|
There are two implicit changes of roles:
|
|
|
|
|
|
- Once a *student* is added to a group as its supervisor, his role is
|
|
|
upgraded to a *supervisor* role.
|
|
|
|
|
|
- Once a *supervisor* is removed from the last group where he is a
|
|
|
supervisor then his role is downgraded to a *student* role.
|
|
|
|
|
|
These mechanisms do not prevent a single user being a supervisor of one
|
|
|
group and a student of a different group, as supervisors’ permissions are a
|
|
|
superset of students’ permissions.
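
The two implicit role changes can be sketched as follows. This is a minimal illustration rather than the actual implementation: the `User` class, its field names, and the role strings are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    # Hypothetical model: role names mirror the terminology above.
    name: str
    role: str = "student"
    supervised_groups: set = field(default_factory=set)


def add_supervisor(user, group_id):
    """Make the user a supervisor of a group; students are upgraded."""
    user.supervised_groups.add(group_id)
    if user.role == "student":
        user.role = "supervisor"


def remove_supervisor(user, group_id):
    """Remove the user as a supervisor; downgrade after the last group."""
    user.supervised_groups.discard(group_id)
    if user.role == "supervisor" and not user.supervised_groups:
        user.role = "student"
```

A super-admin is left untouched by both functions in this sketch, since that role can only be changed explicitly by an administrator.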
|
|
|
|
|
|
### Login
|
|
|
|
|
|
*Login* is a set of user’s credentials he must submit to verify he can
|
|
|
be allowed to access the system as a specific user. We distinguish two
|
|
|
types of logins: local and external.
|
|
|
|
|
|
- *Local login* is user’s email address and a password he chooses
|
|
|
during registration.
|
|
|
|
|
|
- *External login* is a mapping of a user profile to an account of
|
|
|
some authentication service (e.g., [CAS](https://ldap1.cuni.cz/)).
|
|
|
|
|
|
### Instance
|
|
|
|
|
|
*An instance* of ReCodEx is in fact just a set of groups and user
|
|
|
accounts. An instance should correspond to a real entity such as a
|
|
|
university, a high-school, an IT company or an HR agency. This approach
|
|
|
enables the system to be shared by multiple independent organizations
|
|
|
without interfering with each other.
|
|
|
|
|
|
Usage of the system by the users of an instance can be made conditional on
|
|
|
possessing a valid license. It is up to the administrators of the system
|
|
|
to determine the conditions under which they will assign licenses to the
|
|
|
instances.
|
|
|
|
|
|
### Group
|
|
|
|
|
|
*Group* corresponds to a school class or some other unit which gathers
|
|
|
users who will be assigned the same set of exercises. Each group can have
|
|
|
multiple supervisors who can manage the students and the list of
|
|
|
assignments.
|
|
|
|
|
|
Groups can form a tree hierarchy of arbitrary depth. This is inspired by the
|
|
|
hierarchy of school classes belonging to the same subject over several school
|
|
|
years. For example, there can be a top level group for a programming class that
|
|
|
contains subgroups for every school year. These groups can then be divided into
|
|
|
actual student groups with respect to lab attendance. Supervisors can create
|
|
|
subgroups of their groups and further manage these subgroups.
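
A tree of groups with arbitrary depth can be modelled with a parent reference and a list of children. The sketch below is an illustration only; the `Group` class and the group names are hypothetical and do not describe how ReCodEx stores groups.

```python
class Group:
    """Hypothetical group node; each group knows its parent and subgroups."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_subgroup(self, name):
        """Create a subgroup of this group and return it."""
        child = Group(name, parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Return the names from the top-level group down to this one."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " / ".join(reversed(names))


course = Group("Programming I")
year = course.add_subgroup("2016-2017")
lab = year.add_subgroup("Monday lab")
print(lab.path())
```

The example mirrors the school scenario above: a top-level group for the class, a subgroup per school year, and lab groups below that.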
|
|
|
|
|
|
### Exercise
|
|
|
|
|
|
*An exercise* consists of a textual assignment of a task and a definition
|
|
|
of how a solution to this exercise should be processed and evaluated in
|
|
|
a specific runtime environment (i.e., how to compile a submitted source
|
|
|
code and how to test the correctness of the program). It is a template
|
|
|
which can be instantiated as an *assignment* by a supervisor of a group.
|
|
|
|
|
|
### Assignment
|
|
|
|
|
|
An assignment is an instance of an *exercise* assigned to a specific
|
|
|
*group*. An assignment can modify the text of the task assignment and it
|
|
|
has some additional information which is specific to the group (e.g., a
|
|
|
deadline, the number of points gained for a correct solution, additional
|
|
|
hints for the students in the assignment). The text of the assignment
|
|
|
can be edited and supervisors can translate the assignment into another
|
|
|
language.
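
The relation between an exercise (a template) and an assignment (its instance for one group) can be sketched as follows. All class and field names here are assumptions chosen for the illustration, not the names used in the ReCodEx data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Exercise:
    # Template shared by all groups (hypothetical fields).
    name: str
    text: str


@dataclass
class Assignment:
    # Instance of an exercise with group-specific settings.
    exercise: Exercise
    group: str
    text: str
    deadline: datetime
    max_points: int


def assign(exercise: Exercise, group: str, deadline: datetime,
           max_points: int, text: Optional[str] = None) -> Assignment:
    """Instantiate an exercise for a group, optionally overriding its text."""
    final_text = exercise.text if text is None else text
    return Assignment(exercise, group, final_text, deadline, max_points)
```

Editing the text of one assignment leaves the exercise, and any other assignment created from it, unchanged.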
|
|
|
|
|
|
### Solution
|
|
|
|
|
|
*A solution* is a set of files which a user submits to a given
|
|
|
*assignment*.
|
|
|
|
|
|
### Submission
|
|
|
|
|
|
*A submission* corresponds to a *solution* being evaluated by the
|
|
|
Backend. A single *solution* can be submitted repeatedly (e.g., when the
|
|
|
Backend encounters an error or when the supervisor changes the assignment).
|
|
|
|
|
|
### Evaluation
|
|
|
|
|
|
*An evaluation* is the processed report received from the Backend after
|
|
|
a *submission* is processed. Evaluation contains points given to the
|
|
|
user based on the quality of his solution measured by the Backend and
|
|
|
the settings of the assignment. Supervisors can review the evaluation
|
|
|
and add bonus points (both positive and negative) if the student
|
|
|
deserves some.
|
|
|
|
|
|
### Runtime environment
|
|
|
|
|
|
*A runtime environment* defines the used programming language or tools
|
|
|
which are needed to process and evaluate a solution. Examples of a
|
|
|
runtime environment can be:
|
|
|
|
|
|
- *Linux + GCC*
|
|
|
- *Linux + Mono*
|
|
|
- *Windows + .NET 4*
|
|
|
- *Bison + Yacc*
|
|
|
|
|
|
### Limits
|
|
|
|
|
|
A correct *solution* of an *assignment* has to pass all specified tests (mostly
|
|
|
checks that it yields the correct output for various inputs) and typically must
|
|
|
also be efficient in some sense. The Backend measures the time and memory
|
|
|
consumption of the solution while running. This consumption of resources can be
|
|
|
*limited* and the solution will receive fewer points if it exceeds the given
|
|
|
limits in some test cases defined by the *exercise*.
|
|
|
|
|
|
User management
|
|
|
---------------
|
|
|
|
|
|
@todo: roles and their rights, adding/removing different users, how the
|
|
|
role of a specific user changes
|
|
|
|
|
|
Instances and hierarchy of groups
|
|
|
---------------------------------
|
|
|
|
|
|
@todo: What is an instance, how to create one, what are the licenses and
|
|
|
how do they work. Why can the groups form hierarchies and what are the
|
|
|
benefits – what it means to be an admin of a group, hierarchy of roles
|
|
|
in the group hierarchy.
|
|
|
|
|
|
Exercises database
|
|
|
------------------
|
|
|
|
|
|
@todo: How the exercises are stored, accessed, who can edit what
|
|
|
|
|
|
### Creating a new exercise
|
|
|
|
|
|
@todo Localized assignments, default settings
|
|
|
|
|
|
### Runtime environments and hardware groups
|
|
|
|
|
|
@todo read this later and see if it still makes sense
|
|
|
|
|
|
ReCodEx is designed to utilize a rather diverse set of workers -- there can be
|
|
|
differences in many aspects, such as the actual hardware running the worker
|
|
|
(which impacts the results of measuring) or installed compilers, interpreters
|
|
|
and other tools needed for evaluation. To address these two examples in
|
|
|
particular, we assign runtime environments and hardware groups to exercises.
|
|
|
|
|
|
The purpose of runtime environments is to specify which tools (and often also
|
|
|
operating system) are required to evaluate a solution of the exercise -- for
|
|
|
example, a C# programming exercise can be evaluated on a Linux worker running
|
|
|
Mono or a Windows worker with the .NET runtime. Such exercise would be assigned
|
|
|
two runtime environments, `Linux+Mono` and `Windows+.NET` (the environment names
|
|
|
are arbitrary strings configured by the administrator).
|
|
|
|
|
|
A hardware group is a set of workers that run on similar hardware (e.g. a
|
|
|
particular quad-core processor model and an SSD drive). Workers are assigned
|
|
|
to these groups by the administrator. If this is done correctly, performance
|
|
|
measurements of a submission should yield comparable results across the group. Thanks to this fact,
|
|
|
we can use the same resource limits on every worker in a hardware group.
|
|
|
However, limits can differ between runtime environments -- formally speaking,
|
|
|
limits are a function of three arguments: an assignment, a hardware group and a
|
|
|
runtime environment.
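
Treating limits as a function of three arguments amounts to a lookup keyed by the triple. The sketch below is illustrative only: the assignment name, the limit values, and the units (seconds and kilobytes) are assumptions, not values from a real configuration.

```python
# Hypothetical limits table keyed by
# (assignment, hardware group, runtime environment).
LIMITS = {
    ("hello-world", "group1", "Linux+Mono"): {"time": 2.0, "memory": 65536},
    ("hello-world", "group1", "Windows+.NET"): {"time": 1.5, "memory": 65536},
    ("hello-world", "group2", "Linux+Mono"): {"time": 4.0, "memory": 65536},
}


def get_limits(assignment, hw_group, runtime_env):
    """Return the limits for one combination; raise KeyError if undefined."""
    return LIMITS[(assignment, hw_group, runtime_env)]


print(get_limits("hello-world", "group1", "Linux+Mono"))
```

Every worker in `group1` would use the same entry, while a different runtime environment or hardware group selects a different one.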
|
|
|
|
|
|
### Reference solutions
|
|
|
|
|
|
@todo: how to add one, how to evaluate it
|
|
|
|
|
|
The task of determining appropriate resource limits for exercises is difficult
|
|
|
to do correctly. To aid exercise authors and group supervisors, ReCodEx supports
|
|
|
assigning reference solutions to exercises. Those are example programs that
|
|
|
should cover the main approaches to the implementation. For example, searching
|
|
|
for an integer in an ordered array can be done with a linear search, or better,
|
|
|
using a binary search.
|
|
|
|
|
|
Reference solutions can be evaluated on demand, using a selected hardware group.
|
|
|
The evaluation results are stored and can be used later to determine limits. In
|
|
|
our example problem, we could configure the limits so that the linear
|
|
|
search-based program doesn't finish in time on larger inputs, but a binary
|
|
|
search does.
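
The idea of placing a limit between two reference solutions can be sketched as follows. To keep the example deterministic, the number of loop iterations stands in for measured time, and the geometric mean is just one possible way to pick a value between the two measurements; neither is how ReCodEx itself measures or sets limits.

```python
def linear_search_steps(data, target):
    """Count iterations of a linear search for target in data."""
    steps = 0
    for value in data:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(data, target):
    """Count iterations of a binary search for target in sorted data."""
    lo, hi, steps = 0, len(data) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if data[mid] == target:
            break
        if data[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


def pick_limit(fast_cost, slow_cost):
    """Pick a limit between two measurements (geometric mean)."""
    return (fast_cost * slow_cost) ** 0.5


data = list(range(100_000))
slow = linear_search_steps(data, data[-1])
fast = binary_search_steps(data, data[-1])
limit = pick_limit(fast, slow)
print(fast < limit < slow)
```

With a limit chosen this way, the binary search passes on the large input while the linear search exceeds the limit.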
|
|
|
|
|
|
Note that separate reference solutions should be supplied for all supported
|
|
|
runtime environments.
|
|
|
|
|
|
### Exercise assignments
|
|
|
|
|
|
@todo: Creating instances of an exercise for a specific group of users,
|
|
|
capabilities of settings. Editing limits according to the reference
|
|
|
solution.
|
|
|
|
|
|
Evaluation process
|
|
|
------------------
|
|
|
|
|
|
@todo: How the evaluation process works on the Frontend side.
|
|
|
|
|
|
### Uploading files and file storage
|
|
|
|
|
|
@todo: One by one upload endpoint. Explain different types of the
|
|
|
Uploaded files.
|
|
|
|
|
|
### Automatic detection of the runtime environment
|
|
|
|
|
|
@todo: Users must submit correctly named files – assuming the RTE from
|
|
|
the extensions.
|
|
|
|
|
|
REST API implementation
|
|
|
-----------------------
|
|
|
|
|
|
@todo: What is the REST API, what are the basic principles – GET, POST,
|
|
|
Headers, JSON.
|
|
|
|
|
|
### Authentication and authorization scopes
|
|
|
|
|
|
@todo: How authentication works – signed JWT, headers, expiration,
|
|
|
refreshing. Token scopes usage.
|
|
|
|
|
|
### HTTP requests handling
|
|
|
|
|
|
@todo: Router and routes with specific HTTP methods, preflight, required
|
|
|
headers
|
|
|
|
|
|
### HTTP responses format
|
|
|
|
|
|
@todo: Describe the JSON structure convention of success and error
|
|
|
responses
|
|
|
|
|
|
### Used technologies
|
|
|
|
|
|
@todo: PHP7 – how it is used for typehints, Nette framework – how it is
|
|
|
used for routing, Presenters actions endpoints, exceptions and
|
|
|
ErrorPresenter, Doctrine 2 – database abstraction, entities and
|
|
|
repositories + conventions, Communication over ZMQ – describe the
|
|
|
problem with the extension and how we reported it and how to treat it in
|
|
|
the future when the bug is solved. Relational database – we use MariaDB,
|
|
|
Doctrine enables us to switch the engine to a different engine if needed
|
|
|
|
|
|
### Data model
|
|
|
|
|
|
@todo: Describe the code-first approach using the Doctrine entities, how
|
|
|
the entities map onto the database schema (refer to the attached schemas
|
|
|
of entities and relational database models), describe the logical
|
|
|
grouping of entities and how they are related:
|
|
|
|
|
|
- user + settings + logins + ACL
|
|
|
- instance + licenses + groups + group membership
|
|
|
- exercise + assignments + localized assignments + runtime
|
|
|
environments + hardware groups
|
|
|
- submission + solution + reference solution + solution evaluation
|
|
|
- comment threads + comments
|
|
|
|
|
|
### API endpoints
|
|
|
|
|
|
@todo: Tell the user about the generated API reference and how the
|
|
|
Swagger UI can be used to access the API directly.
|
|
|
|
|
|
Web Application
|
|
|
---------------
|
|
|
|
|
|
@todo: What is the purpose of the web application and how it interacts
|
|
|
with the REST API.
|
|
|
|
|
|
### Used technologies
|
|
|
|
|
|
@todo: Briefly introduce the used technologies like React, Redux and the
|
|
|
build process. For further details refer to the GitHub wiki
|
|
|
|
|
|
### How to use the application
|
|
|
|
|
|
@todo: Describe the user documentation and the FAQ page.
|
|
|
|
|
|
Backend-Frontend communication protocol
|
|
|
=======================================
|
|
|
|
|
|
@todo: describe the exact methods and respective commands for the
|
|
|
communication
|
|
|
|
|
|
Initiation of a job evaluation
|
|
|
------------------------------
|
|
|
|
|
|
@todo: How does the Frontend initiate the evaluation and how the Backend
|
|
|
can accept it or decline it
|
|
|
|
|
|
Job processing progress monitoring
|
|
|
----------------------------------
|
|
|
|
|
|
When evaluating a job, the worker sends progress messages at predefined points of the
|
|
|
evaluation chain. A message can be sent at the very beginning of the job, when the
|
|
|
submit archive is downloaded or at the end of each simple task with its state
|
|
|
(completed, failed, skipped). These messages are sent to the broker through the existing
|
|
|
ZeroMQ connection. The detailed format of the messages can be found on the [communication
|
|
|
page](https://github.com/ReCodEx/wiki/wiki/Overall-architecture#commands-from-worker-to-broker).
|
|
|
|
|
|
The broker only forwards the received progress messages to the monitor component via a
|
|
|
ZeroMQ socket. The output message format is the same as the input format.
|
|
|
|
|
|
The monitor converts the received messages to JSON, which is easy to work with in
|
|
|
JavaScript inside the web application. All messages are cached (one queue per job)
|
|
|
and can be obtained multiple times through a WebSocket communication channel. The
|
|
|
cache is cleared 5 minutes after receiving the last message.
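
The caching behaviour described above can be sketched as follows. This is a simplified illustration, not the monitor's code: the class, the message fields (`command`, `task_state`), and passing the current time explicitly as `now` are assumptions made so that the example stays self-contained.

```python
import json

CACHE_TTL = 5 * 60  # seconds a job's queue is kept after its last message


class ProgressCache:
    """Keep one queue of JSON-encoded progress messages per job."""

    def __init__(self):
        self.queues = {}     # job id -> list of JSON strings
        self.last_seen = {}  # job id -> time of the last message

    def add(self, job_id, message, now):
        """Store a message (a dict) for a job and record its arrival time."""
        self.queues.setdefault(job_id, []).append(json.dumps(message))
        self.last_seen[job_id] = now

    def get(self, job_id):
        """Return all cached messages; can be called repeatedly."""
        return list(self.queues.get(job_id, []))

    def purge(self, now):
        """Drop queues whose last message is at least CACHE_TTL old."""
        expired = [job for job, seen in self.last_seen.items()
                   if now - seen >= CACHE_TTL]
        for job in expired:
            del self.queues[job]
            del self.last_seen[job]
```

Because `get` returns a copy of the queue, a client that connects late, or reconnects, still receives every message sent so far for its job.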
|
|
|
|
|
|
Publishing of the results
|
|
|
-------------------------
|
|
|
|
|
|
After a job finishes, the worker packs the results directory into a single archive and
|
|
|
uploads it to the fileserver over HTTP. The target URL is obtained
|
|
|
from the API in headers on job initiation. Then a "job done" notification request is
|
|
|
sent to the API via the broker. Special submissions (reference or asynchronous
|
|
|
submissions) are loaded immediately, other types are loaded on-demand on first
|
|
|
results request.
|
|
|
|
|
|
Loading results means fetching the archive from the fileserver, parsing the main YAML
|
|
|
file generated by the worker and saving the data to the database. Also, points are
|
|
|
assigned by the score calculator.
|
|
|
|
|
|
User documentation
|
|
|
==================
|
|
|
|
|
|
Web Application
|
|
|
---------------
|
|
|
|
|
|
@todo: Describe different scenarios of the usage of the Web App
|
|
|
|
|
|
### Terminology
|
|
|
|
|
|
@todo: Describe the terminology: Instance, User, Group, Student,
|
|
|
Supervisor, Admin
|
|
|
|
|
|
### Web application requirements
|
|
|
|
|
|
@todo: Describe the requirements of running the web application (modern
|
|
|
web browser, enabled CSS, JavaScript, Cookies & Local storage)
|
|
|
|
|
|
|
|
|
### Scenario \#1: Becoming a user of ReCodEx
|
|
|
|
|
|
#### How to create a user account?
|
|
|
|
|
|
You can create an account if you click on the “*Create account*” menu
|
|
|
item in the left sidebar. You can choose between two types of
|
|
|
registration methods – by creating a local account with a specific
|
|
|
password, or pairing your new account with an existing CAS UK account.
|
|
|
|
|
|
If you decide to create a new “*local*” account using the “*Create ReCodEx
|
|
|
account*” form, you will have to provide your details and choose a
|
|
|
password for your account. You will later sign in using your email
|
|
|
address as your username and the password you select.
|
|
|
|
|
|
If you decide to use the CAS UK, then we will verify your credentials
|
|
|
and access your name and email stored in the system and create your
|
|
|
account based on this information. You can change your personal
|
|
|
information or email later on the “*Settings*” page.
|
|
|
|
|
|
When creating your account either way, you must select an instance your
|
|
|
account will belong to by default. The instance you select will be
|
|
|
most likely your university or other organization you are a member of.
|
|
|
|
|
|
#### How to get into ReCodEx?
|
|
|
|
|
|
To log in, go to the homepage of ReCodEx and in the left sidebar choose
|
|
|
the menu item “*Sign in*”. Then you must enter your credentials into one
|
|
|
of the two forms – if you selected a password during registration, then
|
|
|
you should sign in with your email and password in the first form called
|
|
|
“*Sign into ReCodEx*”. If you registered using the Charles University
|
|
|
Authentication Service (CAS), you should put your student’s number and
|
|
|
your CAS password into the second form called “Sign into ReCodEx using
|
|
|
CAS UK”.
|
|
|
|
|
|
#### How do I sign out of ReCodEx?
|
|
|
|
|
|
If you don’t use ReCodEx for a whole day, you will be logged out
|
|
|
automatically. However, we recommend you sign out of the application
|
|
|
after you finish your interaction with it. The logout button is placed
|
|
|
in the top section of the left sidebar right under your name. You will
|
|
|
have to expand the sidebar with a button next to the “*ReCodEx*” title
|
|
|
(shown in the picture below).
|
|
|
|
|
|
@todo: Simon's image
|
|
|
|
|
|
#### What to do when you cannot remember your password?
|
|
|
|
|
|
If you can’t remember your password and you don’t use CAS UK
|
|
|
authentication, then you can reset your password. You will find a link
|
|
|
saying “*You cannot remember what your password was? Reset your
|
|
|
password.*” under the sign in form. After you click on this link, you
|
|
|
will be asked to submit your email address. An email with a link
|
|
|
containing a special token will be sent to the address you fill in. We
|
|
|
make sure that the person who requested password resetting is really
|
|
|
you. When you click on the link (or you copy & paste it into your web
|
|
|
browser) you will be able to select a new password for your account. The
|
|
|
token is valid only for a couple of minutes, so do not forget to reset
|
|
|
the password as soon as possible, or you will have to request a new link
|
|
|
with a valid token.
|
|
|
|
|
|
If you sign in through CAS UK, then please follow the instructions
|
|
|
provided by the administrators of the service described on their
|
|
|
website.
|
|
|
|
|
|
#### How to configure your account?
|
|
|
|
|
|
You have several options for editing your user account:
|
|
|
|
|
|
- changing your personal information (i.e., name)
|
|
|
- changing your credentials (email and password)
|
|
|
- updating your preferences (e.g., source code viewer/editor settings,
|
|
|
default language)
|
|
|
|
|
|
You can access the settings page through the “*Settings*” button right
|
|
|
under your name in the left sidebar.
|
|
|
|
|
|
|
|
|
### Scenario \#2: User is a student
|
|
|
|
|
|
@todo: describe what it means to be a “student” and what are the
|
|
|
student’s rights
|
|
|
|
|
|
#### How to join a group for my class?
|
|
|
|
|
|
@todo: How to join a specific group
|
|
|
|
|
|
#### Which assignments do I have to solve?
|
|
|
|
|
|
@todo: Where the student can find the list of the assignment he is
|
|
|
expected to solve, what is the first and second deadline.
|
|
|
|
|
|
#### Where can I see details of my classes’ group?
|
|
|
|
|
|
@todo: Where can the user see groups description and details, what
|
|
|
information is available.
|
|
|
|
|
|
#### How to submit a solution of an assignment?
|
|
|
|
|
|
@todo: How does a student submit his solution through the web app
|
|
|
|
|
|
#### Where are the results of my solutions?
|
|
|
|
|
|
@todo: When the results are ready and what the results mean and what to
|
|
|
do about them, when the user is convinced, that his solution is correct
|
|
|
although the results say different
|
|
|
|
|
|
#### How can I discuss my solution with my teacher/group’s supervisor directly through the web application?
|
|
|
|
|
|
@todo: Describe the comments thread behavior (public/private comments),
|
|
|
who else can see the comments, how notifications work (*not implemented
|
|
|
yet*!).
|
|
|
|
|
|
|
|
|
### Scenario \#3: User is supervisor of a group
|
|
|
|
|
|
@todo: describe what it means to be a “supervisor” of a group and what
|
|
|
are the supervisors rights
|
|
|
|
|
|
#### How do I become a supervisor of a group?
|
|
|
|
|
|
@todo: How does a user become a supervisor of a group?
|
|
|
|
|
|
#### How to add or remove a student to my group?
|
|
|
|
|
|
@todo: How to add a specific student to a given group
|
|
|
|
|
|
#### How do I add another supervisor to my group?
|
|
|
|
|
|
@todo: who can add another supervisor, what would be the rights of the
|
|
|
second supervisor
|
|
|
|
|
|
#### How do I create a subgroup of my group?
|
|
|
|
|
|
@todo: What it means to create a subgroup and how to do it.
|
|
|
|
|
|
#### How do I assign an exercise to my students?
|
|
|
|
|
|
@todo: Describe how to access the database of the exercises and what are
|
|
|
the possibilities of assignment setup – availability, deadlines, points,
|
|
|
score configuration, limits
|
|
|
|
|
|
#### How do I configure the limits of an assignment and how to choose appropriate limits?
|
|
|
|
|
|
@todo: Describe the form and explain the concept of reference solutions.
|
|
|
How to evaluate the reference solutions for the exercise right now (to
|
|
|
get the up-to-date information).
|
|
|
|
|
|
#### How can I assign some exercises only to some students of the group?
|
|
|
|
|
|
@todo: Describe how to achieve this using subgroups
|
|
|
|
|
|
#### How can I see my students’ solutions?
|
|
|
|
|
|
@todo Describe where all the students’ solutions for a given assignment
|
|
|
can be found, where to look for all solutions of a given student, how to
|
|
|
see results of a specific student’s solution’s evaluation result.
|
|
|
|
|
|
#### Can I assign points to my students’ solutions manually instead of depending on automatic scoring?
|
|
|
|
|
|
@todo If and how to change the score of a solution – assignment
|
|
|
settings, setting points, bonus points, accepting a solution (*not
|
|
|
implemented yet!*). Describe how the student and supervisor will still
|
|
|
be able to see the percentage received from the automatic scoring, but
|
|
|
the awarded points will be overridden.
|
|
|
|
|
|
#### How can I discuss student’s solution with him/her directly through the web application?
|
|
|
|
|
|
@todo: Describe the comments thread behavior (public/private comments),
|
|
|
who else can see the comments -- same as from the student perspective
|
|
|
|
|
|
|
|
|
### Writing job configuration
|
|
|
|
|
|
To run and evaluate an exercise, the backend needs to know the steps required to do
|
|
|
that. This is different for each environment (operating system, programming
|
|
|
language, etc.), so each of the environments needs to have separate
|
|
|
configuration.
|
|
|
|
|
|
The backend works with a powerful, but quite low-level description of simple
|
|
|
connected tasks written in YAML syntax. More about the syntax and general task
|
|
|
overview can be found on a [separate
|
|
|
page](https://github.com/ReCodEx/wiki/wiki/Assignments). One of the planned
|
|
|
features was a user-friendly configuration editor, but due to the tight deadline and
|
|
|
team composition it did not make it to the first release. However, writing
|
|
|
configuration in the basic format will always be available and allows users to
|
|
|
use the full expressive power of the system.
|
|
|
|
|
|
This section walks through the creation of a job configuration for a _hello world_
|
|
|
exercise. The goal is to compile file _source.c_ and check if it prints `Hello
|
|
|
World!` to the standard output. This is the only test case, let's call it
|
|
|
**A**.
|
|
|
|
|
|
The problem can be split into several tasks:
|
|
|
|
|
|
- compile _source.c_ into _helloworld_ with `/usr/bin/gcc`
|
|
|
- run _helloworld_ and save standard output into _out.txt_
|
|
|
- fetch the predefined output (suppose it is already uploaded to the fileserver) with
|
|
|
hash `a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b` to _reference.txt_
|
|
|
- compare _out.txt_ and _reference.txt_ by `/usr/bin/diff`
|
|
|
|
|
|
The absolute paths of the tools can be obtained from the system administrator. However,
|
|
|
`/usr/bin/gcc` is the location where the GCC binary is available on almost every system,
|
|
|
so the location of some tools can be (professionally) guessed.
|
|
|
|
|
|
First, write the header of the job to the configuration file.
|
|
|
|
|
|
```{.yml}
submission:
    job-id: hello-world-job
    hw-groups:
        - group1
```
|
|
|
|
|
|
This means that the job _hello-world-job_ needs to be run on workers
|
|
|
that belong to the `group1` hardware group. Reference files are downloaded
|
|
|
from the default location configured in the API (such as
|
|
|
`http://localhost:9999/exercises`) if not stated explicitly otherwise. Job
|
|
|
execution log will not be saved to result archive.
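
If the defaults are not suitable, both settings can presumably be spelled out in
the header. The following sketch assumes that the header accepts a
`file-collector` key for the file server URL and a `log` key for the execution
log. Neither key appears in the header above, so treat both names as assumptions
and check them against the syntax page linked earlier.

```{.yml}
submission:
  job-id: hello-world-job
  # Assumed key: base URL from which fetch tasks download reference files.
  file-collector: http://localhost:9999/exercises
  # Assumed key: true would store the execution log in the result archive.
  log: false
  hw-groups:
    - group1
```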
|
|
|
|
|
|
Next, the tasks have to be defined in the _tasks_ section. In this demo job,
every task depends only on the previous one. The first task has its input file
_source.c_ (if submitted by the user) already available in the working
directory, so it just calls GCC. Compilation runs in the sandbox like any other
external program and should have relaxed time and memory limits. In this
scenario, worker defaults are used. If compilation fails, the whole job is
immediately terminated (because the _fatal-failure_ bit is set). Because the
_bound-directories_ option in the sandbox limits section is mostly shared
between all tasks, it can be set in the worker configuration instead of the job
configuration (assume this for the following tasks). For the configuration of
workers, please contact your administrator.

```{.yml}
- task-id: "compilation"
  type: "initiation"
  fatal-failure: true
  cmd:
    bin: "/usr/bin/gcc"
    args:
      - "source.c"
      - "-o"
      - "helloworld"
  sandbox:
    name: "isolate"
    limits:
      - hw-group-id: group1
        chdir: ${EVAL_DIR}
        bound-directories:
          - src: ${SOURCE_DIR}
            dst: ${EVAL_DIR}
            mode: RW
```
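
Should the worker defaults be unsuitable for compilation, explicit limits can be
added to the task in the same way as for any other sandboxed task. This is a
minimal sketch, assuming the `time` and `memory` keys used by the execution task
in this walkthrough apply to the compilation task as well. The values are
illustrative only, and _bound-directories_ is left out on the assumption that
the worker configuration provides it.

```{.yml}
- task-id: "compilation"
  type: "initiation"
  fatal-failure: true
  cmd:
    bin: "/usr/bin/gcc"
    args:
      - "source.c"
      - "-o"
      - "helloworld"
  sandbox:
    name: "isolate"
    limits:
      - hw-group-id: group1
        chdir: ${EVAL_DIR}
        # Illustrative values in the same units as the execution task limits;
        # they must not exceed the limits set by the worker.
        time: 10
        memory: 262144
```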
|
|
|
|
|
|
The compiled program is executed with time and memory limits set, and its
standard output is redirected to a file. This task depends on the _compilation_
task, because the program cannot be executed without being compiled first. It is
important to mark this task with the _execution_ type, so that exceeded limits
are reported in the frontend.

Time and memory limits set directly for a task have higher priority than worker
defaults. One important constraint is that these limits cannot exceed the limits
set by workers. Worker defaults are present as a safety measure so that a
malformed job configuration cannot block the worker forever. Worker default
limits should be reasonably high, such as a gigabyte of memory and several hours
of execution time. For exact numbers, please contact your administrator.

It is important to know that if the output of a program (both standard and
error) is redirected to a file, the sandbox disk quotas apply to that file as
well as to the files created directly by the program. If the outputs are
ignored, they are redirected to `/dev/null`, which means there is no limit on
the output length (as long as the printing fits within the time limit).
|
|
|
|
|
|
```{.yml}
- task-id: "execution_1"
  test-id: "A"
  type: "execution"
  dependencies:
    - compilation
  cmd:
    bin: "helloworld"
  sandbox:
    name: "isolate"
    stdout: ${EVAL_DIR}/out.txt
    limits:
      - hw-group-id: group1
        chdir: ${EVAL_DIR}
        time: 0.5
        memory: 8192
```
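
The note about disk quotas also matters when the error output is kept for
inspection. The variant below is a sketch that assumes the sandbox section
accepts a `stderr` key analogous to `stdout`. Both the key and the file name
_err.txt_ are assumptions and are not part of the example job.

```{.yml}
- task-id: "execution_1"
  test-id: "A"
  type: "execution"
  dependencies:
    - compilation
  cmd:
    bin: "helloworld"
  sandbox:
    name: "isolate"
    stdout: ${EVAL_DIR}/out.txt
    # Assumed key: keeps the error output in a file, which then counts
    # against the sandbox disk quota in the same way as out.txt.
    stderr: ${EVAL_DIR}/err.txt
    limits:
      - hw-group-id: group1
        chdir: ${EVAL_DIR}
        time: 0.5
        memory: 8192
```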
|
|
|
|
|
|
Next, fetch the predefined reference output from the file server. The base URL
of the file server is taken from the header of the job configuration (or from
the default configured in the API when the header does not set it), so only the
name of the required file (its `sha1sum` in our case) is necessary.

```{.yml}
- task-id: "fetch_solution_1"
  test-id: "A"
  dependencies:
    # Must match the task-id of the execution task.
    - execution_1
  cmd:
    bin: "fetch"
    args:
      - "a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"
      - "${SOURCE_DIR}/reference.txt"
```
|
|
|
|
|
|
Comparing the results is quite straightforward. It is important to set the task
type to _evaluation_, so that the return code is set to 0 if the program is
correct and 1 otherwise. We do not set our own limits, so the default limits are
used.

```{.yml}
- task-id: "judge_1"
  test-id: "A"
  type: "evaluation"
  dependencies:
    - fetch_solution_1
  cmd:
    bin: "/usr/bin/diff"
    args:
      - "out.txt"
      - "reference.txt"
  sandbox:
    name: "isolate"
    limits:
      - hw-group-id: group1
        chdir: ${EVAL_DIR}
```
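
Put together, the header and the four tasks could form a single file like the
following sketch. It assumes that the task list is nested under a top-level
`tasks` key, as the mention of the _tasks_ section suggests. Everything else is
taken from the fragments in this walkthrough, with the fetch task depending on
`execution_1`.

```{.yml}
submission:
  job-id: hello-world-job
  hw-groups:
    - group1
tasks:  # assumed to be a top-level key next to "submission"
  - task-id: "compilation"
    type: "initiation"
    fatal-failure: true
    cmd:
      bin: "/usr/bin/gcc"
      args:
        - "source.c"
        - "-o"
        - "helloworld"
    sandbox:
      name: "isolate"
      limits:
        - hw-group-id: group1
          chdir: ${EVAL_DIR}
          bound-directories:
            - src: ${SOURCE_DIR}
              dst: ${EVAL_DIR}
              mode: RW
  - task-id: "execution_1"
    test-id: "A"
    type: "execution"
    dependencies:
      - compilation
    cmd:
      bin: "helloworld"
    sandbox:
      name: "isolate"
      stdout: ${EVAL_DIR}/out.txt
      limits:
        - hw-group-id: group1
          chdir: ${EVAL_DIR}
          time: 0.5
          memory: 8192
  - task-id: "fetch_solution_1"
    test-id: "A"
    dependencies:
      - execution_1
    cmd:
      bin: "fetch"
      args:
        - "a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"
        - "${SOURCE_DIR}/reference.txt"
  - task-id: "judge_1"
    test-id: "A"
    type: "evaluation"
    dependencies:
      - fetch_solution_1
    cmd:
      bin: "/usr/bin/diff"
      args:
        - "out.txt"
        - "reference.txt"
    sandbox:
      name: "isolate"
      limits:
        - hw-group-id: group1
          chdir: ${EVAL_DIR}
```

The dependency chain runs compilation, execution, fetch, and judge in that
order, which mirrors the task list at the start of this section.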
|
|
|
<!---
// vim: set formatoptions=tqn flp+=\\\|^\\*\\s* textwidth=80 colorcolumn=+1:
-->
|
|
|
|