|
|
<!---

Notes:

* Two-page introduction - what the system should be able to do

* Analysis - what we decide to do, how it could be done, assign importance to
  each item

  - this can later be referenced to explain why something was not finished;
    include advanced features as well

  - for individual features, note that they are planned for future versions -
    what is important and what is not!! Justify the chosen subset of features;
    the architecture will then be easier to describe

* Explain the architecture in the analysis

* Keep related works as a separate chapter

* Order: requirements -> related works -> analysis

* The interconnection of components must be understood by both the
  administrator and the exercise author - a general chapter in the analysis -
  the original analysis chapter was good, it just mixes in a list of messages
  or something like that - not everyone cares about that

* After the general introduction, split the text by potential reader: teacher
  user first, then admin user

* Installation documentation aside, as the last part

* User documentation - admin: description of privileges; exercise author: the
  most extensive part, the script format - but phrase it as a description of
  what to click where, and describe the language separately - it will become
  irrelevant in the future; it needs to go much deeper - describe in detail
  what the scripts do, including relative/absolute paths, macros, where the
  compiler finds libraries and headers... - a chapter at the end

* User documentation for students: explanation

* How an exercise is graded - hard to say where this belongs - somewhere at
  the beginning? It concerns all roles, and teachers must know how to
  configure it - also mention grading by time and memory (in the analysis or
  the introduction) - multiple outputs from the judge, interpolation of points
  by memory usage... it is rather outside the user documentation

* Do not describe where to click which button

* Tutorials - scenarios, what to do when I want to achieve something, sample
  walkthroughs

* For forms, the best documentation is no documentation at all; add labels to
  the form fields instead

* Describe the configs separately in the documentation - score, YAML -
  reference documentation

* Definitely no FAQ; be more structured

* Installation instructions together at the end

* Programmer documentation - "the fewest readers" - we already have some, it
  does not need to go into the printed documentation - put a link to the wiki
  into the printed documentation, but something must stay in print - the
  choice of languages, design decisions - do not put the justification into
  the introductory analysis - write an introduction for each reference
  documentation - "we approached the REST API this way, it is divided into
  these groups, ..."

* What the chosen architecture means; it should also tell something to a user
  who does not know the architecture, e.g. where the state is kept

* It must be clear from the documentation what the library does and what has
  to be done manually - how much work it is - write it more for users who
  know the technologies but do not know the libraries

* Have compassion for those who do not know that much - the technologies, the
  architecture, and the CodEx system itself

* Page numbers do not match

* Downloading a ZIP with the Backend outputs - sort the files into public and
  private; public ones are available to students too

-->
|
|
|
|
|
|
# Introduction
|
|
|
|
|
|
Generally, there are many different ways and opinions on how to teach people
|
|
|
something new. However, most people agree that a hands-on experience is one of
|
|
|
the best ways to make the human brain remember a new skill. Learning must be
|
|
|
entertaining and interactive, with fast and frequent feedback. Some kinds of
|
|
|
knowledge are more suitable for this practical type of learning than others, and
|
|
|
fortunately, programming is one of them.
|
|
|
|
|
|
The university education system is one of the areas where this knowledge can be
applied. In computer programming, a program should satisfy several
requirements: the code should be syntactically correct, efficient, and easy to
read, maintain and extend.
|
|
|
|
|
|
Checking programs written by students takes time and requires a lot of
mechanical, repetitive work -- reviewing source code, compiling it and
running it through testing scenarios. It is therefore desirable to automate as
much of this work as possible. The first automatic evaluation system was
created by Stanford University professors in 1965; it evaluated Algol code
submitted on punch cards. In the following years, many similar products were
written.
|
|
|
|
|
|
In today's world, properties like correctness and efficiency can be tested
automatically to a large extent. This fact should be exploited to save
teachers' time for the tasks that are difficult to perform automatically, such
as spotting bad design, bad coding habits and logical mistakes.
|
|
|
|
|
|
There are two basic ways of automatically evaluating code -- statically
(checking the source code without running it; safe, but not very precise) and
dynamically (running the code on test inputs and checking the correctness of
the outputs; provides good real-world experience, but requires extensive
security measures).
|
|
|
|
|
|
This project focuses on the machine-controlled part of source code evaluation.
First, general concepts of grading systems are observed, new requirements are
specified and projects with similar functionality are examined. Problems of
the software previously used at Charles University in Prague are also briefly
discussed. With the knowledge acquired from such projects in production, we set
up goals for the new evaluation system, designed the architecture and
implemented a fully operational solution based on dynamic evaluation. The
system is now ready for production testing at the university.
|
|
|
|
|
|
## Assignment
|
|
|
|
|
|
The major goal of this project is to create a grading application that will be
used for programming classes at the Faculty of Mathematics and Physics of
Charles University in Prague. However, the application should be designed in a
modular fashion so that it can be easily extended, or even modified to support
other usage scenarios.
|
|
|
|
|
|
The system should be capable of dynamic analysis of submitted source code. This
consists of the following basic steps:

1. compile the code and check for compilation errors
2. run the compiled binary in a sandbox with predefined inputs
3. check constraints on the amount of memory and time used
4. compare the program outputs with predefined values
5. award the code a numeric score
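The five steps above can be sketched as follows. This is only an illustrative
minimal model, not code from any real grading system; to keep the example
self-contained it evaluates a tiny interpreted Python "submission" instead of
a compiled C program, and it omits the memory limit and sandboxing that a real
evaluator must enforce.

```python
import subprocess
import sys

# A hypothetical student submission: sum the numbers on one input line.
SUBMISSION = "print(sum(int(x) for x in input().split()))"

def evaluate(source, test_input, expected_output, time_limit=5.0):
    # 1. "compile": for interpreted Python, check the source for syntax errors
    try:
        compile(source, "<submission>", "exec")
    except SyntaxError:
        return 0.0
    # 2. + 3. run the program with a time limit (a real system would run it
    # inside a sandbox and also enforce a memory limit)
    try:
        proc = subprocess.run([sys.executable, "-c", source],
                              input=test_input, capture_output=True,
                              text=True, timeout=time_limit)
    except subprocess.TimeoutExpired:
        return 0.0
    # 4. compare the program output with the reference output
    if proc.returncode != 0 or proc.stdout.strip() != expected_output.strip():
        return 0.0
    # 5. award a numeric score (binary here; real judges may award partial points)
    return 1.0

print(evaluate(SUBMISSION, "1 2 3\n", "6"))  # prints 1.0
```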
|
|
|
|
|
|
The project has a great starting point -- an old grading system (CodEx) is
currently used at the university, so its flaws and weaknesses can be addressed.
Furthermore, many teachers are eager to use and test the new system, and they
are willing to consult ideas or problems with us during development.
|
|
|
|
|
|
## Current system
|
|
|
|
|
|
The grading solution currently used at the Faculty of Mathematics and Physics
of Charles University in Prague was implemented in 2006 by a group of students.
It is called [CodEx -- The Code Examiner](http://codex.ms.mff.cuni.cz/project/)
and it has been used, with some improvements, ever since. The original plan was
to use the system only for basic programming courses, but there was a demand
for adapting it to many different subjects.
|
|
|
|
|
|
CodEx is based on dynamic analysis. It features a web-based interface where
supervisors can assign exercises to their students, and the students have a
time window to submit their solutions. Each solution is compiled and run in a
sandbox (MO-Eval). The metrics checked are correctness of the output and
compliance with the time and memory limits. The system supports programs
written in C, C++, C#, Java, Pascal, Python and Haskell.
|
|
|
|
|
|
The whole system is intended to help both teachers (supervisors) and students.
|
|
|
To achieve this, it is crucial to keep in mind the typical usage scenarios of
|
|
|
the system and to try to make these tasks as simple as possible.
|
|
|
|
|
|
The system has a database of users. Each user is assigned a role, which
|
|
|
corresponds to his/her privileges. There are user groups reflecting the
|
|
|
structure of lectured courses.
|
|
|
|
|
|
A database of exercises (algorithmic problems) is another part of the project.
Each exercise consists of a text describing the problem in multiple language
variants, an evaluation configuration (machine-readable instructions on how to
evaluate solutions to the exercise) and a set of inputs and reference outputs.
Exercises are created by privileged users who have been instructed how to do
so. Assigning an exercise to a group means choosing one of the available
exercises and specifying additional properties: a deadline (optionally a second
deadline), a maximum number of points, a configuration for calculating the
score, a maximum number of submissions, and a list of supported runtime
environments (e.g. programming languages), including specific time and memory
limits for each one.
|
|
|
|
|
|
Typical use cases for the supported user roles are the following:
|
|
|
|
|
|
- **student**
|
|
|
- join a group
|
|
|
- get assignments in group
|
|
|
- submit solution to assignment
|
|
|
- view solution results
|
|
|
- **supervisor**
|
|
|
- create exercise
|
|
|
- assign exercise to group, modify assignment
|
|
|
- view all results in group
|
|
|
- alter automatic solution grading
|
|
|
- **administrator**
|
|
|
- create groups
|
|
|
- alter user privileges
|
|
|
- check system logs, upgrades and other management
|
|
|
|
|
|
The current system is old but robust. There were no major security incidents
during its production use. However, from today's perspective it has several
drawbacks. The main ones are:
|
|
|
|
|
|
- **web interface** -- The web interface is simple and fully functional, but
  the rapid development of web technologies opens new horizons for how a web
  interface can be built.
- **web API** -- CodEx offers a very limited XML API based on outdated
  technologies that is not sufficient for users who would like to create custom
  interfaces such as a command-line tool or a mobile application.
- **sandboxing** -- The MO-Eval sandbox is based on the principle of monitoring
  system calls and blocking the dangerous ones. This can easily be done for
  single-threaded applications, but proves difficult for multi-threaded ones.
  Since parallelism is a very important area of computing today, there is a
  requirement to test multi-threaded applications as well.
- **instances** -- Different usage scenarios of CodEx require separate
  instances (Programming I and II, Java, C#, etc.). This configuration is not
  user friendly (students have to register in each instance separately) and
  burdens administrators with unnecessary work. The CodEx architecture does not
  allow sharing hardware between instances, which results in an inefficient use
  of the evaluation hardware.
- **task extensibility** -- There is a need to test and evaluate complicated
  programs for classes such as Parallel programming or Compiler principles,
  which have a more complex evaluation chain than the simple
  compilation/execution/evaluation provided by CodEx.
|
|
|
|
|
|
### Exercise evaluation chain
|
|
|
|
|
|
The most important part of the system is the evaluation of solutions submitted
by students. The consecutive steps from source code to final results are
described in more detail below to give readers a solid overview of what has to
happen during the evaluation process.
|
|
|
|
|
The first thing users have to do is submit their solutions through some user
interface. The system then checks the assignment invariants (deadlines,
submission counts, ...) and stores the submitted files. The runtime environment
is automatically detected based on the input files, and a suitable evaluation
configuration variant is chosen (one exercise can have multiple variants, for
example for the C and Java languages). This exercise configuration is then used
to control the evaluation process.
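As an illustration, the environment detection could be as simple as mapping
file extensions of the submitted files to environments. The sketch below is
hypothetical; the mapping and the function name are not taken from CodEx.

```python
import os

# Purely illustrative mapping of file extensions to runtime environments.
EXTENSION_TO_ENV = {
    ".c": "c-gcc",
    ".cpp": "cxx-gcc",
    ".java": "java",
    ".py": "python3",
}

def detect_environment(filenames):
    """Return the single runtime environment matching all submitted files."""
    exts = {os.path.splitext(name)[1] for name in filenames}
    envs = {EXTENSION_TO_ENV[e] for e in exts if e in EXTENSION_TO_ENV}
    if len(envs) != 1:
        raise ValueError("cannot choose a unique runtime environment")
    return envs.pop()

print(detect_environment(["main.c", "utils.c"]))  # prints c-gcc
```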
|
|
|
|
|
|
There is a pool of worker computers dedicated to evaluation jobs. Each of them
can support different environments and programming languages, allowing programs
to be tested on as many platforms as possible. An incoming job is scheduled to
a worker that is capable of running it.
|
|
|
|
|
|
The worker obtains the solution and its evaluation configuration, parses it and
|
|
|
starts executing the contained instructions. It is crucial to keep the worker
|
|
|
computer secure and stable, so a sandboxed environment is used for dealing with
|
|
|
unknown source code. When the execution is finished, results are saved and the
|
|
|
submitter is notified.
|
|
|
|
|
|
The output of the worker contains data about the evaluation, such as the time
and memory spent running the program for each test input and whether its output
was correct. The system then calculates a numeric score from this data, which
is presented to the student. If the solution is wrong (incorrect output,
exceeded memory limit, ...), error messages are also displayed to the
submitter.
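One possible way such a score could be computed is a weighted fraction of
passed tests, sketched below. The function and the exact formula are
illustrative only, not the actual CodEx scoring algorithm.

```python
# Illustrative scoring sketch: each test input has a weight, and the score
# is the weighted fraction of passed tests scaled to the maximum points.
def compute_score(results, max_points):
    """results: list of (weight, passed) pairs, one per test input."""
    total = sum(weight for weight, _ in results)
    earned = sum(weight for weight, passed in results if passed)
    return max_points * earned / total if total else 0.0

# Two of three tests passed, with the second test weighted double.
print(compute_score([(1, True), (2, True), (1, False)], 10))  # prints 7.5
```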
|
|
|
|
|
|
## Requirements
|
|
|
|
|
|
There are many different formal requirements for the system. Some of them are
necessary for any source code evaluation system, some are specific to
university deployment, and some arose during the ten-year lifetime of the old
system. There are not many ways to improve the CodEx experience from the
perspective of a student, but a lot of feature requests come from
administrators and supervisors. The ideas were gathered mostly from our
personal experience with the system and from meetings with faculty staff
involved with the current system.
|
|
|
|
|
|
In general, CodEx features should be preserved, so only the differences are
presented here. For clarity, all the requirements and wishes are grouped by
category.
|
|
|
|
|
|
### System features
|
|
|
|
|
|
System features represent functionality directly accessible to the users of
the system. They describe the evaluation system in general as well as
university-specific additions (mostly administrative features).
|
|
|
|
|
|
#### End user requirements
|
|
|
|
|
|
- group hierarchy -- @todo: copy text from
|
|
|
[here](https://github.com/ReCodEx/wiki/wiki/Rewritten-docs/87e4bcd39a4fca3eadbb4748e9a3b6ced2bd7150#intended-usage)
|
|
|
- there is a database of exercises; teachers can create exercises including
|
|
|
textual description, sample inputs and correct reference outputs (for example
|
|
|
"sum all numbers from given file and write the result to the standard output")
|
|
|
- teachers can specify, separately for each assignment, the way the grading
  points awarded to students are computed from the quality of their solutions
|
|
|
- teachers can view detailed data about their students (the users of their
  groups), including all submitted solutions; each solution can also be
  manually reviewed, commented on and awarded additional points (positive or
  negative)
|
|
|
- one particular solution can be marked as accepted (and used for grading the
  assignment); otherwise, the solution with the most points is used
|
|
|
- teachers can edit a student's solution and privately resubmit it, optionally
  saving all results (including temporary ones)
|
|
|
- localization of all texts (UI and exercises)
|
|
|
- Markdown support for creating exercise texts
|
|
|
- tagging exercises in the database and searching by these tags
- comments on exercises, tests and solutions
|
|
|
- plagiarism detection
|
|
|
|
|
|
#### Administrative requirements
|
|
|
|
|
|
- users can use an intuitive user interface to interact with the system, mainly
  for viewing assigned exercises, uploading their own solutions to the
  assignments, and viewing the results of the solutions after the automatic
  evaluation is finished; the two desired interfaces are web and command-line
  based
|
|
|
- user privilege separation (at least two roles -- _student_ and _supervisor_)
|
|
|
- logging in through a university authentication system (e.g. LDAP)
|
|
|
- SIS (university information system) integration for fetching personal user
|
|
|
data
|
|
|
- safe environment in which the students' solutions are executed
|
|
|
- support for multiple programming environments at once, to avoid the
  unacceptable administrative workload of maintaining a separate installation
  for every course and the resulting inefficient hardware usage
|
|
|
- advanced low-level configuration of the evaluation flow with a high-level
  abstraction layer for the ordinary configuration cases; the configuration
  should be able to express more complicated flows than just compiling a source
  code and running the program against test inputs -- for example, some
  exercises need to build the source code with a tool, run some tests, then run
  the program through another tool and perform additional tests
|
|
|
- use of modern technologies with state-of-the-art compilers
|
|
|
|
|
|
### Nonfunctional requirements
|
|
|
|
|
|
Nonfunctional requirements are requirements of a technical character with no
direct mapping to visible parts of the system. In an ideal world, users should
not notice these at all if they work properly, but they would be at least
annoyed if these requirements were not met. The most notable ones are:
|
|
|
|
|
|
- user interface of the system accessible on users' computers without
  installation of any additional software
- easy implementation of different user interfaces
- readiness for a workload of hundreds of students and tens of supervisors
|
|
|
- automated installation of all components
|
|
|
- source code with a permissive license allowing further development; this also
  applies to the libraries and frameworks used
|
|
|
- multi-platform worker supporting at least two major operating systems
|
|
|
|
|
|
### Conclusion
|
|
|
|
|
|
The survey shows that there are a lot of different requirements and wishes for
the new system. When the system is ready, it is likely that new ideas of how to
use it will appear, so the system must be designed to be easily extensible, so
that everyone can develop their own features. This also means that widely used
programming languages and techniques should be used, so that newcomers can
quickly understand the code and make changes.
|
|
|
|
|
|
To find out the current state of the art, we did a short survey of automatic
grading systems used at universities, in programming contests, and possibly
other places where similar tools are available.
|
|
|
|
|
|
|
|
|
## Related work
|
|
|
|
|
|
This is not a complete list of available evaluators, but only a few projects
which are used these days and can serve as an inspiration for our project. Each
project in the list has a brief description and some key features mentioned.
|
|
|
|
|
|
### Progtest
|
|
|
|
|
|
[Progtest](https://progtest.fit.cvut.cz/) is a private project of [FIT
ČVUT](https://fit.cvut.cz) in Prague. As far as we know, it is used for C/C++
and Bash programming and for knowledge-based quizzes. There are several kinds
of bonus points and penalties, and a few hints about what is failing in the
submitted solution. It is very strict about source code quality: it uses, for
example, the `-pedantic` option of GCC, Valgrind for memory leaks, and array
bounds checks via the `mudflap` library.
|
|
|
|
|
|
### Codility
|
|
|
|
|
|
[Codility](https://codility.com/) is a web-based solution primarily targeted at
company recruiters. It is a commercial product available as SaaS and it
supports 16 programming languages. The
[UI](http://1.bp.blogspot.com/-_isqWtuEvvY/U8_SbkUMP-I/AAAAAAAAAL0/Hup_amNYU2s/s1600/cui.png)
of Codility is [open source](https://github.com/Codility/cui); the rest of the
source code is not available. One interesting feature is the 'task timeline' --
a captured progress of writing the code for each user.
|
|
|
|
|
|
### CMS
|
|
|
|
|
|
[CMS](http://cms-dev.github.io/index.html) is an open-source distributed system
for running and organizing programming contests. It is written in Python and
consists of several modules. CMS supports the C/C++, Pascal, Python, PHP, and
Java programming languages. PostgreSQL is a single point of failure; all
modules depend heavily on the database connection. Task evaluation can only be
a three-step pipeline -- compilation, execution, evaluation. Execution is
performed in [Isolate](https://github.com/ioi/isolate), a sandbox written by
the consultant of our project, Mgr. Martin Mareš, Ph.D.
|
|
|
|
|
|
### MOE
|
|
|
|
|
|
[MOE](http://www.ucw.cz/moe/) is a grading system written in shell scripts, C
and Python. It does not provide a default GUI; all actions have to be performed
from the command line. The system does not evaluate submissions in real time --
results are computed in batch mode after the exercise deadline, using Isolate
for sandboxing. Parts of MOE are used in other systems such as CodEx or CMS,
but the system as a whole is obsolete.
|
|
|
|
|
|
### Kattis
|
|
|
|
|
|
[Kattis](http://www.kattis.com/) is another SaaS solution. It provides a clean
and functional web UI, but the rest of the application is rather simplistic. A
nice feature is the usage of a [standardized
format](http://www.problemarchive.org/wiki/index.php/Problem_Format) for
exercises. Kattis is primarily used by programming contest organizers, company
recruiters and also by some universities.
|
|
|
|
|
|
|
|
|
# Analysis
|
|
|
|
|
|
None of the existing projects we came across fulfills all the requested
features for the new system. There is no grading system which supports an
arbitrary-length evaluation pipeline, so we have to implement this feature
ourselves, cautiously treading through unexplored fields. Also, no existing
solution is extensible enough to be used as a base for the new system. After
considering all these facts, it is clear that a new system has to be written
from scratch. This implies that only a subset of all the features will be
implemented in the first version, with the others coming in the following
releases.
|
|
|
|
|
|
The gathered features are categorized by their priority for the whole system.
The highest priority is the core functionality similar to the current CodEx.
This is the baseline needed to be useful in a production environment, but the
new design allows easy further development. On top of that, most of the ideas
from the faculty staff belong to a second priority bucket, which will be
implemented as part of the project. The most complicated tasks from this
category are the advanced low-level evaluation configuration format, using
modern tools, connecting to university systems and merging the separate system
instances into a single one. Other tasks are scheduled for the releases
following a successful project defense. Namely, these are a high-level exercise
evaluation configuration with a user-friendly interface for common exercise
types, SIS integration (once some API is available from their side) and a
command-line submission tool. Plagiarism detection is not likely to be part of
any release in the near future unless someone else implements the detection
engine; the problem is too hard to be solved as part of this project.
|
|
|
|
|
|
We named the new project **ReCodEx -- ReCodEx Code Examiner**. The name should
allude to the old CodEx, but also reflect the new approach to solving its
issues. The **Re** part of the name stands for redesigned, rewritten, renewed,
or restarted.
|
|
|
|
|
|
At this point there is a clear idea of how the new system will be used and what
the major enhancements for future releases are. With this in mind, the overall
architecture can be sketched. Based on the previous research, several goals are
set for the new project. They mostly reflect the drawbacks of the current
version of CodEx and some reasonable wishes of the university users. The most
notable features are the following:
|
|
|
|
|
|
- a modern HTML5 web frontend written in JavaScript using a suitable framework
- a REST API implemented in PHP, communicating with the database, the
  evaluation backend and a file server
- an evaluation backend implemented as a distributed system on top of a message
  queue framework (ZeroMQ) with a master-worker architecture
- a multi-platform worker supporting Linux and Windows environments (the latter
  without a sandbox; no suitable general-purpose tool is available yet)
- an evaluation procedure configured in a YAML file, composed of small tasks
  connected into an arbitrary oriented acyclic graph
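To illustrate the last point, such an evaluation configuration might look
roughly like the following sketch. This is purely illustrative and is not the
actual ReCodEx configuration format; it only shows how small tasks can form an
oriented acyclic graph via explicit dependencies.

```yaml
tasks:
  - name: compile
    cmd: [gcc, -O2, -o, solution, solution.c]
  - name: run-test-1
    depends-on: [compile]
    cmd: [./solution]
    sandbox: {time: 1.0, memory: 65536}   # per-test limits
  - name: judge-test-1
    depends-on: [run-test-1]
    cmd: [judge, expected.1.out, actual.1.out]
```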
|
|
|
|
|
|
The reasons supporting these decisions are explained in the rest of the
analysis chapter. A lot of smaller design choices are also mentioned, including
the possible options, what was picked for the implementation and why. But
first, let us discuss the basic concepts of the system.
|
|
|
|
|
|
## Basic concepts
|
|
|
|
|
|
The system is designed as a web application. The requirements say that the user
interface must be accessible from students' computers without the need to
install additional software. This immediately implies that users have to be
connected to the internet, which is therefore used as the communication medium.
Today, there are two main ways of designing a graphical user interface -- as a
native application or as a web page. Creating a nice multi-platform application
with a graphical interface is almost impossible because of the large number of
different environments. Also, such applications often require installation, or
at least downloading their files (sources or binaries). On the other hand,
distributing a web application is easier, because every personal computer has
an internet browser installed. Also, browsers support a (mostly) unified and
standardized environment of HTML5 and JavaScript. CodEx is also a web
application and everybody seems satisfied with it. There are other
communication channels most programmers have available, such as e-mail or git,
but they are inappropriate for building user interfaces on top of them.
|
|
|
|
|
|
The application interacts with users. From the project assignment it is clear
that the system has to keep personalized data about its users and adapt the
presented content accordingly. User data cannot be publicly visible, which
implies the necessity of user authentication. The application also has to
support multiple ways of authentication (university authentication systems, a
company LDAP server, an OAuth server...) and permit adding more security
measures in the future, such as two-factor authentication.
|
|
|
|
|
|
User data also includes a privilege level. The assignment requires at least two
roles, _student_ and _supervisor_. However, it is wise to add an
_administrator_ level, which takes care of the system as a whole and is
responsible for core setup, monitoring, updates and so on. The student role has
the least power; students can basically just view their assignments and submit
solutions. Supervisors have more authority: they can create exercises and
assignments, view the results of their students, etc. Following the university
structure, one more level could be introduced, _course guarantor_. However, in
practice all duties related to teaching labs are already handled by
supervisors, so this role does not seem very useful. In addition, no one
requested more than a three-level privilege scheme.
|
|
|
|
|
|
School labs are lessons for groups of students led by supervisors. The students
get the same homework and the supervisors evaluate their solutions. This
organization has to be carried over into the new system. The counterpart of
real labs are virtual groups. This concept was already discussed in the
previous chapter, including the need for a hierarchical structure of groups.
Only persons who are students of the university and are recorded in the
university information system are entitled to attend labs. To allow restricting
group membership in ReCodEx, there are two types of groups -- _public_ and
_private_. Public groups are open to every registered user, but to become a
member of a private group, one of its supervisors has to add the user. This
could be done automatically at the beginning of the term with data from the
university information system, but unfortunately no such API exists yet.
However, creating this API is now being considered by the university
leadership. Another equally good solution for restricting membership of a group
is to allow anyone to join, subject to confirmation by a supervisor. Since it
has no additional benefits, the approach with public and private groups is
implemented.
|
|
|
|
|
|
Supervisors using CodEx in their labs usually set a minimum number of points
required to get the course credit. These points can be earned by solving
assigned exercises. To visually show users whether they already have enough
points, ReCodEx groups support setting this limit. There are two equivalent
ways to set the limit -- as an absolute value or as a value relative to the
maximum. The latter seems nicer, so it is the one implemented. The relative
value is expressed as a percentage and is called a threshold.
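The threshold check itself is then simple arithmetic, e.g. (an illustrative
sketch, not actual ReCodEx code):

```python
# Illustrative: with a 65 % threshold, a student has enough points once the
# earned points reach 65 % of the maximum achievable points in the group.
def meets_threshold(points_earned, points_max, threshold_percent):
    return points_earned >= points_max * threshold_percent / 100.0

print(meets_threshold(66, 100, 65))  # prints True
```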
|
|
|
|
|
|
Our university has a few partner grammar schools, and there was an idea that
they could use CodEx for teaching informatics classes. To make the setup simple
for them, all the software and hardware would be provided by the university and
hosted in its datacentre. However, CodEx was not prepared to support this kind
of usage and no one had time to manage a separate instance. With ReCodEx it is
possible to offer a hosted environment as a service to other subjects. The
concept we came up with is based on separating users and groups inside the
system. The system contains multiple _instances_, which serve as units of
separation. Each instance has its own set of users and groups; exercises can
optionally be shared. The evaluation backend is common to all instances. To
keep track of active instances and paying customers, each instance must have a
valid _license_ that allows its users to submit their solutions. A license is
granted for a defined period of time and can be revoked in advance if the
subject does not comply with the agreed terms and conditions.
|
|
|
|
|
|
The main job of the system is to evaluate programming exercises. An exercise is
quite similar to a homework assignment given during school labs. When a
homework is assigned, two things are important for the users to know:

- a description of the problem
- metadata -- when and to whom to submit solutions, the grading scale,
  penalties, etc.

To reflect this idea, which teachers and students are already familiar with, we
decided to keep the separation between the problem itself (the _exercise_) and
its _assignment_. An exercise describes a single problem and provides testing
data together with a description of how to evaluate it. In fact, it is a
template for assignments. An assignment then contains the data from its
exercise plus additional metadata, which can be different for every assignment
of the same exercise. This separation is natural for all users; CodEx
implements it in a similar way and no other considerable solution was found.
|
|
|
|
|
|
### Forgotten password

Authentication and any handling of passwords bring a related problem: forgotten
credentials, especially passwords. People forget them easily, so there has to
be some mechanism to retrieve a new password or change the old one. The problem
is that this cannot be done in a totally secure way, but we can at least come
quite close. First, there are absolutely insecure and unrecommendable
approaches, such as sending the old password through email. A better, but still
insecure, solution is to generate a new password and send it through email.
This solution was used in CodEx: users had to write an email to the
administrator, who generated a new password and sent it back to the sender.
This simple procedure could also be automated, but it gave the administrator
quite a big control over the whole process. That might come in handy if some
additional checks were needed, but on the other hand it can be quite time
consuming.

Probably the best solution, which is commonly used and fairly secure, is the
following. Let us consider only the case in which all users have to fill their
email addresses into the system and these addresses are safely in the hands of
the right users. When users find out that they do not remember their password,
they request a password reset and fill in their unique identifier, which might
be an email address or a unique nickname. Based on the matched user account,
the system generates a unique access token and sends it to the user via email.
This token should be time limited and usable only once, so that it cannot be
misused. The user then takes the token, or the URL provided in the email, and
goes to the appropriate section of the system, where a new password can be set.
After that the user can sign in with the new password. As previously stated,
this solution is quite safe and users can handle it on their own, so the
administrator does not have to worry about it. That is the main reason why this
approach was chosen.

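The reset flow described above can be sketched in a few lines; the function
names, the in-memory token store and the 15-minute lifetime are illustrative
assumptions, not details of the actual ReCodEx implementation.

```python
import secrets
import time

# Illustrative in-memory token store; a real system would persist this.
TOKEN_TTL = 15 * 60  # tokens expire after 15 minutes
_tokens = {}  # token -> (user_id, time of issue)

def request_password_reset(user_id):
    """Generate a unique, time-limited, single-use reset token."""
    token = secrets.token_urlsafe(32)
    _tokens[token] = (user_id, time.time())
    return token  # in practice delivered to the user's email address

def redeem_token(token, new_password, set_password):
    """Validate the token and set the new password at most once."""
    entry = _tokens.pop(token, None)  # popping makes the token single-use
    if entry is None:
        return False  # unknown or already used token
    user_id, issued_at = entry
    if time.time() - issued_at > TOKEN_TTL:
        return False  # expired token
    set_password(user_id, new_password)
    return True
```

Popping the token on first use and checking the issue time cover the two
required properties -- single use and limited lifetime.
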
### Evaluation unit executed by ReCodEx

One of the bigger requests for the new system is support for a complex
configuration of the execution pipeline. The idea comes from the lecturers of
the Compiler Principles class, who want to migrate their semi-manual evaluation
process to CodEx. Unfortunately, CodEx is not capable of such a complicated
exercise setup. None of the evaluation systems we found can handle such a task,
so a design from scratch is needed.

There are two main approaches to designing a complex execution configuration.
It can be composed either of a small number of relatively big components, or of
many small tasks. Big components are easy to write and the whole configuration
is reasonably small. However, such components are designed for the current
problems, so this approach does not scale well for future usage. This can be
solved by introducing a small set of single-purpose tasks which can be composed
together. The whole configuration is then considerably bigger, but it adapts
much better to new conditions and programming the tasks takes less work. For a
better user experience, configuration generators for some common cases can be
introduced.

ReCodEx is meant to be continuously developed and used for many years, so the
smaller tasks are the right choice. Observation of the CodEx system shows that
only a few tasks are needed. In the extreme case, only one task is enough --
execute a binary. However, for better portability of configurations across
different systems it is better to implement a reasonable subset of operations
directly, without calling system-provided binaries. These operations -- copy a
file, create a new directory, extract an archive and so on -- are altogether
called internal tasks. Another benefit of implementing these tasks ourselves is
guaranteed safety, so no sandbox needs to be used as in the case of external
tasks.

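The set of internal tasks can be imagined as a small dispatch table of
operations implemented directly by the worker; the task names and the registry
below are made up for illustration.

```python
import shutil
from pathlib import Path

# Hypothetical registry of internal tasks -- operations implemented
# directly by the worker, so they run safely without a sandbox.
INTERNAL_TASKS = {
    "cp": lambda src, dst: shutil.copyfile(src, dst),
    "mkdir": lambda path: Path(path).mkdir(parents=True, exist_ok=True),
    "extract": lambda archive, dst: shutil.unpack_archive(archive, dst),
}

def run_internal(name, *args):
    """Dispatch an internal task by name; unknown names are job errors."""
    if name not in INTERNAL_TASKS:
        raise ValueError(f"unknown internal task: {name}")
    INTERNAL_TASKS[name](*args)
```

External tasks, in contrast, would launch an arbitrary binary inside a sandbox
instead of calling a trusted built-in operation.
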
For a job evaluation, the tasks need to be executed sequentially in a specified
order. Running independent tasks in parallel is a bad idea, because exact time
measurement needs a controlled environment on the target computer where
interruptions by other processes are minimized. It seems that connecting the
tasks into a directed acyclic graph (DAG) can handle all possible problem
cases; none of the authors, supervisors and involved faculty staff could think
of a problem that cannot be decomposed into tasks connected in a DAG. The goal
of the evaluation is to satisfy as many tasks as possible. During execution
there are sometimes multiple choices for the next task. To control this, each
task can have a priority, which is used as a secondary ordering criterion. For
better understanding, here is a small example.

![Task serialization](https://github.com/ReCodEx/wiki/raw/master/images/Assignment_overview.png)

The _job root_ task is an imaginary single starting point of each job. When the
_CompileA_ task is finished, the _RunAA_ task is started (or _RunAB_, but the
choice should be deterministic and given by the position in the configuration
file -- tasks stated earlier should be executed earlier). The task priorities
guarantee that after the _CompileA_ task, all tasks depending on it are
executed before the _CompileB_ task (they have a higher priority number). This
is useful, for example, to control which files are present in the working
directory at any moment. To sum up, there are three ordering criteria:
dependencies first, then priorities and finally the position of the task in the
configuration. Together, they define an unambiguous linear ordering of all
tasks.

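The three ordering criteria can be sketched as a deterministic topological
sort; the data format (a mapping of task names to a priority and a dependency
list) is invented for the example and does not mirror the real configuration.

```python
import heapq

def linearize(tasks):
    """Produce the unambiguous linear ordering of DAG tasks.

    `tasks` maps a task name to (priority, dependencies); the position
    of a task is its index in the (insertion-ordered) dict.  Among the
    ready tasks, higher priority wins, then earlier position.
    """
    position = {name: i for i, name in enumerate(tasks)}
    remaining = {name: set(deps) for name, (_, deps) in tasks.items()}
    ready = [(-prio, position[name], name)
             for name, (prio, deps) in tasks.items() if not deps]
    heapq.heapify(ready)
    order = []
    while ready:
        _, _, name = heapq.heappop(ready)
        order.append(name)
        for other, deps in remaining.items():
            if name in deps:  # resolve the dependency on the finished task
                deps.remove(name)
                if not deps:
                    prio, _ = tasks[other]
                    heapq.heappush(ready, (-prio, position[other], other))
    return order

# A job shaped like the figure: the A branch has a higher priority
# number, so all tasks depending on CompileA run before CompileB.
job = {
    "CompileA": (2, []),
    "RunAA": (2, ["CompileA"]),
    "RunAB": (2, ["CompileA"]),
    "CompileB": (1, []),
    "RunBA": (1, ["CompileB"]),
}
```

Calling `linearize(job)` yields CompileA, RunAA, RunAB, CompileB, RunBA, which
matches the ordering described above.
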
For grading, several kinds of tasks are important. First, the tasks executing
the submitted code need to be checked for time and memory limits. Second, the
outputs of judging tasks need to be checked for correctness (represented by the
return value or by data on the standard output) and these tasks should not fail
on time or memory limits. This division can be transparent for the backend;
each task is executed the same way. But the frontend must know which tasks of
the whole job are important and of what kind they are. It is reasonable to keep
this piece of information alongside the tasks in the job configuration, so each
task can carry a label describing its purpose. Unlabeled tasks have the
internal type _inner_. There are four categories of tasks:

- _initiation_ -- setting up the environment, compiling code, etc.; for users,
  a failure means an error in their sources which prevents running them with
  the examination data
- _execution_ -- running the user code with the examination data; it must not
  exceed the time and memory limits and for users a failure means a wrong
  design, slow data structures, etc.
- _evaluation_ -- comparing the user and examination outputs; for users, a
  failure means that the program does not compute the right results
- _inner_ -- no special meaning for the frontend; technical tasks for fetching
  and copying files, creating directories, etc.

Each job is composed of multiple tasks of these types, which are semantically
grouped into tests. A test can represent one set of examination data for the
user code. To mark the grouping, another task label can be used. Each test must
have exactly one _evaluation_ task (to show success or failure to users) and an
arbitrary number of tasks of the other types.

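The constraint on tests can be expressed as a simple validation over the task
list; the dictionary keys below are illustrative and do not mirror the real
configuration format.

```python
def check_tests(tasks):
    """Return the tests that violate the one-evaluation-task rule.

    Each task is a dict with a `type` label and an optional `test`
    label grouping it with other tasks of the same test.
    """
    evaluations = {}
    for task in tasks:
        test = task.get("test")
        if test is not None:
            count = evaluations.setdefault(test, 0)
            if task["type"] == "evaluation":
                evaluations[test] = count + 1
    return [test for test, n in evaluations.items() if n != 1]

# Example job: test "A" is well formed, test "B" lacks an evaluation task.
job = [
    {"type": "initiation", "test": "A"},
    {"type": "execution", "test": "A"},
    {"type": "evaluation", "test": "A"},
    {"type": "execution", "test": "B"},
    {"type": "inner"},  # unlabeled technical task, not part of any test
]
```

Here `check_tests(job)` reports test "B" as malformed, while the unlabeled
_inner_ task is ignored by the check.
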
### Evaluation progress state

Users surely want to know the progress state of their submitted solution; this
kind of functionality comes in particularly handy for exercises with long
evaluation times. Thanks to progress reporting, users know immediately if
anything goes wrong, not to mention the psychological effect of seeing that the
whole system and its parts are working and doing something. That is why this
feature was considered from the beginning, but there are multiple ways to
approach it.

The very first idea would be to report progress based on "done" messages from
compilation, execution and evaluation, which is what a lot of evaluation
systems provide. This information is high level enough for users, who will
probably understand what is being executed at the moment. If the compilation
fails, users know that their solution does not compile; if the execution fails,
there were some problems with their program. The clarity of this kind of
progress state is nice and understandable. But as we have learnt, ReCodEx has
to have a more advanced execution pipeline, in which there can be multiple
compilations or multiple executions. In addition, the parts of the system which
carry out the execution of user solutions do not have to know precisely what
they are executing at the moment, so this kind of information may be
meaningless for them.

That is why another form of progress state was considered. As we already know,
one of the best ways to ensure generality is to have jobs composed of
single-purpose tasks. A task can be anything, an internal operation or the
execution of an external, sandboxed program. Based on this, there is one very
simple way to provide a general progress state which is independent of the task
types. We know how many tasks a job contains, so we can send a state update
after the execution of every task, which gives us the percentage of completion
of the execution. Admittedly this is a rather standard solution, but something
more appealing to users can be built on top of it.

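The per-task progress reporting can be sketched as follows; the message format
and the `send` callback stand in for the real channel between the worker and
the monitor and are assumptions made for the example.

```python
class ProgressReporter:
    """Emit a percentage update after every executed task of a job.

    The `send` callback stands in for the channel (e.g. a message to
    the monitor service) that delivers updates to the web application.
    """

    def __init__(self, job_id, total_tasks, send):
        self.job_id = job_id
        self.total = total_tasks
        self.done = 0
        self.send = send

    def task_completed(self):
        """Called by the worker after each finished task."""
        self.done += 1
        percent = round(100 * self.done / self.total)
        self.send({"job": self.job_id, "progress": percent})
```

For a job of four tasks, this emits updates of 25, 50, 75 and 100 percent,
regardless of what the individual tasks actually do.
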
Progress can be displayed to users in numerous ways. The percentage of
completion of course begs for the simple solution of displaying the number
itself or a standard graphical progress bar, but that felt too ordinary, so we
looked for something else. A very good idea is to have a puzzle-like image, or
several images, composed together piece by piece as the evaluation progresses
-- a nice approach, but rather challenging without a designer around. Another
original solution is to have a database of random, kind-of-funny statements,
one of which is displayed every time a task is completed. It is easy to
implement, making up the messages is easy as well, and it is quite novel and
original. That is why this last solution was chosen for displaying the progress
state.

### Results of evaluation

There are a lot of things concerning the results of evaluation which deserve
discussion: how the results should be displayed, what should or should not be
visible, and what kind of reward for user solutions should be chosen.

Let us first focus on all kinds of outputs of the programs executed within a
job. It is beyond discussion that supervisors should be able to view almost all
outputs of solutions, if they choose them to be visible and recorded; this
feature is critical for debugging both whole exercises and user solutions. But
should recording every output be the default behaviour? Absolutely not. The
supervisor should have the choice to turn it on, but discarding the outputs has
to be the default option. Even without this functionality the file storage
behind the whole ReCodEx system can become quite large, and on top of that the
outputs of executed programs can sometimes be very extensive. Storing this
amount of data is inefficient and unnecessary for most of the solutions.
However, at the supervisor's request this feature should be available.

A more interesting question is what regular users should see from the execution
of their solution. The simple answer is of course that they should not see
anything, which is partly true. The outputs of their programs can be anything,
and users could use them to analyze the inputs or even redirect the inputs to
the output. So the outputs of execution tasks should not be visible at all, or
only under very special circumstances. The situation is not so straightforward
for compilation and other kinds of initiation, where it really depends on the
particular case. Generally it is quite harmless to display compilation errors
to users, which can help a lot during troubleshooting. Of course, this kind of
functionality should again be configurable by supervisors and disabled by
default. The last kind of tasks which can produce some output are the
evaluation tasks. Their output is important to the whole system and, again, can
contain information about the inputs or the reference outputs, so the outputs
of evaluation tasks should not be visible to regular users either.

The overall concept of grading solutions was presented earlier. To briefly
recap, the backend returns only the exact measured values (used time and
memory, the return code of the judging task, ...) and a single score is
computed on top of them. The way of computing it can differ greatly between
supervisors, so it has to be easily extensible. The best way is to provide an
interface which can be implemented, with any sort of magic behind it returning
the final value.

We identified several computational possibilities: basic arithmetic, weighted
arithmetic, geometric or harmonic mean of the results of the individual tests
(a result being a boolean succeeded/failed value, optionally with a weight),
some kind of interpolation of the amount of time used in each test, the same
for the amount of memory used, and surely many others. To keep the project
simple, we decided to design an appropriate interface and implement only the
weighted arithmetic mean computation, which is used in about 90% of all
assignments. Of course, a different scheme can be chosen for every assignment,
and each scheme can also be configured -- for example by specifying the test
weights for the implemented weighted arithmetic mean. More advanced ways of
computation can be implemented later, when there is a real demand for them.

To avoid awarding points to insufficient solutions (such as one that only
prints "File error", which happens to be the valid answer in two tests), a
minimal point threshold can be specified. If a solution is to receive fewer
points than the threshold, it receives zero points instead. This functionality
could be embedded into the grading computation algorithm itself, but then it
would have to be present in each implementation separately, which is a bit
ugly, so this feature is separated from the point computation.

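A minimal sketch of the chosen scheme, with the threshold deliberately applied
outside the scoring function; the function names and values are illustrative,
not the actual ReCodEx interface.

```python
def weighted_mean_score(test_results, weights):
    """Weighted arithmetic mean of boolean test results, in [0, 1]."""
    total = sum(weights[name] for name in test_results)
    passed = sum(weights[name] for name, ok in test_results.items() if ok)
    return passed / total

def award_points(score, max_points, threshold):
    """Scale the score to points and apply the minimal point threshold.

    Keeping the threshold out of the scoring function means that every
    scoring scheme implementing the interface gets it for free.
    """
    points = round(score * max_points)
    return points if points >= threshold else 0
```

For example, with test weights 1, 2 and 1 and only the two weight-1 tests
passing, the score is 0.5 and a 10-point assignment yields 5 points; a score
below the threshold yields 0.
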
Automatic grading cannot reflect all aspects of submitted code, for example its
structure or the number and quality of comments. To allow supervisors to bring
these manually checked aspects into the grading, there is a concept of bonus
points, which can be positive or negative. Generally, the solution with the
most points is the one used for grading the particular assignment. However, if
the supervisor is not satisfied with a student solution (really bad code,
cheating, ...), he or she assigns negative bonus points to it. To prevent this
decision from being overridden by the system choosing another solution with
more points, or by the student submitting the same code again so that it
evaluates to more points, the supervisor can mark a particular solution as the
one to be used for grading instead of the solution with the most points.

### Persistence

The previous parts of the analysis show that the system has to keep some state:
user settings, group memberships, evaluated assignments and so on. The data
have to survive restarts, so persistence is an important decision factor. There
are several ways to save structured data:

- plain files
- NoSQL database
- relational database

Another important factor is the amount and size of the stored data. Our
estimate is about 1000 users, 100 exercises, 200 assignments per year and
400,000 unique solutions per year. The data are mostly structured, with many
records sharing the same format; for example, there is a thousand users and
each of them has the same attributes -- name, email, age, etc. These kinds of
data are relatively small: the name and email are short strings, the age is an
integer. Considering this, relational databases or formatted plain files (CSV,
for example) fit them best. However, the data often have to support a find
operation, so they have to be sorted and allow random access for resolving
cross references; also, the addition and deletion of entries should take
reasonable time (at most logarithmic complexity in the number of saved values).
This practically excludes plain files, so a relational database is used
instead.

On the other hand, there are data with far less structure and much larger size,
such as evaluation logs, sample input files of exercises or the sources
submitted by students. Saving this kind of data into a relational database is
not suitable; it is better to keep them as ordinary files or to store them in
some kind of NoSQL database. Since they already are files and do not need to be
backed up in multiple copies, it is easier to keep them as ordinary files in
the filesystem. Also, this solution is more lightweight and does not require
additional dependencies on third-party software. A file can be identified
either by its filesystem path or by a unique index stored as a value in the
relational database. Both approaches are equally good; the final decision
depends on the actual case.


## Structure of the project

There are numerous ways to divide a system into separate services, ranging from
one single component to many single-purpose components. Having only one big
service is not feasible: it is not scalable enough, and mainly it would be one
big blob of very complex code which somehow works, so this is not the way. The
exact opposite, having a lot of single-purpose components, is also somewhat
impractical. It is scalable by default and all the services would have quite
simple code, but on the other hand the communication requirements of such a
solution would be insane. So an approach somewhere in the middle has to be
chosen: the services have to communicate in a manner which will not bring the
network down, the code base should be reasonable and the whole system has to be
scalable enough. With this said, we can discuss the particular division for the
ReCodEx system.

The ReCodEx project is divided into two logical parts -- the *backend* and the
*frontend* -- which interact with each other and which cover the whole area of
code examination. Both of these logical parts are independent of each other in
the sense that they can be installed on separate machines at different
locations, and that either part can be replaced with a different
implementation; as long as the communication protocols are preserved, the
system will continue working as expected.

The *backend* is the part responsible solely for the process of evaluating a
solution of an exercise. Each evaluation of a solution is referred to as a
*job*. For each job, the system expects a configuration document of the job,
supplementary files for the exercise (e.g., test inputs, expected outputs,
predefined header files), and the solution of the exercise (typically source
code created by a student). There might be some specific requirements for the
job, such as a specific runtime environment, a specific version of a compiler,
or that the job must be evaluated on a processor with a specific number of
cores. The backend infrastructure decides whether it will accept or decline a
job based on the specified requirements. If it accepts the job, the job is
placed in a queue and processed as soon as possible. The backend publishes the
progress of processing of the queued jobs, and the results of the evaluations
can be queried after the job processing is finished. The backend produces a log
of the evaluation and scores the solution based on the job configuration
document.

From the scalability point of view there are two necessary components: one
which executes jobs and one which distributes jobs to instances of the first
one. This ensures scalability in the sense of parallel execution of numerous
jobs, which is exactly what is needed. The implementations of these services
are called the **broker** and the **worker**; the first one handles
distribution, the latter execution. These two components would be enough to
fulfill everything said above, but for the sake of simplicity and better
communication gateways with the frontend, two other components were added: the
**fileserver** and the **monitor**. The fileserver is a simple component whose
purpose is to store the files exchanged between the frontend and the backend.
The monitor is also a quite simple service which relays the job progress state
from the workers to the web application. These two additional services sit on
the edge between the frontend and the backend (like gateways), but logically
they are more connected with the backend, so they are considered to belong
there.

The *frontend*, on the other hand, is responsible for the communication with
the users and provides them convenient access to the backend infrastructure.
The frontend manages user accounts and gathers them into units called groups.
There is a database of exercises which can be assigned to the groups, and the
users of these groups can submit their solutions for these assignments. The
frontend initiates the evaluation of these solutions by the backend and stores
the results afterwards. The results are visible to authorized users and are
awarded points according to the score given by the backend in the evaluation
process. The supervisors of the groups can edit the parameters of the
assignments, review the solutions and their evaluations in detail, award the
solutions with bonus points (both positive and negative) and discuss the
solution with its author. Some of the users can be entitled to create new
exercises and thus extend the database of exercises which can later be assigned
to the groups.

The frontend has two main purposes: holding the state of the whole system (the
database of users, exercises, solutions, points, etc.) and presenting the state
to users through some kind of user interface (e.g., a web application, a mobile
application, or a command-line tool). Following contemporary trends in the
development of frontend parts of applications, we decided to split the frontend
into two logical parts -- a server side and a client side. The server side is
responsible for managing the state and the client side gives instructions to
the server side based on the inputs from the user. This decoupling gives us the
ability to create multiple client-side tools which may address different needs
of the users.

The frontend developed as part of this project is a web application created
with the needs of the Faculty of Mathematics and Physics of Charles University
in Prague in mind. The users are the students and their teachers, the groups
correspond to the different courses and the teachers are the supervisors of
these groups. We believe that this model is applicable to the needs of other
universities, schools and IT companies, which can use the same system for their
purposes. It is also possible for them to develop their own frontend with their
own user management system for their specific needs and use the possibilities
of the backend without any changes, as mentioned in the previous paragraphs.

One possible configuration of the ReCodEx system is illustrated in the
following picture: there is one shared backend with three workers and two
separate instances of the whole frontend. This configuration may be suitable
for MFF UK -- the basic programming course and the KSP competition. However,
sharing the web API and the fileserver with only custom instances of the client
(the web application or an own implementation) is perhaps even more likely to
be used. Note that the connections between the components are not entirely
accurate.

![Overall architecture](https://github.com/ReCodEx/wiki/blob/master/images/Overall_Architecture.png)
In the latter parts of the documentation, the backend and the frontend will
each be introduced separately and covered in more detail. The communication
protocol between these two logical parts will be described as well.


## Implementation analysis

When developing a project like ReCodEx, there has to be some discussion about
implementation details and about how to solve particular problems properly.
This discussion is a never-ending story which goes on through the whole
development process. Some of the most important implementation problems and
interesting observations are discussed in this chapter.

### General communication

The overall design of the project is discussed above: there is a bunch of
components, each with its own responsibility. An important thing to design is
the communication between these components. All we can count on is that they
are connected by a network.

To choose a suitable protocol, some additional requirements should be met:

- reliability -- if a message is sent between components, the protocol has to
  ensure that it is received by the target component
- working on top of the IP protocol
- multi-platform and multi-language usage

The TCP/IP protocol meets these conditions; however, it is quite low level and
working with it usually means using a platform-dependent, non-object API. The
usual way to address these drawbacks is to use a framework which provides a
better abstraction and a more suitable API. We decided to go this way, so the
following options were considered:

- CORBA -- a well-known framework for remote object invocation. There are
  multiple implementations for almost every known programming language and it
  fits nicely into an object-oriented programming environment.
- RabbitMQ -- a messaging framework written in Erlang. It has bindings for a
  huge number of languages and a large community. It is also capable of routing
  requests, which could be a handy feature for job load balancing.
- ZeroMQ -- another messaging framework, but instead of a separate service it
  is a small library which can be embedded into one's own projects. It is
  written in C++ and has a huge number of bindings.

We like CORBA, but our system should be more loosely coupled, so (asynchronous)
messaging is in our minds the better approach. RabbitMQ seems nice, with the
great advantage of its routing capability, but it is a quite heavy service
written in a language no one in the team knows, so we do not like it much.
ZeroMQ is the best option for us; however, any of the three options would have
been usable.

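To illustrate the embedded-library style of ZeroMQ, the following sketch sets
up a minimal request-reply exchange using the `pyzmq` binding; the socket types
and the in-process endpoint are chosen for the example only and do not reflect
the actual ReCodEx sockets.

```python
import zmq

context = zmq.Context()

# "Broker" side: a REP socket bound to an in-process endpoint.
server = context.socket(zmq.REP)
server.bind("inproc://demo")

# "Worker" side: a REQ socket connected to the same endpoint.
client = context.socket(zmq.REQ)
client.connect("inproc://demo")

client.send_string("init")   # worker announces itself
print(server.recv_string())  # broker receives "init"
server.send_string("ack")    # broker confirms
print(client.recv_string())  # worker receives "ack"
```

No standalone message server runs here; both endpoints live in the same
process, which is exactly what distinguishes ZeroMQ from RabbitMQ.
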
The frontend communication follows from the choice that ReCodEx should
primarily be a web application, so the communication protocol has to reflect
the client-server architecture. There are several options:

- *TCP sockets* -- TCP sockets provide a reliable means of full-duplex
  communication. All major operating systems support this protocol and there
  are libraries which simplify the implementation. On the other hand, it is not
  possible to initiate a TCP socket from a web browser.
- *WebSockets* -- The WebSocket standard is built on top of TCP. It enables a
  web browser to connect to a server over a TCP socket. WebSockets are
  implemented in recent versions of all modern web browsers and there are
  libraries for several programming languages like Python or JavaScript
  (running in Node.js). Encryption of the communication over a WebSocket is
  supported as a standard.
- *HTTP protocol* -- The HTTP protocol is a stateless protocol implemented on
  top of the TCP protocol. The communication between the client and the server
  consists of requests sent by the client and responses to these requests sent
  back by the server. The client can send as many requests as needed and it may
  ignore the responses from the server, but the server must respond only to the
  requests of the client and cannot initiate communication on its own.
  End-to-end encryption can be achieved easily using SSL (HTTPS).

We chose the HTTP(S) protocol because of its simple implementation in all sorts
of operating systems and runtime environments on both the client and the server
side.

The API of the server should expose basic CRUD (Create, Read, Update, Delete)
operations. There are some options for what kind of messages to send over HTTP:

- SOAP -- a protocol for exchanging XML messages; it is very robust and
  complex.
- REST -- a stateless architectural style, not a protocol or a technology. It
  typically relies on HTTP (though not necessarily) and its method verbs (e.g.,
  GET, POST, PUT, DELETE). It can fully implement the CRUD operations.

Even though there are some other technologies, we chose the REST style over the
HTTP protocol. It is widely used, there are many tools available for
development and testing, and it is well understood by programmers, so it should
be easy for a new developer with some experience in client-side applications to
get to know the ReCodEx API and develop a client application.

To sum up, the chosen means of communication inside the ReCodEx system are
captured in the following image. Red connections are made through ZeroMQ
sockets, blue ones through WebSockets and green ones through HTTP(S).

![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)
### Broker

The broker is responsible for keeping track of the available workers and for
distributing the jobs it receives from the frontend among them.

#### Worker management

The broker is intended to be a fixed part of the backend infrastructure to
which the workers connect at will. Thanks to this design, workers can be added
and removed when necessary (possibly in an automated fashion) without changing
the configuration of the broker. An alternative solution would be to configure
a list of workers before startup, making them passive in the communication (in
the sense that they just wait for incoming jobs instead of connecting to the
broker). However, this approach comes with a notable administration overhead --
in addition to starting a worker, the administrator would have to update the
worker list.

Worker management must also take into account the possibility of worker
|
|
|
disconnection, either because of a network or software failure (or termination).
|
|
|
A common way to detect such events in distributed systems is to periodically
|
|
|
send short messages to other nodes and expect a response. When these messages
|
|
|
stop arriving, we presume that the other node encountered a failure. Both the
|
|
|
broker and workers can be made responsible for initiating these exchanges and it
|
|
|
seems that there are no differences stemming from this choice. We decided that
|
|
|
the workers will be the active party that initiates the exchange.
|
|
|
|
|
|
#### Scheduling
|
|
|
|
|
|
Jobs should be scheduled in a way that ensures that they will be processed
|
|
|
without unnecessary waiting. This depends on the fairness of the scheduling
|
|
|
algorithm (no worker machine should be overloaded).
|
|
|
|
|
|
The design of such scheduling algorithm is complicated by the requirements on
|
|
|
the diversity of workers -- they can differ in operating systems, available
|
|
|
software, computing power and many other aspects.
|
|
|
|
|
|
We decided to keep the details of connected workers hidden from the frontend,
|
|
|
which should lead to a better separation of responsibilities and flexibility.
|
|
|
Therefore, the frontend needs a way of communicating its requirements on the
|
|
|
machine that processes a job without knowing anything about the available
|
|
|
workers. A key-value structure is suitable for representing such requirements.
|
|
|
|
|
|
With respect to these constraints, and because the analysis and design of a more
|
|
|
sophisticated solution was declared out of scope of our project assignment, a
|
|
|
rather simple scheduling algorithm was chosen. The broker shall maintain a queue
|
|
|
of available workers. When assigning a job, it traverses this queue and chooses
|
|
|
the first machine that matches the requirements of the job. This machine is then
|
|
|
moved to the end of the queue.
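The queue traversal described above can be sketched as follows. This is an illustrative Python model (the actual broker is written in C++); the worker records and header names are made up for the example:

```python
# Round-robin scheduling sketch: pick the first worker whose headers
# satisfy all job requirements and rotate it to the back of the queue.
from collections import deque

def assign_job(workers: deque, requirements: dict):
    """Return the first matching worker, moved to the end of the queue."""
    for i, worker in enumerate(workers):
        headers = worker["headers"]
        if all(headers.get(k) == v for k, v in requirements.items()):
            del workers[i]          # remove the worker from its position...
            workers.append(worker)  # ...and move it to the end of the queue
            return worker
    return None  # no available worker matches the requirements

queue = deque([
    {"id": "w1", "headers": {"env": "c"}},
    {"id": "w2", "headers": {"env": "java"}},
    {"id": "w3", "headers": {"env": "java"}},
])
assert assign_job(queue, {"env": "java"})["id"] == "w2"
assert [w["id"] for w in queue] == ["w1", "w3", "w2"]
```

Note that only the chosen worker changes position; workers that do not match keep their place in the queue.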
|
|
|
|
|
|
The presented algorithm results in a simple round-robin load balancing
strategy, which should be sufficient for small-scale deployments (such as a
single university). However, with a large number of jobs, some workers could
easily become overloaded. The implementation must allow for a simple
replacement of the load balancing strategy so that this problem can be solved
in the near future.
|
|
|
|
|
|
#### Forwarding jobs
|
|
|
|
|
|
Information about a job can be divided into two disjoint parts -- what the
worker needs to know to process it and what the broker needs in order to
forward it to the correct worker. It remains to be decided how this information
will be transferred to its destination.
|
|
|
|
|
|
It is technically possible to transfer all the data required by the worker at
|
|
|
once through the broker. This package could contain submitted files, test
|
|
|
data, requirements on the worker, etc. A drawback of this solution is that
|
|
|
both submitted files and test data can be rather large. Furthermore, it is
|
|
|
likely that test data would be transferred many times.
|
|
|
|
|
|
Because of these facts, we decided to store data required by the worker using a
|
|
|
shared storage space and only send a link to this data through the broker. This
|
|
|
approach leads to a more efficient network and resource utilization (the broker
|
|
|
doesn't have to process data that it doesn't need), but also makes the job
|
|
|
submission flow more complicated.
|
|
|
|
|
|
#### Further requirements
|
|
|
|
|
|
The broker can be viewed as a central point of the backend. While it has only
|
|
|
two primary, closely related responsibilities, other requirements have arisen
|
|
|
(forwarding messages about job evaluation progress back to the frontend) and
|
|
|
will arise in the future. To facilitate such requirements, its architecture
|
|
|
should allow simply adding new communication flows. It should also be as
|
|
|
asynchronous as possible to enable efficient communication with external
|
|
|
services, for example via HTTP.
|
|
|
|
|
|
### Worker
|
|
|
|
|
|
The worker is the component responsible for executing jobs received from the
broker. As such, it should support a wide range of infrastructures and ideally
multiple platforms and operating systems; support for at least the two main
operating systems is desirable and should be implemented. The worker as a
service does not have to be overly complicated, but some complex behaviour is
needed. This complexity stems almost exclusively from the need for robust
communication with the broker, which has to be checked regularly. A ping
mechanism is commonly used for this purpose in all kinds of projects. It means
that the worker must be able to send ping messages even during job execution,
so the worker has to be divided into two separate parts: one that handles
communication with the broker and another that executes jobs. The easiest
solution is to run these parts in separate threads that communicate tightly
with each other. Numerous technologies can be used for such communication, from
shared memory to condition variables or some kind of in-process messaging. The
already used ZeroMQ library can provide in-process messages working on the same
principles as network communication, which is quite handy and also solves the
problems of thread synchronization.
|
|
|
|
|
|
At this point, the worker consists of two internal parts: a listening one and
an executing one. The implementation of the former is quite straightforward, so
let us discuss what happens in the execution subsystem. Jobs as units of work
can vary greatly and do completely different things, which means that the
configuration format and the worker have to be prepared for this kind of
generality. The configuration format was already discussed above, and its
handling in the worker is also quite straightforward: the worker loads the
metadata given in the configuration into internal structures. The whole job is
mapped to a job metadata structure and its tasks are mapped to either external
or internal tasks (internal commands have to be defined within the worker),
depending on whether they are executed in a sandbox or as internal worker
commands.
|
|
|
|
|
|
Tasks are further divided by the task-type field in the configuration, which
can have four values: initiation, execution, evaluation and inner. All of these
were described above in the configuration analysis. What is important to the
worker is how to behave if the execution of a task with a particular type
fails. There are two possible situations: execution fails due to a bad user
solution, or due to some internal error. If execution fails because of an
internal error, the solution cannot be declared as failed overall -- the user
should not be punished for a bad configuration or a network error. This is
where task types are useful. Generally, initiation, execution and evaluation
tasks somehow execute code submitted by the user; if a task of one of these
kinds fails, the failure is probably connected with a bad user solution and the
job can be evaluated. But if an inner task fails, the solution should be
re-executed, ideally on a different worker. That is why a job whose inner task
fails is sent back to the broker, which will reassign it to another worker.
This subject is discussed further in the section on broker assignment
algorithms.
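The failure-handling policy can be summarized by a small decision function. This is a sketch with illustrative return values; the real worker is written in C++ and uses its own result codes:

```python
# Sketch of the policy described above: an "inner" task failure indicates
# an internal error, so the job goes back to the broker; failures of the
# other task types are attributed to the submitted solution.
def on_task_failure(task_type: str) -> str:
    """Decide the fate of a job when a task of the given type fails."""
    if task_type == "inner":
        # internal error (e.g. network): do not blame the user,
        # send the job back to the broker for reassignment
        return "reassign"
    if task_type in ("initiation", "execution", "evaluation"):
        # these tasks run user-submitted code, so the failure is
        # attributed to the solution and the job can be evaluated
        return "evaluate"
    raise ValueError(f"unknown task type: {task_type}")
```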
|
|
|
|
|
|
There is also the question of the working directory, or directories, of a job:
which directories should be used and what for. A simple answer would be that
every job gets one directory containing every file the worker works with during
the whole job execution. This is of course insufficient; there has to be some
logical division. The minimum is two folders, one for internal temporary files
and one for the evaluation. The directory for temporary files is enough to
cover all kinds of internal filesystem work, but a single directory for the
whole evaluation is not. User solutions are downloaded as ZIP archives, so why
should these archives be present during execution, and why should the results
and the files to be uploaded back to the fileserver be cherry-picked from one
big directory? The answer is, of course, another logical division into
subfolders. The solution that was eventually chosen is to have separate folders
for the downloaded archive, the decompressed solution, the evaluation directory
in which the user solution is executed, the temporary files, and the results
and other files that should be uploaded back to the fileserver with the
solution results. Of course, there also has to be a hierarchy that separates
the folders of different workers running on the same machine. That is why paths
to directories have the format
`${DEFAULT}/${FOLDER}/${WORKER_ID}/${JOB_ID}`, where the default is the default
working directory of the whole worker and the folder is the directory for a
particular purpose (archives, evaluation, ...). This division of job
directories proved to be flexible and detailed enough; everything is in logical
units and where it is supposed to be, which means that navigating the system
should be easy. In addition, if user solutions have access only to the
evaluation directory, they cannot reach unnecessary files, which is better for
the overall security of ReCodEx.
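The `${DEFAULT}/${FOLDER}/${WORKER_ID}/${JOB_ID}` layout can be illustrated by a short sketch; the folder names below are examples of the purposes listed above, not the worker's actual configuration values:

```python
# Compose a job directory path according to the layout described above.
import posixpath  # POSIX-style joining keeps the example platform-independent

def job_dir(default_dir: str, folder: str, worker_id: str, job_id: str) -> str:
    """Build ${DEFAULT}/${FOLDER}/${WORKER_ID}/${JOB_ID}."""
    return posixpath.join(default_dir, folder, worker_id, job_id)

# Separate folders per purpose for worker "1" and job "42":
# job_dir("/var/recodex", "downloads", "1", "42") -> "/var/recodex/downloads/1/42"
# job_dir("/var/recodex", "eval", "1", "42")      -> "/var/recodex/eval/1/42"
```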
|
|
|
|
|
|
As discussed above, the worker has its job directories, but the users who write
and manage job configurations do not know where these directories are located
(on a particular worker) or how they can be referenced in a configuration. For
this purpose, we have to introduce marks that represent particular folders.
These marks can take the form of special strings, which we call variables.
These variables can then be used anywhere a filesystem path appears within the
configuration file, which solves the problem of worker-specific environments
and directory hierarchies. The final form of a variable is `${...}`, where the
triple dot stands for a textual description. This format was chosen because the
dollar sign is not normally used within filesystem paths; the braces only
delimit the textual description of the variable.
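A minimal sketch of expanding such `${...}` variables in configuration paths; the variable name used in the example is hypothetical:

```python
# Expand ${NAME} variables in a configuration path, in the spirit of the
# mechanism described above (not the worker's actual implementation).
import re

def expand_vars(path: str, variables: dict) -> str:
    """Replace every ${NAME} occurrence in `path` with its configured value."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: variables[m.group(1)], path)

# expand_vars("${EVAL_DIR}/output.txt", {"EVAL_DIR": "/work/eval/1/42"})
#   -> "/work/eval/1/42/output.txt"
```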
|
|
|
|
|
|
#### Evaluation
|
|
|
|
|
|
After a job successfully arrives, the worker has to prepare a new execution
environment; then the solution archive has to be downloaded from the fileserver
and extracted. The job configuration located within these files is loaded into
internal structures and executed. After that, the results are uploaded back to
the fileserver. These basic steps are necessary for the whole execution and
have to be performed in this precise order.
|
|
|
|
|
|
An interesting problem concerns supplementary files (inputs, sample outputs).
Two approaches can be considered: supplementary files can be downloaded either
at the start of the execution or during it. If the files are downloaded at the
beginning, execution has not really started at that point, so if there are
network problems, the worker finds out right away and can abort without
executing a single task. Slight problems can arise if some of the files need to
have the same name (e.g., the solution assumes that the input is named
`input.txt`); in this scenario, the downloaded files cannot be renamed at the
beginning, only during execution, which is somewhat impractical and hard to
trace. The second option, downloading files on the fly, has the opposite
problem: network problems may only surface during execution, for instance when
almost the whole execution is done, which is not ideal either if we care about
wasted hardware resources. On the other hand, with this approach users have
quite fine-grained control of the execution flow and know exactly which files
are available during execution, which is probably more appealing from the
user's perspective than the first solution. Based on this, downloading
supplementary files using 'fetch' tasks during execution was chosen and
implemented.
|
|
|
|
|
|
#### Caching mechanism
|
|
|
|
|
|
The worker can use a caching mechanism for files from the fileserver under one
condition: the provided files have to have unique names. If uniqueness is
fulfilled, precious bandwidth can be saved using the cache. This means there
has to be a system that can download a file, store it in the cache and delete
it after some period of inactivity. Because there can be multiple worker
instances on a particular server, it is not efficient for every worker to
implement this system on its own, so it is desirable to share this feature
among all workers on the same machine. One solution would be a separate service
connected to the workers over the network, but this would introduce a component
with additional communication where it is not really needed, and above all it
would be a single point of failure. Instead, another solution was chosen: each
worker has access to a specified cache folder, into which it can download
supplementary files and from which it can copy them. This means every worker
can maintain downloads to the cache, but what a worker cannot properly do is
delete unused files after some time. For that, a single-purpose component
called the 'cleaner' is introduced. It is a simple script executed by cron
which deletes files that have been unused for some time. Together with the
worker's fetching feature, the cleaner completes the machine-specific caching
system.
|
|
|
|
|
|
The cleaner, as mentioned, is a simple script executed regularly as a cron job.
Given the caching system introduced in the paragraph above, there are few
options for how the cleaner can be implemented. Filesystems usually support two
relevant timestamps, the last access time and the last modification time. Files
in the cache are downloaded once and then only copied, which means that the
last modification time is set only once, on the creation of the file, while the
last access time should be updated on every copy. This implies that the last
access time is what is needed here. However, while the last modification time
is widely used by operating systems, the last access time is often not enabled
by default; more on this subject can be found
[here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime).
For the cleaner to function properly, the filesystem used by the worker for
caching has to have last access times enabled.
|
|
|
|
|
|
Having the cleaner as a separate component while caching itself is handled in
the worker is somewhat blurry, and it is not immediately obvious that it works
without race conditions. The goal here is not to have a system without races,
but a system that can recover from them. The implementation of the caching
system is based on atomic operations of the underlying filesystem. One possible
robust implementation is described below, starting with the worker side:
|
|
|
|
|
|
- the worker discovers a fetch task which should download a supplementary file
- the worker takes the name of the file and tries to copy it from the cache
  folder to its working folder
    - if successful, the last access time is updated (by the filesystem
      itself) and the whole operation is done
    - if not successful, the file has to be downloaded
        - the file is downloaded from the fileserver to the working folder
        - the downloaded file is then copied to the cache
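The steps above can be sketched in a few lines. This is an illustrative Python model (the worker itself is written in C++); `download` stands in for the real fileserver transfer:

```python
# Cache-first fetch: try the cache, fall back to downloading, then
# publish the downloaded file into the cache for other workers.
import shutil
from pathlib import Path

def fetch(name: str, cache_dir: Path, work_dir: Path, download) -> Path:
    """Obtain a supplementary file, preferring the machine-local cache."""
    target = work_dir / name
    cached = cache_dir / name
    try:
        # cache hit: copying updates the file's last access time
        shutil.copyfile(cached, target)
    except FileNotFoundError:
        # cache miss: download to the working folder first...
        download(name, target)
        # ...and only then copy the file into the cache
        shutil.copyfile(target, cached)
    return target
```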
|
|
|
|
|
|
Previous implementation is only within worker, cleaner can anytime intervene and
|
|
|
delete files. Implementation in cleaner follows:
|
|
|
|
|
|
- on startup, the cleaner stores the current timestamp, which will be used as a
  reference for comparison, and loads the configuration values of the cache
  folder and the maximal file age
- it then loops through all files and directories in the specified cache folder
    - the last access time of the file or folder is detected
    - the last access time is subtracted from the reference timestamp into a
      difference
    - the difference is compared against the specified maximal file age; if the
      difference is greater, the file or folder is deleted
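The cleaner loop can be sketched as follows; this is an illustrative model of the logic, not the actual cleaner script:

```python
# Delete cache entries whose last access time (atime) is older than the
# configured maximal age, relative to a reference timestamp taken once
# at startup. Intended to be run periodically, e.g. from cron.
import shutil
import time
from pathlib import Path

def clean_cache(cache_dir: str, max_age: float) -> list:
    """Remove stale cache entries; return the names of deleted entries."""
    reference = time.time()  # reference timestamp, taken once at start
    deleted = []
    for entry in Path(cache_dir).iterdir():
        if reference - entry.stat().st_atime > max_age:
            if entry.is_dir():
                shutil.rmtree(entry)  # directories go with their contents
            else:
                entry.unlink()
            deleted.append(entry.name)
    return deleted
```

This only works as intended on filesystems where last access times are recorded, as noted above.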
|
|
|
|
|
|
The previous description implies that there is a gap between the detection of
the last access time and the deletion of the file by the cleaner. In this gap,
a worker may access a file that is then deleted anyway, but this is fine: the
file is deleted, but the worker has already copied it. Another potential
problem is two workers downloading the same file, but this is not a problem
either, because a file is first downloaded to the working folder and only
afterwards copied to the cache. And even if something else unexpectedly fails
and the fetch task consequently fails during execution, that should still be
fine, because fetch tasks should have the 'inner' task type, which implies that
a failure in such a task stops the whole execution and the job is reassigned to
another worker. This is the last resort in case everything else goes wrong.
|
|
|
|
|
|
#### Sandboxing
|
|
|
|
|
|
There are numerous ways to approach sandboxing on different platforms, and
describing all possible approaches is out of the scope of this document.
Instead, let us have a look at some of the features which are certainly needed
for ReCodEx and propose some particular sandbox implementations for Linux and
Windows.
|
|
|
|
|
|
The general purpose of a sandbox is to safely execute software in any form,
from scripts to binaries. Various sandboxes differ in how safe they are and
what limiting features they offer. Ideally, a sandbox has numerous options and
corresponding features which allow administrators to set up the environment as
they like, while not allowing user programs to damage the executing machine in
any way.
|
|
|
|
|
|
For ReCodEx and its evaluation, at least these features are needed: execution
time and memory limits, a disk operations limit, disk accessibility
restrictions and network restrictions. All these features, if combined and
implemented well, yield a reasonably safe sandbox which can be used for all
kinds of user solutions and should be able to restrict and stop any standard
kind of attack or error.
|
|
|
|
|
|
Linux systems have quite extensive support for sandboxing in the kernel: kernel
namespaces and cgroups were introduced and, combined, they can limit hardware
resources (CPU, memory) and separate the executed program into its own
namespaces (PID, network). These two features satisfy the sandbox requirements
of ReCodEx, so there were two options: either find an existing solution or
implement a new one. Luckily, an existing solution was found, and its name is
**isolate**. Isolate does not use all available kernel features, but only a
subset that is still sufficient for ReCodEx.
|
|
|
|
|
|
The opposite situation holds in the Windows world: there is only limited
support in the kernel, which makes sandboxing a bit trickier. The Windows
kernel only offers ways to restrict the privileges of a process through the
restriction of its internal access tokens. Monitoring of hardware resources is
not possible, but the used resources can be obtained through newly created job
objects. Finding a sandbox which can do everything ReCodEx needs seems to be
impossible. There are numerous sandboxes for Windows, but they all focus on
different things: in many cases they serve as safe environments for malicious
programs, viruses in particular, or they are designed as separate filesystem
namespaces for installing many temporarily used programs. Among these we can
mention Sandboxie, Comodo Internet Security, Cuckoo sandbox and many others.
None of them is suited as a sandboxing solution for ReCodEx. With this being
said, we can safely state that designing and implementing a new general sandbox
for Windows is out of the scope of this project.
|
|
|
|
|
|
A new general sandbox for Windows is out of the question, but what about a more
specialized solution, used for instance only for C#? The CLR, as a virtual
machine and runtime environment, has pretty good security support for
restriction and separation, which also carries over to C#. This makes it quite
easy to implement a simple sandbox within C#, but surprisingly, no well-known
general-purpose implementation could be found. As said in the previous
paragraph, implementing our own solution is out of the scope of the project;
there is simply not enough time. However, a C# sandbox is a good topic for
another project, for example a term project for a C# course, so it might be
written and integrated in the future.
|
|
|
|
|
|
### Fileserver
|
|
|
|
|
|
The fileserver provides access to a shared storage space that contains files
|
|
|
submitted by students, supplementary files such as test inputs and outputs and
|
|
|
results of evaluation. In other words, it acts as an intermediate node for data
|
|
|
passed between the frontend and the backend. This functionality can be easily
|
|
|
separated from the rest of the backend features, which led to designing the
|
|
|
fileserver as a standalone component. Such design helps encapsulate the details
|
|
|
of how the files are stored (e.g. on a file system, in a database or using a
|
|
|
cloud storage service), while also making it possible to share the storage
|
|
|
between multiple ReCodEx frontends.
|
|
|
|
|
|
For early releases of the system, we chose to store all files on the file system
|
|
|
-- it is the least complicated solution (in terms of implementation complexity)
|
|
|
and the storage backend can be rather easily migrated to a different technology.
|
|
|
|
|
|
One of the facts we learned from CodEx is that many exercises share test input
|
|
|
and output files, and also that these files can be rather large (hundreds of
|
|
|
megabytes). A direct consequence of this is that we cannot add these files to
|
|
|
submission archives that are to be downloaded by workers -- the combined size of
|
|
|
the archives would quickly exceed gigabytes, which is impractical. Another
|
|
|
conclusion we made is that a way to deal with duplicate files must be
|
|
|
introduced.
|
|
|
|
|
|
A simple solution to this problem is storing supplementary files under the
|
|
|
hashes of their content. This ensures that every file is stored only once. On
|
|
|
the other hand, it makes it more difficult to understand what the content of a
|
|
|
file is at a glance, which might prove problematic for the administrator.
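Content-addressed storage of this kind can be sketched in a few lines. The hash algorithm and flat directory layout below are illustrative choices, not necessarily the fileserver's actual ones:

```python
# Store a supplementary file under the hash of its content, so that
# identical files occupy a single entry in the storage (deduplication).
import hashlib
from pathlib import Path

def store_supplementary_file(data: bytes, storage_dir: str) -> str:
    """Write `data` under its content digest and return the digest."""
    digest = hashlib.sha1(data).hexdigest()
    path = Path(storage_dir) / digest
    if not path.exists():  # duplicate content is written only once
        path.write_bytes(data)
    return digest
```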
|
|
|
|
|
|
A notable part of the fileserver's work is done by a web server (e.g. listening
|
|
|
to HTTP requests and caching recently accessed files in memory for faster
|
|
|
access). What remains to be implemented is handling requests that upload files
|
|
|
-- student submissions should be stored in archives to facilitate simple
|
|
|
downloading and supplementary exercise files need to be stored under their
|
|
|
hashes.
|
|
|
|
|
|
We decided to use Python and the Flask web framework. This combination makes it
|
|
|
possible to express the logic in ~100 SLOC and also provides means to run the
|
|
|
fileserver as a standalone service (without a web server), which is useful for
|
|
|
development.
|
|
|
|
|
|
### Monitor
|
|
|
|
|
|
Users want to view the real-time evaluation progress of their solutions. This
is easy to do with an established two-way connection stream, but hard to
achieve with web technologies: the HTTP protocol works on a request-by-request
basis with no long-term connection. However, there is a widely used technology
that solves this problem, the WebSocket protocol.
|
|
|
|
|
|
Working with the WebSocket protocol from the backend is possible, but not ideal
from a design point of view. The backend should be hidden from the public
internet to minimize the surface for possible attacks. With this in mind, there
are two options:
|
|
|
|
|
|
- send progress messages through API
|
|
|
- make separate component for progress messages
|
|
|
|
|
|
Each of the two possibilities has its pros and cons. The first one is good
because there is no additional component and the API is already publicly
visible. On the other hand, working with the WebSocket protocol from PHP is not
very pleasant (though possible), and embedding this functionality into the API
is not extensible. The second approach is better suited for future changes of
the protocol or for implementing extensions like message caching. Also, the
progress feature is considered optional, because there may be clients for which
it is useless. The major drawback of a separate component is that it is yet
another part which needs to be publicly exposed.
|
|
|
|
|
|
We decided to make a separate component, mainly because it is a smaller
component with only one role, it is easier to maintain, and the demand for the
progress callback is optional.
|
|
|
|
|
|
There are several possibilities for how to write the component. Notably, the
considered options were the already used languages C++, PHP, JavaScript and
Python. In the end, Python was chosen for its simplicity, its great support for
all the technologies involved, and because there were free Python developers in
our team. The responsibility of this component is determined by the concept of
the message flow depicted in the following picture.
|
|
|
|
|
|
![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
|
|
|
|
|
|
The message channel feeding the monitor uses ZeroMQ, the main messaging
framework used by the backend. This decision keeps the backend consistent in
its choice of communication protocol and related libraries. The output channel
is WebSocket, a protocol for sending messages to web browsers. There are
several WebSocket libraries for Python; the most popular one is `websockets`,
used in cooperation with `asyncio`. This combination is easy to use and well
documented, so it is used in the monitor component too. For ZeroMQ, there is
the `zmq` library, a binding to the framework core written in C++.
|
|
|
|
|
|
Incoming messages are cached for a short period of time. Early testing showed
that the backend can start sending progress messages before the client connects
to the monitor. To solve this, the messages for each job are held for 5 minutes
after the reception of the last message, so a client connecting later still
gets all previously received messages with no loss.
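The holding behaviour can be sketched with a small cache keyed by job. Names and the data structure are illustrative, not the monitor's actual code (which pairs `websockets` with `asyncio`):

```python
# Per-job progress message cache: messages are kept for a grace period
# after the last one arrives, so a late-connecting client still gets
# the full history.
import time

GRACE_PERIOD = 300  # seconds: hold messages 5 minutes after the last one

class MessageCache:
    def __init__(self):
        self._jobs = {}  # job_id -> (time of last message, [messages])

    def add(self, job_id: str, message: str, now: float = None) -> None:
        now = time.time() if now is None else now
        _, messages = self._jobs.get(job_id, (now, []))
        messages.append(message)
        self._jobs[job_id] = (now, messages)

    def history(self, job_id: str) -> list:
        """Messages a newly connected client receives immediately."""
        return list(self._jobs.get(job_id, (0.0, []))[1])

    def expire(self, now: float = None) -> None:
        """Drop jobs whose last message is older than the grace period."""
        now = time.time() if now is None else now
        self._jobs = {job: entry for job, entry in self._jobs.items()
                      if now - entry[0] <= GRACE_PERIOD}
```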
|
|
|
|
|
|
### API server
|
|
|
|
|
|
The API server must handle HTTP requests and manage the state of the application
|
|
|
in some kind of a database. It must also be able to communicate with the
|
|
|
backend over ZeroMQ.
|
|
|
|
|
|
We considered several technologies which could be used:
|
|
|
|
|
|
- PHP + Apache -- one of the most widely used technologies for creating web
|
|
|
servers. It is a suitable technology for this kind of a project. It has all
|
|
|
the features we need when some additional extensions are installed (to support
|
|
|
LDAP or ZeroMQ).
|
|
|
- Ruby on Rails, Python (Django), etc. -- popular web technologies that appeared
|
|
|
in the last decade. Both support ZeroMQ and LDAP via extensions and have large
|
|
|
developer communities.
|
|
|
- ASP.NET (C#), JSP (Java) -- these technologies are very robust and are used to
|
|
|
create server technologies in many big enterprises. Both can run on Windows
|
|
|
and Linux servers (ASP.NET using the .NET Core).
|
|
|
- JavaScript (Node.js) -- a rather new technology that has lately been used to
  create REST APIs. Applications running on Node.js are quite performant and
  the number of open-source libraries available on the internet is huge.
|
|
|
|
|
|
We chose PHP and Apache mainly because we were familiar with these technologies
and we were able to develop all the features we needed without learning a new
technology, which mattered because the number of features was quite high and we
needed to meet a strict deadline. This does not mean that we consider the other
technologies superior to PHP in other aspects -- PHP 7 is a mature language
with a huge community and a wide range of tools, libraries, and frameworks.
|
|
|
|
|
|
We decided to use an ORM framework to manage the database, namely the widely
|
|
|
used PHP ORM Doctrine 2. Using an ORM tool means we do not have to write SQL
|
|
|
queries by hand. Instead, we work with persistent objects, which provides a
|
|
|
higher level of abstraction. Doctrine also has a robust database abstraction
|
|
|
layer so the database engine is not very important and it can be changed without
|
|
|
any need for changing the code. MariaDB was chosen as the storage backend.
|
|
|
|
|
|
To speed up the development process of the PHP server application we decided to
|
|
|
use a web framework. After evaluating and trying several frameworks, such as
|
|
|
Lumen, Laravel, and Symfony, we ended up using Nette. This framework is very
common in the Czech Republic -- its lead developer is a well-known Czech
programmer, David Grudl -- and we were already familiar with the patterns used
in this framework, such as dependency injection, authentication, and routing.
These concepts
|
|
|
are useful even when developing a REST application, which might be a surprise
|
|
|
considering that Nette focuses on "traditional" web applications. There is also
|
|
|
a Nette extension which makes integration of Doctrine 2 very straightforward.
|
|
|
|
|
|
#### Request handling
|
|
|
|
|
|
A typical scenario for handling an API request is matching the HTTP request with
|
|
|
a corresponding handler routine which creates a response object, that is then
|
|
|
sent back to the client, encoded with JSON. The `Nette\Application` package can
|
|
|
be used to achieve this with Nette, although it is meant to be used mainly in
|
|
|
MVP applications.
|
|
|
|
|
|
Matching HTTP requests with handlers can be done using standard Nette URL
|
|
|
routing -- we will create a Nette route for each API endpoint. Using the routing
|
|
|
mechanism from Nette logically leads to implementing handler routines as Nette
|
|
|
Presenter actions. Each presenter should serve logically related endpoints.
|
|
|
|
|
|
The last step is encoding the response as JSON. In `Nette\Application`, HTTP
|
|
|
responses are returned using the `Presenter::sendResponse()` method. We decided
|
|
|
to write a method that calls `sendResponse` internally and takes care of the
|
|
|
encoding. This method has to be called in every presenter action. An alternative
|
|
|
approach would be using the internal payload object of the presenter, which is
|
|
|
more convenient, but provides us with less control.
|
|
|
|
|
|
#### Authentication
|
|
|
|
|
|
Because Nette is focused on building web applications that render a new
|
|
|
page for (almost) every request, it uses PHP sessions (based on cookies) for
|
|
|
authentication. This method is unsuitable for REST APIs where clients do not
|
|
|
typically store cookies. However, it is common that RESTful services provide
|
|
|
access tokens that are then sent with every request by the client.
|
|
|
|
|
|
JWT (JSON web tokens), an open standard for access tokens, was chosen for our
|
|
|
authentication implementation. Support libraries exist for all major languages
|
|
|
used in web development, which makes the integration straightforward. The tokens use
|
|
|
asymmetric cryptography for signing, which provides a satisfactory level of
|
|
|
security.
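
To illustrate the mechanics of such tokens, here is a minimal JWT sketch in
Python. It uses symmetric HMAC-SHA256 signing for brevity, whereas the actual
deployment relies on a vetted library (and asymmetric signing, as noted
above); all function names here are illustrative:

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_token(payload: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    digest = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(digest)}"

def verify_token(token: str, secret: bytes) -> dict:
    header, body, signature = token.split(".")
    digest = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, b64url(digest)):
        raise ValueError("invalid signature")
    payload = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    if payload.get("exp", float("inf")) < time.time():
        raise ValueError("token expired")
    return payload
```

The client obtains such a token once (after submitting credentials) and then
attaches it to every request, typically in the `Authorization` header.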
|
|
|
|
|
|
To implement JWT in Nette, we have to implement some of its security-related
|
|
|
interfaces, such as `IAuthenticator` and `IUserStorage`, which is rather easy
|
|
|
thanks to the simple authentication flow. Replacing these services in a Nette
|
|
|
application is also straightforward, thanks to its dependency injection
|
|
|
container implementation.
|
|
|
|
|
|
#### Uploading files
|
|
|
|
|
|
There are two cases when users need to upload files using the API -- submitting
|
|
|
solutions to an assignment and creating a new exercise. In both of these cases,
|
|
|
the final destination of the files is the fileserver. However, the fileserver is
|
|
|
not publicly accessible, so the files have to be uploaded through the API.
|
|
|
|
|
|
The files can be either forwarded to the fileserver directly, without any
|
|
|
interference from the API server, or stored and forwarded later. We chose the
|
|
|
second approach, which is harder to implement, but more convenient -- it lets
|
|
|
exercise authors double-check what they upload to the fileserver and solutions
|
|
|
to assignments can be uploaded in a single request, which makes it easy for the
|
|
|
fileserver to create an archive of the solution files.
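
The store-and-forward approach can be sketched as follows -- uploaded files
are first kept by the API and only later packed into a single archive for the
fileserver. This is an illustrative Python sketch, not the actual ReCodEx API:

```python
import io
import zipfile

class UploadStore:
    """Store-and-forward upload flow: files are kept by the API first and
    later pushed to the fileserver in one request. Names are illustrative."""

    def __init__(self, send_to_fileserver):
        self._send = send_to_fileserver
        self._files = {}           # upload id -> (name, content)
        self._next_id = 0

    def store(self, name: str, content: bytes) -> int:
        # Step 1: accept the upload and let the author double-check it later.
        self._next_id += 1
        self._files[self._next_id] = (name, content)
        return self._next_id

    def forward_solution(self, upload_ids) -> None:
        # Step 2: pack the selected uploads into one archive and forward it,
        # so the fileserver receives the whole solution in a single request.
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w") as archive:
            for uid in upload_ids:
                name, content = self._files.pop(uid)
                archive.writestr(name, content)
        self._send(buf.getvalue())
```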
|
|
|
|
|
|
#### Permissions
|
|
|
|
|
|
A system that stores user data must implement some kind of permission
checking. As the previous chapters imply, each user has a role that
corresponds to his/her privileges. Our research showed that three roles are
sufficient -- student, supervisor and administrator. The user role has to be
checked with every request. Conveniently, the roles match the granularity of
the API endpoints, so the permission check can be performed at the beginning
of each request. This is implemented using PHP annotations, which allow
specifying the permitted user roles for each endpoint with very little code,
while keeping all of the related business logic in one place.
|
|
|
|
|
|
However, roles cannot cover all cases. For example, the supervisor role
should only apply to the groups where the user actually is a supervisor, but
relying on roles alone would let him/her act as a supervisor in every group
in the system. Unfortunately, this cannot be solved with annotations alone,
because the problem occurs in many different forms. Instead, additional
checks -- usually just one or two simple conditions -- are performed at the
beginning of request processing.
|
|
|
|
|
|
Combining these two concepts makes it possible to cover all cases of
permission checking with a rather small amount of code.
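
The combination of declarative role annotations and small per-request checks
can be sketched as follows. The API itself is written in PHP with Nette; this
is a conceptual Python sketch, and all names (`allowed_roles`,
`update_group`) are illustrative rather than part of ReCodEx:

```python
from functools import wraps

class Forbidden(Exception):
    """Raised when a request does not pass the permission check."""

def allowed_roles(*roles):
    # Declarative role check, analogous to the PHP annotations in the API.
    def decorator(handler):
        @wraps(handler)
        def wrapper(user, *args, **kwargs):
            if user["role"] not in roles:
                raise Forbidden(f"role {user['role']!r} may not call {handler.__name__}")
            return handler(user, *args, **kwargs)
        return wrapper
    return decorator

@allowed_roles("supervisor", "administrator")
def update_group(user, group):
    # The extra per-request condition: a supervisor may only touch groups
    # where he/she actually is a supervisor.
    if user["role"] == "supervisor" and user["id"] not in group["supervisors"]:
        raise Forbidden("not a supervisor of this group")
    return "updated"
```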
|
|
|
|
|
|
#### Solution loading
|
|
|
|
|
|
When the evaluation of a solution finishes in the backend, the results are
saved to the fileserver and the broker notifies the API about this fact.
Several further steps -- such as parsing the results, computing the score,
and saving the structured data into the database -- have to be performed
before the results can be presented to the users. There are two main options
for when to load them:
|
|
|
|
|
|
- immediately
|
|
|
- on demand
|
|
|
|
|
|
The two options are almost equivalent; neither provides a significant
advantage. Loading the results immediately makes the first client request for
them slightly faster. On the other hand, loading them on demand, when they
are requested for the first time, saves resources when nobody is actually
interested in the results.
|
|
|
|
|
|
Of these two options, we chose lazy loading of the results on first request.
However, this introduces the concept of asynchronous jobs, which are useful
for batch submissions -- for example, rerunning jobs that failed due to a
hardware issue on a worker. Such jobs are typically submitted by a different
user than the original author (for example, an administrator), so the
original authors should be notified. In this case it is more reasonable to
load the results immediately and optionally attach them to the notification.
This is exactly what happens: these special asynchronous jobs are loaded
immediately and an email notification is sent to the original job author.
|
|
|
|
|
|
In retrospect, it seems that loading all jobs immediately would simplify the
loading code without any major drawbacks. We will reconsider this decision in
the next version of ReCodEx.
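
The chosen on-demand strategy essentially boils down to a memoized fetch:
results are pulled from the fileserver and parsed only when someone first
asks for them. A minimal Python sketch (class and method names are
hypothetical, not the actual ReCodEx code):

```python
class EvaluationRepository:
    """On-demand loading of evaluation results: raw results stay on the
    fileserver until a client first asks for them, then the parsed form is
    cached (in ReCodEx, persisted to the database)."""

    def __init__(self, fetch_and_parse):
        self._fetch_and_parse = fetch_and_parse   # e.g. download + YAML parse
        self._cache = {}

    def get_evaluation(self, submission_id):
        if submission_id not in self._cache:
            # The first request triggers fetching, parsing and scoring;
            # subsequent requests are served from the stored form.
            self._cache[submission_id] = self._fetch_and_parse(submission_id)
        return self._cache[submission_id]
```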
|
|
|
|
|
|
#### Backend management
|
|
|
|
|
|
The backend is a separate component that knows nothing about administrators
and reports failures only through logging, so it is handy to provide failure
reporting to the backend from the frontend, which manages the users. The
simplest solution would again be a separate component with some sort of
public interface -- for example REST, or any other protocol the backend can
handle. The functionality of such a component would be quite simple: when a
report arrives from the backend, its type is inferred, and if it is an error
that deserves the attention of an administrator, an email is sent to him/her.
Errors that are less important, were already handled by the backend itself,
or are purely informative do not have to be reported by email; they are only
stored in a persistent database for later inspection. On top of that, a
separate component can be kept internal and not exposed to the outside
network. The disadvantage is that the database layer of a particular API
instance cannot be reused here, because multiple API instances may share one
backend.
|
|
|
|
|
|
The solution that was eventually implemented integrates the backend failure
reporting into the API. The problem with the previous approach is that when a
job execution fails, the backend has to report the error to the particular
API server from which the evaluation request came -- this information is
essential and must be stored there, not in some general component with a
general error database. Obviously, if multiple API servers are connected to
one backend, one of them has to be configured in the backend as the main
server that receives reports about general backend errors not connected to
any job. This solution was chosen because, as stated, job error reporting has
to be implemented in the API anyway, and having a separate component only for
general errors is not feasible. The error reporting endpoint is exposed under
a dedicated route secured by basic HTTP authentication, because basic
authentication is easy to implement even in low-level backend components.
This also means the feature is visible and could be exploited, but from our
point of view it is an appropriate compromise in favor of simplicity.
|
|
|
|
|
|
Another aspect of backend management is keeping track of its current state --
namely which workers (and with what hardware) are available for processing
and which languages can be used in exercises, but also overall statistics
such as how many jobs were processed by a particular worker, the workload of
the broker and the workers, and so on. The easiest solution is to manage this
information by hand: every API instance would have an administrator who fills
it in. Of course, this only covers the currently available workers and
runtime environments; backend statistics cannot be provided this way.
|
|
|
|
|
|
A better solution is to let this information update automatically. This can
be done in two ways: either the backend provides it on demand whenever the
API needs it, or the backend sends it to the API periodically. Information
such as the currently available workers or environments should be truly
up-to-date, so it is better provided on demand; backend statistics are less
critical and can be updated periodically. It really depends on the update
period, though -- if it is short enough, even the available workers could be
refreshed this way and remain reasonably up-to-date. However, due to lack of
time, automatic refreshing of the backend state will not be implemented in
the early versions, but it might be added in future releases.
|
|
|
|
|
|
### Web-app
|
|
|
|
|
|
@todo: what technologies can be used on client frontend side, why react was used
|
|
|
|
|
|
@todo: please think about more stuff about api and web-app... thanks ;-)
|
|
|
|
|
|
|
|
|
|
|
|
# The Backend
|
|
|
|
|
|
The backend is the part which is hidden from the user and which has only
|
|
|
one purpose: evaluate user’s solutions of their assignments.
|
|
|
|
|
|
@todo: describe the configuration inputs of the Backend
|
|
|
|
|
|
@todo: describe the outputs of the Backend
|
|
|
|
|
|
@todo: describe how the backend receives the inputs and how it
|
|
|
communicates the results
|
|
|
|
|
|
## Components
|
|
|
|
|
|
The backend is not just one service/component; it is quite a complex system on its own.
|
|
|
|
|
|
@todo: describe the inner parts of the Backend (and refer to the Wiki
|
|
|
for the technical description of the components)
|
|
|
|
|
|
### Broker
|
|
|
|
|
|
@todo: gets stuff done, single point of failure and center point of ReCodEx universe
|
|
|
|
|
|
### Fileserver
|
|
|
|
|
|
@todo: stores particular data from frontend and backend, hashing, HTTP API
|
|
|
|
|
|
### Worker
|
|
|
|
|
|
@todo: describe a bit of internal structure in general
|
|
|
|
|
|
@todo: describe how jobs are generally executed
|
|
|
|
|
|
### Monitor
|
|
|
|
|
|
@todo: not necessary component which can be omitted, proxy-like service
|
|
|
|
|
|
## Backend internal communication
|
|
|
|
|
|
@todo: internal backend communication, what communicates with what and why
|
|
|
|
|
|
The Frontend
|
|
|
============
|
|
|
|
|
|
The frontend is the part which is visible to the user of ReCodEx and
|
|
|
which holds the state of the system – the user accounts, their roles in
|
|
|
the system, the database of exercises, the assignments of these
|
|
|
exercises to groups of users (i.e., students), and the solutions and
|
|
|
evaluations of them.
|
|
|
|
|
|
Frontend is split into three parts:
|
|
|
|
|
|
- the server-side REST API (“API”) which holds the business logic and
|
|
|
keeps the state of the system consistent
|
|
|
|
|
|
- the relational database (“DB”) which persists the state of the
|
|
|
system
|
|
|
|
|
|
- the client side application (“client”) which simplifies access to
|
|
|
the API for the common users
|
|
|
|
|
|
The centerpiece of this architecture is the API. This component receives
|
|
|
requests from the users and from the Backend, validates them and
|
|
|
modifies the state of the system and persists this modified state in the
|
|
|
DB.
|
|
|
|
|
|
We have created a web application which can communicate with the API
|
|
|
server and present the information received from the server to the user
|
|
|
in a convenient way. The client can, however, be any application that can
send HTTP requests and receive HTTP responses. Users can use generic
applications like [cURL](https://github.com/curl/curl/),
|
|
|
[Postman](https://www.getpostman.com/), or create their own specific
|
|
|
client for ReCodEx API.
|
|
|
|
|
|
Frontend capabilities
|
|
|
---------------------
|
|
|
|
|
|
@todo: describe what the frontend is capable of and how it really works,
|
|
|
what are the limitations and how it can be extended
|
|
|
|
|
|
Terminology
|
|
|
-----------
|
|
|
|
|
|
This project was created for the needs of a university and this fact is
|
|
|
reflected into the terminology used throughout the Frontend. A list of
|
|
|
important terms’ definitions follows to make the meaning unambiguous.
|
|
|
|
|
|
### User and user roles
|
|
|
|
|
|
*User* is a person who uses the application. User is granted access to
|
|
|
the application once he or she creates an account directly through the
|
|
|
API or the web application. There are several types of user accounts
|
|
|
depending on the set of permissions – a so called “role” – they have
|
|
|
been granted. Each user receives only the most basic set of permissions
|
|
|
after he or she creates an account and this role can be changed only by
|
|
|
the administrators of the service:
|
|
|
|
|
|
- *Student* is the most basic role. A student can become a member of a
group and submit solutions to the assignments of this group.
|
|
|
|
|
|
- *Supervisor* can be entitled to manage a group of students.
|
|
|
Supervisor can assign exercises to the students who are members of
|
|
|
his groups and review their solutions submitted to
|
|
|
these assignments.
|
|
|
|
|
|
- *Super-admin* is a user with unlimited rights. This user can perform
|
|
|
any action in the system.
|
|
|
|
|
|
There are two implicit changes of roles:
|
|
|
|
|
|
- Once a *student* is added to a group as its supervisor, his role is
|
|
|
upgraded to a *supervisor* role.
|
|
|
|
|
|
- Once a *supervisor* is removed from the last group where he is a
supervisor, his role is downgraded back to a *student* role.
|
|
|
|
|
|
These mechanisms do not prevent a single user being a supervisor of one
|
|
|
group and student of a different group as supervisors’ permissions are
|
|
|
superset of students’ permissions.
|
|
|
|
|
|
### Login
|
|
|
|
|
|
*Login* is a set of user’s credentials he must submit to verify he can
|
|
|
be allowed to access the system as a specific user. We distinguish two
|
|
|
types of logins: local and external.
|
|
|
|
|
|
- *Local login* is user’s email address and a password he chooses
|
|
|
during registration.
|
|
|
|
|
|
- *External login* is a mapping of a user profile to an account of
|
|
|
some authentication service (e.g., [CAS](https://ldap1.cuni.cz/)).
|
|
|
|
|
|
### Instance
|
|
|
|
|
|
*An instance* of ReCodEx is in fact just a set of groups and user
|
|
|
accounts. An instance should correspond to a real entity such as a
|
|
|
university, a high-school, an IT company or an HR agency. This approach
|
|
|
enables the system to be shared by multiple independent organizations
|
|
|
without interfering with each other.
|
|
|
|
|
|
Usage of the system by the users of an instance can be limited by
|
|
|
possessing a valid license. It is up to the administrators of the system
|
|
|
to determine the conditions under which they will assign licenses to the
|
|
|
instances.
|
|
|
|
|
|
### Group
|
|
|
|
|
|
*Group* corresponds to a school class or some other unit which gathers
|
|
|
users who will be assigned the same set of exercises. Each group can have
|
|
|
multiple supervisors who can manage the students and the list of
|
|
|
assignments.
|
|
|
|
|
|
Groups can form a tree hierarchy of arbitrary depth. This is inspired by the
|
|
|
hierarchy of school classes belonging to the same subject over several school
|
|
|
years. For example, there can be a top level group for a programming class that
|
|
|
contains subgroups for every school year. These groups can then be divided into
|
|
|
actual student groups with respect to lab attendance. Supervisors can create
|
|
|
subgroups of their groups and further manage these subgroups.
|
|
|
|
|
|
### Exercise
|
|
|
|
|
|
*An exercise* consists of a textual assignment of a task and a definition
|
|
|
of how a solution to this exercise should be processed and evaluated in
|
|
|
a specific runtime environment (i.e., how to compile a submitted source
|
|
|
code and how to test the correctness of the program). It is a template
|
|
|
which can be instantiated as an *assignment* by a supervisor of a group.
|
|
|
|
|
|
### Assignment
|
|
|
|
|
|
An assignment is an instance of an *exercise* assigned to a specific
|
|
|
*group*. An assignment can modify the text of the task assignment and it
|
|
|
has some additional information which is specific to the group (e.g., a
|
|
|
deadline, the number of points gained for a correct solution, additional
|
|
|
hints for the students in the assignment). The text of the assignment
|
|
|
can be edited and supervisors can translate the assignment into another
|
|
|
language.
|
|
|
|
|
|
### Solution
|
|
|
|
|
|
*A solution* is a set of files which a user submits to a given
|
|
|
*assignment*.
|
|
|
|
|
|
### Submission
|
|
|
|
|
|
*A submission* corresponds to a *solution* being evaluated by the
|
|
|
Backend. A single *solution* can be submitted repeatedly (e.g., when the
|
|
|
Backend encounters an error or when the supervisor changes the assignment).
|
|
|
|
|
|
### Evaluation
|
|
|
|
|
|
*An evaluation* is the processed report received from the Backend after
|
|
|
a *submission* is processed. Evaluation contains points given to the
|
|
|
user based on the quality of his solution measured by the Backend and
|
|
|
the settings of the assignment. Supervisors can review the evaluation
|
|
|
and add bonus points (both positive and negative) if the student
|
|
|
deserves some.
|
|
|
|
|
|
### Runtime environment
|
|
|
|
|
|
*A runtime environment* defines the used programming language or tools
|
|
|
which are needed to process and evaluate a solution. Examples of a
|
|
|
runtime environment can be:
|
|
|
|
|
|
- *Linux + GCC*
|
|
|
- *Linux + Mono*
|
|
|
- *Windows + .NET 4*
|
|
|
- *Bison + Yacc*
|
|
|
|
|
|
### Limits
|
|
|
|
|
|
A correct *solution* of an *assignment* has to pass all specified tests (mostly
|
|
|
checks that it yields the correct output for various inputs) and typically must
|
|
|
also be efficient in some sense. The Backend measures the time and memory
|
|
|
consumption of the solution while running. This consumption of resources can be
|
|
|
*limited* and the solution will receive fewer points if it exceeds the given
|
|
|
limits in some test cases defined by the *exercise*.
|
|
|
|
|
|
User management
|
|
|
---------------
|
|
|
|
|
|
@todo: roles and their rights, adding/removing different users, how the
|
|
|
role of a specific user changes
|
|
|
|
|
|
Instances and hierarchy of groups
|
|
|
---------------------------------
|
|
|
|
|
|
@todo: What is an instance, how to create one, what are the licenses and
|
|
|
how do they work. Why can the groups form hierarchies and what are the
|
|
|
benefits – what it means to be an admin of a group, hierarchy of roles
|
|
|
in the group hierarchy.
|
|
|
|
|
|
Exercises database
|
|
|
------------------
|
|
|
|
|
|
@todo: How the exercises are stored, accessed, who can edit what
|
|
|
|
|
|
### Creating a new exercise
|
|
|
|
|
|
@todo Localized assignments, default settings
|
|
|
|
|
|
### Runtime environments and hardware groups
|
|
|
|
|
|
@todo read this later and see if it still makes sense
|
|
|
|
|
|
ReCodEx is designed to utilize a rather diverse set of workers -- there can be
|
|
|
differences in many aspects, such as the actual hardware running the worker
|
|
|
(which impacts the results of measuring) or installed compilers, interpreters
|
|
|
and other tools needed for evaluation. To address these two examples in
|
|
|
particular, we assign runtime environments and hardware groups to exercises.
|
|
|
|
|
|
The purpose of runtime environments is to specify which tools (and often also
|
|
|
operating system) are required to evaluate a solution of the exercise -- for
|
|
|
example, a C# programming exercise can be evaluated on a Linux worker running
|
|
|
Mono or a Windows worker with the .NET runtime. Such an exercise would be assigned
|
|
|
two runtime environments, `Linux+Mono` and `Windows+.NET` (the environment names
|
|
|
are arbitrary strings configured by the administrator).
|
|
|
|
|
|
A hardware group is a set of workers that run on similar hardware (e.g. a
|
|
|
particular quad-core processor model and an SSD drive). Workers are assigned
|
|
|
to these groups by the administrator. If this is done correctly, performance
|
|
|
measurements of a submission should yield the same results. Thanks to this fact,
|
|
|
we can use the same resource limits on every worker in a hardware group.
|
|
|
However, limits can differ between runtime environments -- formally speaking,
|
|
|
limits are a function of three arguments: an assignment, a hardware group and a
|
|
|
runtime environment.
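
In other words, a limits lookup can be modeled as a table keyed by this
triple. A minimal sketch in Python -- the keys and values below are made up
for illustration, not taken from a real configuration:

```python
# Hypothetical limits table keyed by the triple described above:
# (assignment, hardware group, runtime environment) -> resource limits.
LIMITS = {
    ("binary-search", "quad-core-ssd", "Linux+GCC"):  {"time": 1.0, "memory": 65536},
    ("binary-search", "quad-core-ssd", "Linux+Mono"): {"time": 2.5, "memory": 131072},
}

def get_limits(assignment: str, hw_group: str, environment: str) -> dict:
    # Limits are a function of three arguments -- all three must match.
    return LIMITS[(assignment, hw_group, environment)]
```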
|
|
|
|
|
|
### Reference solutions
|
|
|
|
|
|
@todo: how to add one, how to evaluate it
|
|
|
|
|
|
The task of determining appropriate resource limits for exercises is difficult
|
|
|
to do correctly. To aid exercise authors and group supervisors, ReCodEx supports
|
|
|
assigning reference solutions to exercises. Those are example programs that
|
|
|
should cover the main approaches to the implementation. For example, searching
|
|
|
for an integer in an ordered array can be done with a linear search, or better,
|
|
|
using a binary search.
|
|
|
|
|
|
Reference solutions can be evaluated on demand, using a selected hardware group.
|
|
|
The evaluation results are stored and can be used later to determine limits. In
|
|
|
our example problem, we could configure the limits so that the linear
|
|
|
search-based program doesn't finish in time on larger inputs, but a binary
|
|
|
search does.
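
A simple policy of this kind can be sketched as follows: take the measured
times of the reference solutions that are supposed to pass and multiply the
slowest of them by a safety margin. This is an illustrative sketch of the
idea, not the exact algorithm ReCodEx uses:

```python
def suggest_time_limit(reference_times: dict, margin: float = 1.5) -> float:
    """Suggest a time limit from measured reference solutions.

    `reference_times` maps a reference solution name to its measured wall
    time (in seconds) on the chosen hardware group. The limit is the time of
    the slowest solution that should still pass, times a safety margin.
    """
    return max(reference_times.values()) * margin
```

For the searching example above, feeding in only the binary-search-style
references yields a limit that the linear search cannot meet on large inputs.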
|
|
|
|
|
|
Note that separate reference solutions should be supplied for all supported
|
|
|
runtime environments.
|
|
|
|
|
|
### Exercise assignments
|
|
|
|
|
|
@todo: Creating instances of an exercise for a specific group of users,
|
|
|
capabilities of settings. Editing limits according to the reference
|
|
|
solution.
|
|
|
|
|
|
Evaluation process
|
|
|
------------------
|
|
|
|
|
|
@todo: How the evaluation process works on the Frontend side.
|
|
|
|
|
|
### Uploading files and file storage
|
|
|
|
|
|
@todo: One by one upload endpoint. Explain different types of the
|
|
|
Uploaded files.
|
|
|
|
|
|
### Automatic detection of the runtime environment
|
|
|
|
|
|
@todo: Users must submit correctly named files – assuming the RTE from
|
|
|
the extensions.
|
|
|
|
|
|
REST API implementation
|
|
|
-----------------------
|
|
|
|
|
|
@todo: What is the REST API, what are the basic principles – GET, POST,
|
|
|
Headers, JSON.
|
|
|
|
|
|
### Authentication and authorization scopes
|
|
|
|
|
|
@todo: How authentication works – signed JWT, headers, expiration,
|
|
|
refreshing. Token scopes usage.
|
|
|
|
|
|
### HTTP requests handling
|
|
|
|
|
|
@todo: Router and routes with specific HTTP methods, preflight, required
|
|
|
headers
|
|
|
|
|
|
### HTTP responses format
|
|
|
|
|
|
@todo: Describe the JSON structure convention of success and error
|
|
|
responses
|
|
|
|
|
|
### Used technologies
|
|
|
|
|
|
@todo: PHP7 – how it is used for typehints, Nette framework – how it is
|
|
|
used for routing, Presenters actions endpoints, exceptions and
|
|
|
ErrorPresenter, Doctrine 2 – database abstraction, entities and
|
|
|
repositories + conventions, Communication over ZMQ – describe the
|
|
|
problem with the extension and how we reported it and how to treat it in
|
|
|
the future when the bug is solved. Relational database – we use MariaDB,
|
|
|
Doctine enables us to switch the engine to a different engine if needed
|
|
|
|
|
|
### Data model
|
|
|
|
|
|
@todo: Describe the code-first approach using the Doctrine entities, how
|
|
|
the entities map onto the database schema (refer to the attached schemas
|
|
|
of entities and relational database models), describe the logical
|
|
|
grouping of entities and how they are related:
|
|
|
|
|
|
- user + settings + logins + ACL
|
|
|
- instance + licenses + groups + group membership
|
|
|
- exercise + assignments + localized assignments + runtime
|
|
|
environments + hardware groups
|
|
|
- submission + solution + reference solution + solution evaluation
|
|
|
- comment threads + comments
|
|
|
|
|
|
### API endpoints
|
|
|
|
|
|
@todo: Tell the user about the generated API reference and how the
|
|
|
Swagger UI can be used to access the API directly.
|
|
|
|
|
|
Web Application
|
|
|
---------------
|
|
|
|
|
|
@todo: What is the purpose of the web application and how it interacts
|
|
|
with the REST API.
|
|
|
|
|
|
### Used technologies
|
|
|
|
|
|
@todo: Briefly introduce the used technologies like React, Redux and the
|
|
|
build process. For further details refer to the GitHub wiki
|
|
|
|
|
|
### How to use the application
|
|
|
|
|
|
@todo: Describe the user documentation and the FAQ page.
|
|
|
|
|
|
Backend-Frontend communication protocol
|
|
|
=======================================
|
|
|
|
|
|
@todo: describe the exact methods and respective commands for the
|
|
|
communication
|
|
|
|
|
|
Initiation of a job evaluation
|
|
|
------------------------------
|
|
|
|
|
|
@todo: How does the Frontend initiate the evaluation and how the Backend
|
|
|
can accept it or decline it
|
|
|
|
|
|
Job processing progress monitoring
|
|
|
----------------------------------
|
|
|
|
|
|
While evaluating a job, the worker sends progress messages at predefined
points of the evaluation chain -- for example at the very beginning of the
job, when the submission archive is downloaded, or at the end of each simple
task, together with its state (completed, failed, skipped). These messages
are sent to the broker through the existing ZeroMQ connection. The detailed
message format can be found on the [communication
page](https://github.com/ReCodEx/wiki/wiki/Overall-architecture#commands-from-worker-to-broker).
|
|
|
|
|
|
The broker merely forwards the received progress messages to the monitor
component via a ZeroMQ socket; the output message format is the same as the
input format.
|
|
|
|
|
|
The monitor converts the received messages to JSON, which is easy to work
with in JavaScript inside the web application. All messages are cached (one
queue per job) and can be obtained repeatedly through a WebSocket
communication channel. The cache is cleared 5 minutes after the last message
is received.
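
The monitor's caching behavior can be sketched as a set of per-job queues
with a time-to-live measured from the last received message. A Python sketch
with illustrative names (the real monitor is a separate service):

```python
import time
from collections import defaultdict

class ProgressCache:
    """Per-job message cache: each job gets its own queue, and the whole
    queue expires a fixed period after its last message (5 minutes in
    ReCodEx; configurable here for illustration)."""

    def __init__(self, ttl: float = 300.0):
        self.ttl = ttl
        self._queues = defaultdict(list)
        self._last_seen = {}

    def push(self, job_id: str, message: dict) -> None:
        self._queues[job_id].append(message)
        self._last_seen[job_id] = time.monotonic()

    def replay(self, job_id: str) -> list:
        # WebSocket clients may obtain the cached messages multiple times.
        return list(self._queues.get(job_id, []))

    def evict_expired(self) -> None:
        now = time.monotonic()
        for job_id, seen in list(self._last_seen.items()):
            if now - seen > self.ttl:
                del self._queues[job_id], self._last_seen[job_id]
```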
|
|
|
|
|
|
Publishing of the results
|
|
|
-------------------------
|
|
|
|
|
|
After the job finishes, the worker packs the results directory into a single
archive and uploads it to the fileserver over HTTP. The target URL is
obtained from the API in the headers at job initiation. Then a "job done"
notification request is sent to the API via the broker. Special submissions
(reference or asynchronous ones) are loaded immediately; other types are
loaded on demand upon the first request for the results.
|
|
|
|
|
|
Loading the results means fetching the archive from the fileserver, parsing
the main YAML file generated by the worker and saving the data to the
database. Points are then assigned by the score calculator.
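
As an illustration, a score calculator can be as simple as a weighted average
of per-test correctness ratios. This is a sketch of the general idea only;
the actual calculator in ReCodEx is driven by its own configuration:

```python
def weighted_score(test_results: dict, weights: dict) -> float:
    # Each test yields a correctness ratio in [0, 1]; its contribution to
    # the final score is proportional to its configured weight.
    total = sum(weights.values())
    return sum(test_results[name] * weight
               for name, weight in weights.items()) / total
```

The resulting score in [0, 1] can then be multiplied by the maximum points
configured on the assignment.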
|
|
|
|
|
|
|
|
|
# User documentation
|
|
|
|
|
|
@todo: Describe different scenarios of the usage of the Web App
|
|
|
|
|
|
@todo: Describe the requirements of running the web application (modern web browser, enabled CSS, JavaScript, Cookies & Local storage)
|
|
|
|
|
|
## Terminology
|
|
|
|
|
|
@todo: Describe the terminology: Instance, User, Group, Student,
|
|
|
Supervisor, Admin
|
|
|
|
|
|
## General basics
|
|
|
|
|
|
@todo: actions which are available for all users
|
|
|
|
|
|
@todo: how to solve problems with ReCodEx, first supervisors, then administrators, etc...
|
|
|
|
|
|
### First steps in ReCodEx
|
|
|
|
|
|
You can create an account if you click on the “*Create account*” menu
|
|
|
item in the left sidebar. You can choose between two types of
|
|
|
registration methods – by creating a local account with a specific
|
|
|
password, or pairing your new account with an existing CAS UK account.
|
|
|
|
|
|
If you decide to create a new “*local*” account using the “*Create ReCodEx
|
|
|
account*” form, you will have to provide your details and choose a
|
|
|
password for your account. You will later sign in using your email
|
|
|
address as your username and the password you select.
|
|
|
|
|
|
If you decide to use the CAS UK, then we will verify your credentials
|
|
|
and access your name and email stored in the system and create your
|
|
|
account based on this information. You can change your personal
|
|
|
information or email later on the “*Settings*” page.
|
|
|
|
|
|
In both cases, when creating your account, you must select an instance your
|
|
|
account will belong to by default. The instance you will select will be
|
|
|
most likely your university or other organization you are a member of.
|
|
|
|
|
|
To log in, go to the homepage of ReCodEx and in the left sidebar choose
|
|
|
the menu item “*Sign in*”. Then you must enter your credentials into one
|
|
|
of the two forms – if you selected a password during registration, then
|
|
|
you should sign in with your email and password in the first form called
|
|
|
“*Sign into ReCodEx*”. If you registered using the Charles University
|
|
|
Authentication Service (CAS), you should put your student’s number and
|
|
|
your CAS password into the second form called “Sign into ReCodEx using
|
|
|
CAS UK”.
|
|
|
|
|
|
There are several options you can edit in your user account:
|
|
|
|
|
|
- changing your personal information (i.e., name)
|
|
|
- changing your credentials (email and password)
|
|
|
- updating your preferences (e.g., source code viewer/editor settings,
|
|
|
default language)
|
|
|
|
|
|
You can access the settings page through the “*Settings*” button right
|
|
|
under your name in the left sidebar.
|
|
|
|
|
|
If you don’t use ReCodEx for a whole day, you will be logged out
|
|
|
automatically. However, we recommend you sign out of the application
|
|
|
after you have finished your interaction with it. The logout button is placed
|
|
|
in the top section of the left sidebar right under your name. You will
|
|
|
have to expand the sidebar with a button next to the “*ReCodEx*” title
|
|
|
(shown in the picture below).
|
|
|
|
|
|
### Forgotten password
|
|
|
|
|
|
If you can’t remember your password and you don’t use CAS UK
|
|
|
authentication, then you can reset your password. You will find a link
|
|
|
saying “*You cannot remember what your password was? Reset your
|
|
|
password.*” under the sign in form. After you click on this link, you
|
|
|
will be asked to submit your email address. An email with a link
|
|
|
containing a special token will be sent to the address you fill in. This is
how we make sure that the person who requested the password reset is really
|
|
|
you. When you click on the link (or you copy & paste it into your web
|
|
|
browser) you will be able to select a new password for your account. The
|
|
|
token is valid only for a couple of minutes, so do not forget to reset
|
|
|
the password as soon as possible, or you will have to request a new link
|
|
|
with a valid token.
|
|
|
|
|
|
If you sign in through CAS UK, then please follow the instructions
|
|
|
provided by the administrators of the service described on their
|
|
|
website.
|
|
|
|
|
|
|
|
|
## Student
|
|
|
|
|
|
@todo: describe what it means to be a “student” and what are the
|
|
|
student’s rights
|
|
|
|
|
|
### Join group and start solving assignments
|
|
|
|
|
|
@todo: How to join a specific group
|
|
|
|
|
|
@todo: Where can the user see groups description and details, what
|
|
|
information is available.
|
|
|
|
|
|
@todo: Where the student can find the list of the assignment he is
|
|
|
expected to solve, what is the first and second deadline.
|
|
|
|
|
|
@todo: How does a student submit his solution through the web app
|
|
|
|
|
|
@todo: When the results are ready and what the results mean and what to
|
|
|
do about them, when the user is convinced, that his solution is correct
|
|
|
although the results say different
|
|
|
|
|
|
@todo: Describe the comments thread behavior (public/private comments),
|
|
|
who else can see the comments, how notifications work (*not implemented
|
|
|
yet*!).
|
|
|
|
|
|
|
|
|
## Group supervisor
|
|
|
|
|
|
@todo: describe what it means to be a “supervisor” of a group and what
are the supervisor’s rights
|
|
|
|
|
|
### Create groups and manage them
|
|
|
|
|
|
@todo: How does a user become a supervisor of a group?
|
|
|
|
|
|
@todo: How to add a specific student to a given group
|
|
|
|
|
|
### Assigning exercises
|
|
|
|
|
|
@todo: Describe how to access the database of the exercises and what are
|
|
|
the possibilities of assignment setup – availability, deadlines, points,
|
|
|
score configuration, limits
|
|
|
|
|
|
@todo: How can I assign some exercises only to some students of the group? Describe how to achieve this using subgroups
|
|
|
|
|
|
### Students' solutions management
|
|
|
|
|
|
@todo Describe where all the students’ solutions for a given assignment
|
|
|
can be found, where to look for all solutions of a given student, how to
|
|
|
see results of a specific student’s solution’s evaluation result.
|
|
|
|
|
|
@todo Can I assign points to my students’ solutions manually instead of depending on automatic scoring? If and how to change the score of a solution – assignment
|
|
|
settings, setting points, bonus points, accepting a solution (*not
|
|
|
implemented yet!*). Describe how the student and supervisor will still
|
|
|
be able to see the percentage received from the automatic scoring, but
|
|
|
the awarded points will be overridden.
|
|
|
|
|
|
@todo: Describe the comments thread behavior (public/private comments),
|
|
|
who else can see the comments -- same as from the student perspective
|
|
|
|
|
|
### Creating exercises
|
|
|
|
|
|
@todo: how to create exercise, what has to be provided during creation, who can create exercises
|
|
|
|
|
|
@todo: Describe the form and explain the concept of reference solutions.
|
|
|
How to evaluate the reference solutions for the exercise right now (to
|
|
|
get the up-to-date information).
|
|
|
|
|
|
|
|
|
## Group administrator
|
|
|
|
|
|
@todo: who is this?
|
|
|
|
|
|
### Creating subgroups and managing supervisors
|
|
|
|
|
|
@todo: What it means to create a subgroup and how to do it.
|
|
|
|
|
|
@todo: who can add another supervisor, what would be the rights of the
|
|
|
second supervisor
|
|
|
|
|
|
|
|
|
## Superadministrator
|
|
|
|
|
|
The superadmin is the user with the most privileges and as such should be a
rather unique role. Ideally, there should be only one account of this kind,
used with special caution and adequate security. With this stated, it is
obvious that the superadmin can perform any action the API is capable of.
|
|
|
|
|
|
### Users management
|
|
|
|
|
|
There are only a few roles to which users can belong in ReCodEx; in fact, there
are only three: _student_, _supervisor_, and _superadmin_. The base role is
student, which is assigned to every registered user. Roles are stored in the
database alongside other information about the user. A user always has exactly
one role at a time. At the first startup of ReCodEx, the administrator should
create his account and then change its role in the database by hand. After
that, manual intervention in the database should never be needed.
|
|
|
|
|
|
There is a little catch in group and instance management. Groups can have
admins and supervisors. This setting is valid only for one particular group and
therefore has to be kept separate from the basic role system. This implies that
a supervisor in one group can be a student in another and simultaneously have
the global supervisor role. Changing the role from student to supervisor and
back is done automatically by the application and should not be managed by hand
in the database! The same applies to instances, except that instances can only
have admins.
|
|
|
|
|
|
Roles description:
|
|
|
|
|
|
- Student -- The default role, used for newly created accounts. A student can
  join or leave public groups and submit solutions of assigned exercises.
- Supervisor -- Inherits all permissions from the student role. A supervisor
  can manage the groups he/she belongs to: view and change group details,
  manage assigned exercises, and view the students in the group together with
  their solutions of assigned exercises. On top of that, a supervisor can also
  create and delete groups, but only as subgroups of groups he/she belongs to.
- Superadmin -- Inherits all permissions from the supervisor role. The most
  powerful user in ReCodEx, who is able to do everything the application
  provides.
|
|
|
|
|
|
|
|
|
## Writing job configuration
|
|
|
|
|
|
To run and evaluate an exercise, the backend needs to know the steps of how to
do that. These steps differ for each environment (operating system, programming
language, etc.), so each environment needs its own configuration.
|
|
|
|
|
|
The backend works with a powerful, but quite low-level description of simple
connected tasks written in YAML syntax. More about the syntax and a general
task overview can be found on a [separate
page](https://github.com/ReCodEx/wiki/wiki/Assignments). One of the planned
features was a user-friendly configuration editor, but due to the tight
deadline and the team composition it did not make it into the first release.
However, writing the configuration in the basic format will always be
available and allows users to use the full expressive power of the system.
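As a rough sketch of what such a file looks like (the `example-job` name and
the placeholder task ids here are purely illustrative), a job configuration is
a YAML document with a _submission_ header followed by a list of tasks under
the _tasks_ section:

```{.yml}
submission:
    job-id: example-job      # illustrative job name
    hw-groups:
        - group1
tasks:
    - task-id: "first-task"
      # task body (cmd, sandbox, limits, ...)
    - task-id: "second-task"
      dependencies:
          - first-task
      # task body
```

The rest of this section builds such a file step by step for a concrete
exercise.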
|
|
|
|
|
|
This section walks through the creation of a job configuration for a _hello
world_ exercise. The goal is to compile the file _source.c_ and check whether
it prints `Hello World!` to the standard output. This is the only test case;
let us call it **A**.
|
|
|
|
|
|
The problem can be split into several tasks:
|
|
|
|
|
|
- compile _source.c_ into _helloworld_ with `/usr/bin/gcc`
|
|
|
- run _helloworld_ and save standard output into _out.txt_
|
|
|
- fetch predefined output (suppose it is already uploaded to fileserver) with
|
|
|
hash `a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b` to _reference.txt_
|
|
|
- compare _out.txt_ and _reference.txt_ by `/usr/bin/diff`
|
|
|
|
|
|
The absolute paths of the tools can be obtained from the system administrator.
However, `/usr/bin/gcc` is the location where the GCC binary is available on
almost every system, so the locations of some tools can be (professionally)
guessed.
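Such a guess can be checked from a shell on a machine similar to the workers;
the `command -v` builtin prints where a tool is installed (the exact paths
naturally differ between systems):

```{.sh}
# print the absolute path of each tool on this particular system;
# gcc may live elsewhere than /usr/bin/gcc on some distributions
command -v diff
command -v gcc || echo "gcc is not installed on this machine"
```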
|
|
|
|
|
|
First, write header of the job to the configuration file.
|
|
|
|
|
|
```{.yml}
|
|
|
submission:
|
|
|
    job-id: hello-world-job
|
|
|
hw-groups:
|
|
|
- group1
|
|
|
```
|
|
|
|
|
|
Basically it means that the job _hello-world-job_ needs to be run on workers
that belong to the `group1` hardware group. Reference files are downloaded from
the default location configured in the API (such as
`http://localhost:9999/exercises`) unless explicitly stated otherwise. The job
execution log will not be saved to the result archive.
|
|
|
|
|
|
Next, the tasks have to be constructed under the _tasks_ section. In this demo
job, every task depends only on the previous one. The first task has the input
file _source.c_ (as submitted by the user) already available in the working
directory, so it just calls GCC. Compilation runs in the sandbox like any other
external program and should have relaxed time and memory limits. In this
scenario, the worker defaults are used. If the compilation fails, the whole job
is immediately terminated (because the _fatal-failure_ bit is set). Because the
_bound-directories_ option in the sandbox limits section is mostly shared
between all tasks, it can be set in the worker configuration instead of the job
configuration (suppose this for the following tasks). For the configuration of
workers, please contact your administrator.
|
|
|
|
|
|
```{.yml}
|
|
|
- task-id: "compilation"
|
|
|
type: "initiation"
|
|
|
fatal-failure: true
|
|
|
cmd:
|
|
|
bin: "/usr/bin/gcc"
|
|
|
args:
|
|
|
- "source.c"
|
|
|
- "-o"
|
|
|
- "helloworld"
|
|
|
sandbox:
|
|
|
name: "isolate"
|
|
|
limits:
|
|
|
- hw-group-id: group1
|
|
|
chdir: ${EVAL_DIR}
|
|
|
bound-directories:
|
|
|
- src: ${SOURCE_DIR}
|
|
|
dst: ${EVAL_DIR}
|
|
|
mode: RW
|
|
|
```
|
|
|
|
|
|
The compiled program is executed with time and memory limits set and its
standard output redirected to a file. This task depends on the _compilation_
task, because the program cannot be executed without being compiled first. It
is important to mark this task with the _execution_ type, so that exceeded
limits will be reported in the frontend.
|
|
|
|
|
|
Time and memory limits set directly for a task have a higher priority than the
worker defaults. One important constraint is that these limits cannot exceed
the limits set by the workers. The worker defaults are present as a safety
measure so that a malformed job configuration cannot block the worker forever.
The worker default limits should be reasonably high, like a gigabyte of memory
and several hours of execution time. For the exact numbers, please contact
your administrator.
|
|
|
|
|
|
It is important to know that if the output of a program (both standard and
error) is redirected to a file, the sandbox disk quotas apply to that file, as
well as to the files created directly by the program. If the outputs are
ignored, they are redirected to `/dev/null`, which means there is no limit on
the output length (as long as the printing fits within the time limit).
|
|
|
|
|
|
```{.yml}
|
|
|
- task-id: "execution_1"
|
|
|
test-id: "A"
|
|
|
type: "execution"
|
|
|
dependencies:
|
|
|
- compilation
|
|
|
cmd:
|
|
|
bin: "helloworld"
|
|
|
sandbox:
|
|
|
name: "isolate"
|
|
|
stdout: ${EVAL_DIR}/out.txt
|
|
|
limits:
|
|
|
- hw-group-id: group1
|
|
|
chdir: ${EVAL_DIR}
|
|
|
time: 0.5
|
|
|
memory: 8192
|
|
|
```
|
|
|
|
|
|
Fetch the sample solution from the file server. The base URL of the file
server is in the header of the job configuration, so only the name of the
required file (its `sha1sum` in our case) is necessary.
|
|
|
|
|
|
```{.yml}
|
|
|
- task-id: "fetch_solution_1"
|
|
|
test-id: "A"
|
|
|
dependencies:
|
|
|
    - execution_1
|
|
|
cmd:
|
|
|
bin: "fetch"
|
|
|
args:
|
|
|
- "a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"
|
|
|
- "${SOURCE_DIR}/reference.txt"
|
|
|
```
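The cryptic file name in the arguments above is just the SHA-1 hash of the
reference file's content, which serves as its name on the file server. When
preparing an exercise, it can be computed locally before uploading, e.g.:

```{.sh}
# the hash of the reference output determines its name on the file server
printf 'Hello World!\n' > reference.txt
sha1sum reference.txt | cut -d' ' -f1
```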
|
|
|
|
|
|
Comparison of the results is quite straightforward. It is important to set the
task type to _evaluation_, so that the return code (0 if the program is
correct, 1 otherwise) is taken into account by the evaluation. We do not set
our own limits, so the default limits are used.
|
|
|
|
|
|
```{.yml}
|
|
|
- task-id: "judge_1"
|
|
|
test-id: "A"
|
|
|
type: "evaluation"
|
|
|
dependencies:
|
|
|
- fetch_solution_1
|
|
|
cmd:
|
|
|
bin: "/usr/bin/diff"
|
|
|
args:
|
|
|
- "out.txt"
|
|
|
- "reference.txt"
|
|
|
sandbox:
|
|
|
name: "isolate"
|
|
|
limits:
|
|
|
- hw-group-id: group1
|
|
|
chdir: ${EVAL_DIR}
|
|
|
```
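The exit code convention this task relies on can be verified in any shell:
`diff` exits with 0 when the files match and with 1 when they differ, which is
exactly what the _evaluation_ task type expects:

```{.sh}
printf 'Hello World!\n' > out.txt
printf 'Hello World!\n' > reference.txt

# identical files: diff prints nothing and exits with 0
diff out.txt reference.txt
echo "exit code: $?"      # prints: exit code: 0
```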
|
|
|
<!---
|
|
|
// vim: set formatoptions=tqn flp+=\\\|^\\*\\s* textwidth=80 colorcolumn=+1:
|
|
|
-->
|
|
|
|