|
|
|
@ -139,19 +139,21 @@ corresponds to his/her privileges. There are user groups reflecting the
|
|
|
|
|
structure of lectured courses.
|
|
|
|
|
|
|
|
|
|
A database of exercises (algorithmic problems) is another part of the project.
|
|
|
|
|
Each exercise consists of a text describing the problem in multiple language
|
|
|
|
|
variants, an evaluation configuration (machine-readable instructions on how to
|
|
|
|
|
evaluate solutions to the exercise) and a set of inputs and reference outputs.
|
|
|
|
|
Exercises are created by instructed privileged users. Assigning an exercise to a
|
|
|
|
|
group means choosing one of the available exercises and specifying additional
|
|
|
|
|
properties: a deadline (optionally a second deadline), a maximum amount of
|
|
|
|
|
points, a configuration for calculating the score, a maximum number of
|
|
|
|
|
submissions, and a list of supported runtime environments (e.g. programming
|
|
|
|
|
languages) including specific time and memory limits for each one.
|
|
|
|
|
Each exercise consists of a text describing the problem (optionally in two
|
|
|
|
|
language variants -- Czech and English), an evaluation configuration
|
|
|
|
|
(machine-readable instructions on how to evaluate solutions to the exercise) and
|
|
|
|
|
a set of inputs and reference outputs. Exercises are created by instructed
|
|
|
|
|
privileged users. Assigning an exercise to a group means choosing one of the
|
|
|
|
|
available exercises and specifying additional properties: a deadline (optionally
|
|
|
|
|
a second deadline), a maximum amount of points, a configuration for calculating
|
|
|
|
|
the score, a maximum number of submissions, and a list of supported runtime
|
|
|
|
|
environments (e.g. programming languages) including specific time and memory
|
|
|
|
|
limits for each one.
|
|
|
|
|
|
|
|
|
|
Typical use cases for supported user roles are the following:
|
|
|
|
|
|
|
|
|
|
- **student**
|
|
|
|
|
- create new user account via registration form
|
|
|
|
|
- join a group
|
|
|
|
|
- get assignments in group
|
|
|
|
|
- submit solution to assignment -- upload one source file and trigger
|
|
|
|
@ -180,10 +182,10 @@ students. Concepts of consecutive steps from source code to final results
|
|
|
|
|
is described in more detail below to give readers a solid overview of what has to
|
|
|
|
|
happen during the evaluation process.
|
|
|
|
|
|
|
|
|
|
First thing users have to do is to submit their solutions through web user
|
|
|
|
|
The first thing students have to do is submit their solutions through the web user
|
|
|
|
|
interface. The system checks assignment invariants (deadlines, count of
|
|
|
|
|
submissions, ...) and stores the submitted file. The runtime environment is
|
|
|
|
|
automatically detected based on input file and a suitable evaluation
|
|
|
|
|
submissions, ...) and stores the submitted code. The runtime environment is
|
|
|
|
|
automatically detected based on input file extension and a suitable evaluation
|
|
|
|
|
configuration variant is chosen (one exercise can have multiple variants, for
|
|
|
|
|
example C and Java languages). This exercise configuration is then used for
|
|
|
|
|
taking care of evaluation process.
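
The detection step described above can be sketched as a simple lookup. This is only an illustration: the mapping `EXTENSION_TO_ENVIRONMENT`, the environment names and the function `detect_environment` are hypothetical and are not taken from the ReCodEx code base.

```python
from pathlib import Path

# Hypothetical mapping from file extension to runtime environment name.
EXTENSION_TO_ENVIRONMENT = {
    ".c": "c-gcc",
    ".java": "java",
}


def detect_environment(filename, supported):
    """Return the runtime environment for a submitted file, or None.

    `supported` is the set of environments allowed by the assignment.
    """
    env = EXTENSION_TO_ENVIRONMENT.get(Path(filename).suffix.lower())
    return env if env in supported else None
```

A submission whose extension is unknown, or whose environment is not enabled for the assignment, yields `None`; in that case the system would presumably reject the submission.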
|
|
|
|
@ -449,20 +451,18 @@ restarted.
|
|
|
|
|
|
|
|
|
|
At this point there is a clear idea how the new system will be used and what are
|
|
|
|
|
the major enhancements for future releases. With this in mind, the overall
|
|
|
|
|
architecture can be sketched. From the previous research, several goals are set
|
|
|
|
|
up for the new project. They mostly reflect drawbacks of the current version of
|
|
|
|
|
CodEx and some reasonable wishes of university users. Most notable features are
|
|
|
|
|
following:
|
|
|
|
|
architecture can be sketched. To sum up, here is a list of key features of the
|
|
|
|
|
new system. They come from previous research of current system's drawbacks,
|
|
|
|
|
reasonable wishes of university users and our major design choices.
|
|
|
|
|
|
|
|
|
|
- modern HTML5 web frontend written in JavaScript using a suitable framework
|
|
|
|
|
- REST API implemented in PHP, communicating with database, evaluation backend
|
|
|
|
|
and a file server
|
|
|
|
|
- REST API communicating with database, evaluation backend and a file server
|
|
|
|
|
- evaluation backend implemented as a distributed system on top of a message
|
|
|
|
|
queue framework (ZeroMQ) with master-worker architecture
|
|
|
|
|
queue framework with master-worker architecture
|
|
|
|
|
- multi-platform worker supporting Linux and Windows environment (latter
|
|
|
|
|
without sandbox, no general purpose suitable tool available yet)
|
|
|
|
|
- evaluation procedure configured in a YAML file, compound of small tasks
|
|
|
|
|
connected into an arbitrary oriented acyclic graph
|
|
|
|
|
- evaluation procedure configured in a human readable text file, compound of
|
|
|
|
|
small tasks connected into an arbitrary oriented acyclic graph
|
|
|
|
|
|
|
|
|
|
The reasons supporting these decisions are explained in the rest of analysis
|
|
|
|
|
chapter. Also a lot of smaller design choices are mentioned including possible
|
|
|
|
@ -532,18 +532,18 @@ is implemented. The relative value is set in percents and is called threshold.
|
|
|
|
|
|
|
|
|
|
Our university has a few partner grammar schools. There was an idea that they
|
|
|
|
|
could use CodEx for teaching informatics classes. To make the setup simple for
|
|
|
|
|
them, all the software and hardware would be provided by university and hosted
|
|
|
|
|
in their datacentre. However, CodEx was not prepared to support this kind of
|
|
|
|
|
usage and no one had time to manage a separate instance. With ReCodEx it is
|
|
|
|
|
possible to offer hosted environment as a service to other subjects. The concept
|
|
|
|
|
we figured out is based on user and group separation inside the system. There
|
|
|
|
|
are multiple _instances_ in the system, which means unit of separation. Each
|
|
|
|
|
instance has its own set of users and groups; exercises can be optionally shared.
|
|
|
|
|
Evaluation backend is common for all instances. To keep track of active
|
|
|
|
|
instances and paying customers, each instance must have a valid _licence_ to
|
|
|
|
|
allow users to submit their solutions. A licence is granted for a defined period of
|
|
|
|
|
time and can be revoked in advance if the subject does not keep the approved terms and
|
|
|
|
|
conditions.
|
|
|
|
|
them, all the software and hardware would be provided by the university as a
|
|
|
|
|
completely ready-to-use remote service. However, CodEx was not prepared to
|
|
|
|
|
support this kind of usage and no one had time to manage a separate instance.
|
|
|
|
|
With ReCodEx it is possible to offer hosted environment as a service to other
|
|
|
|
|
subjects. The concept we figured out is based on user and group separation
|
|
|
|
|
inside the system. There are multiple _instances_ in the system, which means
|
|
|
|
|
unit of separation. Each instance has its own set of users and groups; exercises can
|
|
|
|
|
be optionally shared. Evaluation backend is common for all instances. To keep
|
|
|
|
|
track of active instances and paying customers, each instance must have a valid
|
|
|
|
|
_licence_ to allow users to submit their solutions. A licence is granted for a defined
|
|
|
|
|
period of time and can be revoked in advance if the subject does not keep the approved
|
|
|
|
|
terms and conditions.
|
|
|
|
|
|
|
|
|
|
The main work for the system is to evaluate programming exercises. The exercise
|
|
|
|
|
is quite similar to homework assignment during school labs. When a homework is
|
|
|
|
@ -561,36 +561,6 @@ for every assignment of the same exercise. This separation is natural for all
|
|
|
|
|
users, in CodEx it is implemented in similar way and no other considerable
|
|
|
|
|
solution was found.
|
|
|
|
|
|
|
|
|
|
### Forgotten password
|
|
|
|
|
|
|
|
|
|
With authentication and some sort of dealing with passwords is related a problem
|
|
|
|
|
with forgotten credentials, especially passwords. People easily forget them and
|
|
|
|
|
there has to be some kind of mechanism to retrieve a new password or change the
|
|
|
|
|
old one. Problem is that it cannot be done in totally secure way, but we can at
|
|
|
|
|
least come quite close to it. First, there are absolutely not secure and
|
|
|
|
|
recommendable ways how to handle that, for example sending the old password
|
|
|
|
|
through email. A better, but still not secure solution is to generate a new one
|
|
|
|
|
and again send it through email. This solution was provided in CodEx, users had
|
|
|
|
|
to write an email to administrator, who generated a new password and sent it
|
|
|
|
|
back to the sender. This simple solution could be also automated, but
|
|
|
|
|
administrator had quite a big control over whole process. This might come in
|
|
|
|
|
handy if there could be some additional checkups for example, but on the other
|
|
|
|
|
hand it can be quite time consuming.
|
|
|
|
|
|
|
|
|
|
Probably the best solution which is often used and is fairly secure is
|
|
|
|
|
following. Let us consider only case in which all users have to fill their
|
|
|
|
|
email addresses into the system and these addresses are safely in the hands of
|
|
|
|
|
the right users. When user finds out that he/she does not remember a password,
|
|
|
|
|
he/she requests a password reset and fills in his/her unique identifier; it might
|
|
|
|
|
be email or unique nickname. Based on matched user account the system generates
|
|
|
|
|
unique access token and sends it to user via email address. This token should be
|
|
|
|
|
time limited and usable only once, so it cannot be misused. User then takes the
|
|
|
|
|
token or URL address which is provided in the email and goes to the system's
|
|
|
|
|
appropriate section, where new password can be set. After that user can sign in
|
|
|
|
|
with his/her new password. As previously stated, this solution is quite safe and
|
|
|
|
|
user can handle it on their own, so administrator does not have to worry about it.
|
|
|
|
|
That is the main reason why this approach was chosen to be used.
|
|
|
|
|
|
|
|
|
|
### Evaluation unit executed by ReCodEx
|
|
|
|
|
|
|
|
|
|
One of the bigger requests for the new system is to support a complex
|
|
|
|
@ -623,14 +593,23 @@ so no sandbox needs to be used as in external tasks case.
|
|
|
|
|
For a job evaluation, the tasks need to be executed sequentially in a specified
|
|
|
|
|
order. The idea of running independent tasks in parallel is bad because exact
|
|
|
|
|
time measurement needs controlled environment on target computer with
|
|
|
|
|
minimization of interrupts by other processes. It seems that connecting tasks
|
|
|
|
|
into directed acyclic graph (DAG) can handle all possible problem cases. None of
|
|
|
|
|
the authors, supervisors and involved faculty staff can think of a problem that
|
|
|
|
|
cannot be decomposed into tasks connected in a DAG. The goal of evaluation is
|
|
|
|
|
to satisfy as many tasks as possible. During execution there are sometimes
|
|
|
|
|
multiple choices of next task. To control that, each task can have a priority,
|
|
|
|
|
which is used as a secondary ordering criterion. For better understanding, here
|
|
|
|
|
is a small example.
|
|
|
|
|
minimization of interrupts by other processes. It would be possible to run tasks
|
|
|
|
|
which do not need exact time measurement in parallel, but in this case a
|
|
|
|
|
synchronization mechanism has to be developed to exclude parallelism for
|
|
|
|
|
measured tasks. Usually, there are about four times more unmeasured tasks than
|
|
|
|
|
tasks with time measurement, but measured tasks tend to be much longer. With
|
|
|
|
|
[Amdahl's law](https://en.wikipedia.org/wiki/Amdahl's_law) in mind, the
|
|
|
|
|
parallelism seems not to provide a huge benefit in overall execution speed and
|
|
|
|
|
brings troubles with synchronization. However, if there are speed issues,
|
|
|
|
|
this approach could be reconsidered.
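
To make the Amdahl's law argument concrete, here is a small calculation. The task counts and durations are made-up numbers chosen only to match the "four times more unmeasured tasks, but measured tasks are much longer" description; they are not measurements from ReCodEx.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Overall speedup when only `parallel_fraction` of the work is parallelized."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


# Assumed job: 4 unmeasured tasks of 1 s each and 1 measured task of 16 s.
unmeasured_time = 4 * 1.0
measured_time = 1 * 16.0
parallel_fraction = unmeasured_time / (unmeasured_time + measured_time)  # 0.2

speedup_4_workers = amdahl_speedup(parallel_fraction, 4)  # about 1.18
upper_bound = 1.0 / (1.0 - parallel_fraction)             # 1.25
```

Under these assumptions even an unlimited number of parallel workers could not make the job more than 1.25 times faster, which supports the decision to run tasks sequentially.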
|
|
|
|
|
|
|
|
|
|
It seems that connecting tasks into directed acyclic graph (DAG) can handle all
|
|
|
|
|
possible problem cases. None of the authors, supervisors and involved faculty
|
|
|
|
|
staff can think of a problem that cannot be decomposed into tasks connected in a
|
|
|
|
|
DAG. The goal of evaluation is to satisfy as many tasks as possible. During
|
|
|
|
|
execution there are sometimes multiple choices of next task. To control that,
|
|
|
|
|
each task can have a priority, which is used as a secondary ordering criterion.
|
|
|
|
|
For better understanding, here is a small example.
|
|
|
|
|
|
|
|
|
|
![Task serialization](https://github.com/ReCodEx/wiki/raw/master/images/Assignment_overview.png)
|
|
|
|
|
|
|
|
|
@ -639,20 +618,34 @@ _CompileA_ task is finished, the _RunAA_ task is started (or _RunAB_, but should
|
|
|
|
|
be deterministic by position in configuration file -- tasks stated earlier
|
|
|
|
|
should be executed earlier). The task priorities guarantee that after
|
|
|
|
|
_CompileA_ task all dependent tasks are executed before _CompileB_ task (they
|
|
|
|
|
have higher priority number). For example this is useful to control which files
|
|
|
|
|
are present in a working directory at every moment. To sum up, there are 3
|
|
|
|
|
ordering criteria: dependencies, then priorities and finally position of task in
|
|
|
|
|
configuration. Together, they define an unambiguous linear ordering of all tasks.
|
|
|
|
|
have higher priority number). To sum up, connection of tasks represents
|
|
|
|
|
dependencies and priorities can be used to order unrelated tasks and with this
|
|
|
|
|
provide a total ordering of them. For well written jobs the priorities may not
|
|
|
|
|
be so useful, but they can help control execution order for example to avoid
|
|
|
|
|
situation where each test of the job generates a large temporary file and there
|
|
|
|
|
is a valid execution order which keeps all the temporary files for later
|
|
|
|
|
processing at one time. A better approach is to finish execution of one test,
|
|
|
|
|
clean the big temporary file and proceed with following test. If there is an
|
|
|
|
|
ambiguity in task ordering at this point, they are executed in order of input
|
|
|
|
|
task configuration.
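
The three ordering criteria (dependencies, then priorities, then position in the configuration) can be sketched as a priority-driven topological sort. This is a minimal illustration, not the worker's actual implementation: the task dictionaries, the priority values and the function `order_tasks` are assumptions, and a higher priority number is taken to mean "run earlier", as in the example above.

```python
import heapq


def order_tasks(tasks):
    """Linearize tasks given in configuration order.

    Each task is a dict with 'id', 'priority' and 'deps' (list of unique ids).
    """
    position = {t["id"]: i for i, t in enumerate(tasks)}
    priority = {t["id"]: t["priority"] for t in tasks}
    remaining = {t["id"]: len(t["deps"]) for t in tasks}
    dependents = {t["id"]: [] for t in tasks}
    for t in tasks:
        for dep in t["deps"]:
            dependents[dep].append(t["id"])

    # Ready tasks are keyed by (higher priority first, earlier position first).
    ready = [(-priority[i], position[i], i) for i in remaining if remaining[i] == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, _, task_id = heapq.heappop(ready)
        order.append(task_id)
        for child in dependents[task_id]:
            remaining[child] -= 1
            if remaining[child] == 0:
                heapq.heappush(ready, (-priority[child], position[child], child))

    if len(order) != len(tasks):
        raise ValueError("task dependencies contain a cycle")
    return order


job = [
    {"id": "CompileA", "priority": 2, "deps": []},
    {"id": "RunAA", "priority": 3, "deps": ["CompileA"]},
    {"id": "RunAB", "priority": 3, "deps": ["CompileA"]},
    {"id": "CompileB", "priority": 1, "deps": []},
]
linear_order = order_tasks(job)
```

With these assumed priorities both _RunAA_ and _RunAB_ are scheduled before _CompileB_, and the tie between them is broken by their position in the configuration.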
|
|
|
|
|
|
|
|
|
|
The total linear ordering of tasks can be done easier with just executing them
|
|
|
|
|
in order of input configuration. But this structure cannot handle well cases,
|
|
|
|
|
when a task fails. There is no easy and clean way to tell which task
|
|
|
|
|
should be executed next. However, this issue can be solved with graph structured
|
|
|
|
|
dependencies of the tasks. In graph structure, it is clear that all dependent
|
|
|
|
|
tasks have to be skipped and execution continues with an unrelated task. This is
|
|
|
|
|
the main reason why the tasks are connected in a DAG.
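
The failure handling that the graph structure enables can be sketched as follows. The function `run_job`, the status strings and the `execute` callback are hypothetical names used only for illustration; the sketch assumes the tasks are already linearized so that every task comes after its dependencies.

```python
def run_job(ordered_tasks, deps, execute):
    """Run tasks in order, skipping every task that depends on a failed one.

    `deps` maps a task id to the ids it depends on; `execute` returns True
    on success. Skipping propagates transitively through the graph.
    """
    status = {}
    for task_id in ordered_tasks:
        if any(status[dep] != "ok" for dep in deps.get(task_id, ())):
            status[task_id] = "skipped"
        else:
            status[task_id] = "ok" if execute(task_id) else "failed"
    return status


order = ["CompileA", "RunAA", "RunAB", "CompileB"]
dependencies = {"RunAA": ["CompileA"], "RunAB": ["CompileA"]}
# Pretend that compilation of A fails and everything else succeeds.
result = run_job(order, dependencies, lambda task_id: task_id != "CompileA")
```

Here _RunAA_ and _RunAB_ are skipped because _CompileA_ failed, while the unrelated _CompileB_ is still executed.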
|
|
|
|
|
|
|
|
|
|
For grading there are several important tasks. First, tasks executing submitted
|
|
|
|
|
code need to be checked for time and memory limits. Second, outputs of judging
|
|
|
|
|
tasks need to be checked for correctness (represented by return value or by data
|
|
|
|
|
on standard output) and should not fail on time or memory limits. This division
|
|
|
|
|
can be transparent for backend, each task is executed the same way. But frontend
|
|
|
|
|
must know which tasks from whole job are important and what is their kind. It is
|
|
|
|
|
reasonable to keep this piece of information alongside the tasks in job
|
|
|
|
|
configuration, so each task can have a label about its purpose. Unlabeled tasks
|
|
|
|
|
have an internal type _inner_. There are four categories of tasks:
|
|
|
|
|
on standard output) and should not fail. This division can be transparent for
|
|
|
|
|
backend, each task is executed the same way. But frontend must know which tasks
|
|
|
|
|
from the whole job are important and what kind they are. It is reasonable to keep
|
|
|
|
|
this piece of information alongside the tasks in job configuration, so each task
|
|
|
|
|
can have a label about its purpose. Unlabeled tasks have an internal type
|
|
|
|
|
_inner_. There are four categories of tasks:
|
|
|
|
|
|
|
|
|
|
- _initiation_ -- setting up the environment, compiling code, etc.; for users
|
|
|
|
|
failure means error in their sources which are not compatible with running it
|
|
|
|
@ -724,29 +717,21 @@ what kind of reward for users solutions should be chosen.
|
|
|
|
|
At first let us focus on all kinds of outputs from executed programs within job.
|
|
|
|
|
It goes without saying that supervisors should be able to view almost all outputs
|
|
|
|
|
from solutions if they choose them to be visible and recorded. This feature is
|
|
|
|
|
critical in debugging either whole exercises or users solutions. But should it
|
|
|
|
|
be default behaviour to record every output? Absolutely not, supervisor should
|
|
|
|
|
have a choice to turn it on, but discarding the outputs has to be the default
|
|
|
|
|
option. Even without this functionality a file base around whole ReCodEx system
|
|
|
|
|
can become quite large and on top of that outputs from executed programs can be
|
|
|
|
|
sometimes very extensive. Storing this amount of data is inefficient and
|
|
|
|
|
unnecessary to most of the solutions. However, on supervisor request this
|
|
|
|
|
feature should be available.
|
|
|
|
|
|
|
|
|
|
More interesting question is what should regular users see from execution of
|
|
|
|
|
their solution. Simple answer is of course that they should not see anything
|
|
|
|
|
which is partly true. Outputs from their programs can be anything and users can
|
|
|
|
|
somehow analyze inputs or even redirect them to output. So outputs from
|
|
|
|
|
execution should not be visible at all or under very special circumstances. But
|
|
|
|
|
that is not so straightforward for compilation or other kinds of initiation,
|
|
|
|
|
where it really depends on the particular case. Generally it is quite harmless
|
|
|
|
|
to display user some kind of compilation error which can help a lot during
|
|
|
|
|
troubleshooting. Of course again this kind of functionality should be
|
|
|
|
|
configurable by supervisors and disabled by default. There is also the last kind
|
|
|
|
|
of tasks which can output some information which is evaluation tasks. Output of
|
|
|
|
|
these tasks is somehow important to whole system and again can contain some
|
|
|
|
|
information about inputs or reference outputs. So outputs of evaluation tasks
|
|
|
|
|
should not be visible to regular users too.
|
|
|
|
|
critical in debugging either whole exercises or users solutions. Supervisor
|
|
|
|
|
should have a choice to turn on preserving the data while the default behaviour
|
|
|
|
|
is to discard them to keep a file base around whole ReCodEx system in sensible
|
|
|
|
|
limits.
|
|
|
|
|
|
|
|
|
|
More interesting question is if students should see the logs from execution of
|
|
|
|
|
their solution. The usual approach is to keep this information private because of
|
|
|
|
|
possibility of leaking input data. This may lead students to hack their
|
|
|
|
|
solutions to pass just the ReCodEx testing cases instead of properly solving the
|
|
|
|
|
assigned problem. Martin Mareš strongly recommended to use this strategy of
|
|
|
|
|
hiding sensitive data too, so ReCodEx does. One exception is compilation
|
|
|
|
|
outputs which can help students a lot during troubleshooting. These logs shall
|
|
|
|
|
be visible unless the supervisor decides otherwise. Note that due to a lack of
|
|
|
|
|
frontend developers, this feature was not implemented in the very first release
|
|
|
|
|
of ReCodEx, but will be definitely available in the future.
|
|
|
|
|
|
|
|
|
|
The overall concept of grading solutions was presented earlier. To briefly
|
|
|
|
|
remind that, backend returns only exact measured values (used time and memory,
|
|
|
|
@ -799,7 +784,7 @@ factor. There are several ways how to save structured data:
|
|
|
|
|
- relational database
|
|
|
|
|
|
|
|
|
|
Another important factor is amount and size of stored data. Our guess is about
|
|
|
|
|
1000 users, 100 exercises, 200 assignments per year and 400000 unique solutions
|
|
|
|
|
1000 users, 100 exercises, 200 assignments per year and 200000 unique solutions
|
|
|
|
|
per year. The data are mostly structured and there are a lot of them with the
|
|
|
|
|
same format. For example, there is a thousand of users and each one has the same
|
|
|
|
|
values -- name, email, age, etc. This kind of data is relatively small, name
|
|
|
|
@ -1449,8 +1434,8 @@ of connection with no message loss.
|
|
|
|
|
### API server
|
|
|
|
|
|
|
|
|
|
The API server must handle HTTP requests and manage the state of the application
|
|
|
|
|
in some kind of a database. It must also be able to communicate with the
|
|
|
|
|
backend over ZeroMQ.
|
|
|
|
|
in some kind of a database. It must also be able to communicate with the backend
|
|
|
|
|
over ZeroMQ.
|
|
|
|
|
|
|
|
|
|
We considered several technologies which could be used:
|
|
|
|
|
|
|
|
|
@ -1566,6 +1551,36 @@ including generating the signature and signature verification is done through a
|
|
|
|
|
widely used third-party library which lowers the risk of having a bug in the
|
|
|
|
|
implementation of this critical security feature.
|
|
|
|
|
|
|
|
|
|
#### Forgotten password
|
|
|
|
|
|
|
|
|
|
With authentication and some sort of dealing with passwords is related a problem
|
|
|
|
|
with forgotten credentials, especially passwords. People easily forget them and
|
|
|
|
|
there has to be some kind of mechanism to retrieve a new password or change the
|
|
|
|
|
old one. Problem is that it cannot be done in totally secure way, but we can at
|
|
|
|
|
least come quite close to it. First, there are absolutely not secure and
|
|
|
|
|
recommendable ways how to handle that, for example sending the old password
|
|
|
|
|
through email. A better, but still not secure solution is to generate a new one
|
|
|
|
|
and again send it through email. This solution was provided in CodEx, users had
|
|
|
|
|
to write an email to administrator, who generated a new password and sent it
|
|
|
|
|
back to the sender. This simple solution could be also automated, but
|
|
|
|
|
administrator had quite a big control over whole process. This might come in
|
|
|
|
|
handy if there could be some additional checkups for example, but on the other
|
|
|
|
|
hand it can be quite time consuming.
|
|
|
|
|
|
|
|
|
|
Probably the best solution which is often used and is fairly secure is
|
|
|
|
|
following. Let us consider only case in which all users have to fill their
|
|
|
|
|
email addresses into the system and these addresses are safely in the hands of
|
|
|
|
|
the right users. When user finds out that he/she does not remember a password,
|
|
|
|
|
he/she requests a password reset and fills in his/her unique identifier; it might
|
|
|
|
|
be email or unique nickname. Based on matched user account the system generates
|
|
|
|
|
unique access token and sends it to user via email address. This token should be
|
|
|
|
|
time limited and usable only once, so it cannot be misused. User then takes the
|
|
|
|
|
token or URL address which is provided in the email and goes to the system's
|
|
|
|
|
appropriate section, where new password can be set. After that user can sign in
|
|
|
|
|
with his/her new password. As previously stated, this solution is quite safe and
|
|
|
|
|
user can handle it on their own, so administrator does not have to worry about it.
|
|
|
|
|
That is the main reason why this approach was chosen to be used.
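
The token-based reset described above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the ReCodEx implementation: the class `ResetTokenStore`, the 15 minute lifetime and the choice to store only a hash of the token are all hypothetical.

```python
import hashlib
import secrets
import time

TOKEN_LIFETIME_SECONDS = 15 * 60  # assumed lifetime, not taken from ReCodEx


class ResetTokenStore:
    """Issues password reset tokens that are time limited and usable only once."""

    def __init__(self):
        self._tokens = {}  # sha256(token) -> (user_id, expires_at)

    @staticmethod
    def _digest(token):
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, user_id, now=None):
        """Create a token for the matched account; the caller emails it to the user."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._tokens[self._digest(token)] = (user_id, now + TOKEN_LIFETIME_SECONDS)
        return token

    def redeem(self, token, now=None):
        """Return the user id if the token is valid, consuming the token either way."""
        now = time.time() if now is None else now
        entry = self._tokens.pop(self._digest(token), None)
        if entry is None:
            return None
        user_id, expires_at = entry
        return user_id if now <= expires_at else None
```

A successful `redeem` call would be followed by setting the new password for the returned user. Because the entry is removed on the first attempt, the same token cannot be used twice.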
|
|
|
|
|
|
|
|
|
|
#### Uploading files
|
|
|
|
|
|
|
|
|
|
There are two cases when users need to upload files using the API -- submitting
|
|
|
|
|