
the code without running it; safe, but not very precise) or dynamically (run the
code on testing inputs with checking the outputs against reference ones; needs
sandboxing, but provides good real world experience).
<!--
*Simon*: I am not very sure about the formulation of 'our university' - shouldn't
we rather say 'The Charles University in Prague' instead?
-->
at our university.

## Assignment

The major goal of this project is to create a grading application that will be
used for programming classes at the Faculty of Mathematics and Physics of the
Charles University in Prague. However, the application should be designed in a
modular fashion so that it can be easily extended or modified to make other ways
of using it possible.

The project has a great starting point -- there is an old grading system
currently used at the university (CodEx), so its flaws and weaknesses can be
addressed. Furthermore, many teachers are willing to use and test the new
system. The following requirements were collected both from our personal
experience with CodEx and from teachers' requests.

### Basic grading system requirements:

These are the features which are necessary for any system for evaluating
programming assignments used in any university programming course:

- students can use an intuitive user interface for interaction with the system,
mainly for viewing assigned exercises, uploading their own solutions to the
assignments, and viewing the results of the solutions after an automatic
evaluation is finished
- teachers can create exercises including textual description, sample inputs and
correct reference outputs (for example "sum all numbers from given file and
write the result to the standard output")
- teachers can assign an existing exercise to their class with some specific
properties set (deadlines, etc.)
- teachers can specify their scale of points which will be awarded to the
students depending on the correctness of their solutions (expressed in
percentage points)
- teachers can view all of the solutions their students submitted and also the
results of the evaluations and they can override the automatically assigned
points to the solutions manually
- teachers can see the statistics of their classes and of individual students in
these classes
- administrators can depend on a safe environment in which the students'
solutions will be executed
- administrators can manage users with support of roles (at least two --
_student_ and _supervisor_)

CodEx satisfies all these requirements and a few more that originate from the
way courses are organized at our university -- for example, users have roles
(_student_, _supervisor_ and _administrator_) that determine their capabilities
in the system and students are divided into groups that correspond to lab
groups.

However, further requirements arose during the ten-year lifetime of the old
system. There are not many ways to improve it from the perspective of a student,
but a lot of feature requests came from administrators and supervisors. The
ideas were mostly gathered from meetings with faculty staff involved with the
current system.

### Requested features for the new system:

short survey at universities, programming contests, and other available tools.

## Related work

This is not a complete list of available evaluators, but only a few projects
which are used these days and can be an inspiration for our project. Each
project from the list has a brief description and some key features mentioned.

### CodEx

several drawbacks. The main ones are:

### Progtest

[Progtest](https://progtest.fit.cvut.cz/) is a private project of [FIT
ČVUT](https://fit.cvut.cz) in Prague. As far as we know, it is used for C/C++
and Bash programming and for knowledge-based quizzes. There are several bonus
points and penalties and also a few hints about what is failing in the submitted
solution. It is very strict on source code quality, enforcing for example the
`-pedantic` option of GCC, Valgrind for memory leaks, or array bounds checks
via the `mudflap` library.

### Codility

recruiters and also some universities.

## ReCodEx goals

None of the existing systems we came across is capable of all the required
features of the new system. There is no grading system which is designed to
support a complicated evaluation pipeline, so this part is an unexplored field
and has to be designed with caution. Also, no project is modern and extensible
enough to be used as a base for ReCodEx. After considering all these facts, it
was clear that a new system had to be written from scratch. This implies that
only a subset of all the features will be implemented in the first version, and
the rest in the following ones.

Gathered features are categorized based on priorities for the whole system. The
highest priority is given to the main functionality similar to the current
CodEx. It is a base
side) and command-line submit tool. Plagiarism detection is not likely to be
part of any release in the near future unless someone else provides the engine.
The detection problem is too hard to be solved as part of this project.

We named the project **ReCodEx -- ReCodEx Code Examiner**. The name should
point to the old CodEx, but also reflect the new approach to solving issues.
**Re** as part of the name means redesigned, rewritten, renewed, or restarted.

At this point there is a clear idea of how the new system will be used and what
the major enhancements for future releases are. With this in mind, the overall
architecture can be sketched. From the previous research, we set up several
goals which the new system should meet. They mostly reflect drawbacks of the
current version of CodEx and some reasonable wishes of university users. The
most notable features are the following:

- modern HTML5 web frontend written in JavaScript using a suitable framework
- REST API implemented in PHP, communicating with the database, the evaluation
backend and a file server
- evaluation backend implemented as a distributed system on top of a message
queue framework (ZeroMQ) with master-worker architecture <!-- @todo: WTF is
worker??? The concept has not been introduced yet! -->
- worker with basic support of the Windows environment (without sandbox, no
general purpose suitable tool available yet)
- evaluation procedure configured in a YAML file, composed of small tasks
connected into an arbitrary directed acyclic graph
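
To illustrate the last point, a minimal sketch of such an evaluation
configuration might look as follows. All task names, keys, and values here are
illustrative assumptions, not the actual ReCodEx configuration format:

```yaml
# Hypothetical sketch -- key names are assumptions, not the real format.
tasks:
  - name: compile
    cmd: /usr/bin/gcc
    args: [solution.c, -o, solution]
  - name: run-test-01
    dependencies: [compile]        # an edge of the acyclic task graph
    cmd: ./solution
    sandbox:
      time-limit: 2                # seconds
      memory-limit: 65536          # kilobytes
  - name: judge-test-01
    dependencies: [run-test-01]    # judging depends on the run
    cmd: /usr/bin/diff
    args: [output.txt, reference.txt]
```

Each task depends only on the tasks it names, so any ordering that respects the
`dependencies` edges is a valid evaluation order.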

The whole system is intended to help both teachers (supervisors) and students.
To achieve this, it is crucial to keep in mind typical usage scenarios of the
system and try to make these typical tasks as simple as possible.

The system has a database of users. Each user has an assigned role, which
corresponds to his/her privileges. A user can log in via email and password or
through the university system. There are groups of users, which correspond to
the lectured courses. Groups can be hierarchically ordered to reflect additional
metadata such as the academic year. For example, a reasonable group hierarchy
can look like this:
```
Summer term 2016
```

In this example, students are members of the leaf groups; the higher level
groups are just for keeping the related groups together. The hierarchy tree can
be modified and altered to fit specific needs of the university or any other
organization; even a flat structure (i.e., no hierarchy) is possible.

One user can be part of multiple groups and one group can of course have
multiple users. Each user in a group also has a specific role for the given
group. A privileged user (supervisor) can assign a new exercise in his/her
group, change assignment details, view results of other users and manually
change them. A normal user (student) can join a group, get a list of assigned
exercises, view assignment details, submit his/her solution and view the
results of the evaluation.

Database of exercises (algorithmic problems) is another part of the project.
Each exercise consists of a text in multiple language variants, an evaluation
configuration, and a set of inputs and reference outputs. Exercises are created
by instructed privileged users. Assigning an exercise to a group means choosing
one of the available exercises and specifying additional properties. An
assignment has a deadline (optionally a second deadline), a maximum amount of
points, a configuration for calculating the final score, a maximum number of
submissions, and a list of supported runtime environments (e.g., programming
languages) including specific time and memory limits for the sandboxed tasks.

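
As a rough illustration, the properties of an assignment could be modeled as
follows. This is a hypothetical sketch; all names and fields are assumptions
based on the description above, not the real ReCodEx data model:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class RuntimeLimits:
    """Limits applied to sandboxed tasks of one runtime environment."""
    time_limit_s: float
    memory_limit_kb: int


@dataclass
class Assignment:
    """Hypothetical model of the assignment properties described above."""
    exercise_id: str
    deadline: datetime
    max_points: int
    max_submissions: int
    second_deadline: Optional[datetime] = None  # the optional second deadline
    # supported runtime environments mapped to their specific limits
    runtime_limits: dict[str, RuntimeLimits] = field(default_factory=dict)
```

A supervisor assigning an exercise would then fill in one such record, e.g.
`Assignment("sum-numbers", datetime(2016, 5, 1), 10, 50)` plus per-language
limits.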

#### Exercise evaluation chain

The most important part of the system is the evaluation of the solutions
submitted by the users for their assigned exercises. The consecutive steps from
the source code of a solution to its results are described here in terms of a
two-layer architecture -- presentation (frontend) and executive (backend).


The first thing users have to do is to submit their solutions to the _frontend_,
which provides an interface to upload files and then submit them. It checks the
assignment invariants (deadlines, number of submissions, ...) and stores the
submitted files. The runtime environment is automatically detected based on the
input files and a suitable exercise configuration variant is chosen (one
exercise can have multiple variants, for example for the C and Java languages).
The matching exercise configuration is then sent to the _backend_ alongside the
solution source files.
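
The automatic detection step can be imagined roughly like this -- a simplified
sketch assuming, purely for illustration, that detection is driven by file
extensions (the mapping itself is an assumption):

```python
# Illustrative mapping from file extensions to runtime environments.
EXTENSION_TO_RUNTIME = {
    ".c": "c-gcc",
    ".cpp": "cxx-gcc",
    ".java": "java",
    ".py": "python3",
}


def detect_runtime(filenames):
    """Pick the runtime environment matching all submitted files,
    or raise if the files are mixed or of an unknown type."""
    runtimes = set()
    for name in filenames:
        ext = name[name.rfind("."):] if "." in name else ""
        runtime = EXTENSION_TO_RUNTIME.get(ext)
        if runtime is None:
            raise ValueError(f"unknown file type: {name}")
        runtimes.add(runtime)
    if len(runtimes) != 1:
        raise ValueError("files belong to different runtime environments")
    return runtimes.pop()
```

The detected runtime then selects which configuration variant of the exercise
(e.g., C or Java) is sent to the backend.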

The _backend_ can have multiple engines to allow processing more jobs in
parallel, and a load balancer, which tracks the states of incoming jobs and
schedules them. The decision is made based on the capabilities of each engine
and also on the job requirements. When a match is found, the job is held until
the particular engine is jobless and can receive an evaluation request.
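
The matching decision of the load balancer can be sketched as follows. The
names are hypothetical and real matching may involve more attributes, but the
core idea is comparing a job's requirements against each engine's capabilities:

```python
import collections

# Each engine advertises a set of capabilities and an idle/busy state.
Engine = collections.namedtuple("Engine", "name capabilities busy")


def pick_engine(engines, job_requirements):
    """Return the first idle engine whose capabilities cover the job's
    requirements, or None when the job has to wait in the queue."""
    for engine in engines:
        if not engine.busy and job_requirements <= engine.capabilities:
            return engine
    return None
```

When `pick_engine` returns `None`, the job is held until a matching engine
becomes jobless, as described above.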

Job processing itself starts with obtaining the source files and the job
configuration. The configuration is parsed into small tasks, each with a simple
piece of work. Evaluation proceeds in the order given by the tasks. It is
crucial to keep the executive computer secure and stable, so an isolated
sandboxed environment is used when dealing with unknown source code. When the
execution is finished, the results are uploaded back to the _frontend_.
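
Since the tasks form an acyclic graph, their evaluation can be modeled as a
topological walk over that graph. The following is a simplified sketch: the
task structure is an assumption and the sandboxed execution is stubbed out as a
plain callable:

```python
from graphlib import TopologicalSorter


def run_job(tasks):
    """Run tasks in an order that respects their dependencies.

    `tasks` maps a task name to (dependencies, action); each action is a
    callable standing in for the real work done inside a sandbox.
    """
    graph = {name: deps for name, (deps, _) in tasks.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        _, action = tasks[name]
        results[name] = action()  # real system: sandboxed execution
    return results
```

For a compile -> run -> judge chain this guarantees, for example, that the run
task never starts before compilation has finished.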

The _frontend_ is immediately notified about the finished job. The outcomes are
parsed and the results of the important tasks (comparing actual and expected
outputs) are saved into storage. Also, points are calculated depending on the
solution correctness and the assignment configuration. The data presented back
to the users include an overview of which parts succeeded and which failed
(optionally with a reason like "memory limit exceeded") and the amount of
awarded points.

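The final step -- turning correctness into awarded points -- could be sketched
like this, assuming (hypothetically) that correctness is a percentage and the
assignment configuration simply scales it to the maximum points, with an
optional minimal threshold:

```python
def award_points(correctness_percent, max_points, min_percent=0):
    """Scale solution correctness (0-100 %) to assignment points.

    `min_percent` models a hypothetical threshold below which no points are
    awarded; real scoring configurations may be more elaborate.
    """
    if correctness_percent < min_percent:
        return 0
    return round(max_points * correctness_percent / 100)
```

The actual formula is part of the assignment's score configuration mentioned
earlier; this sketch only shows the simplest proportional case.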
# Analysis
