master
Petr Stefan 8 years ago
parent 6bc5d3cdaf
commit 96ea9370cb
No known key found for this signature in database
GPG Key ID: B1D74F2C9C7433D3

@ -78,7 +78,7 @@ the code without running it; safe, but not very precise) or dynamically (run the
code on test inputs and check the outputs against reference ones; needs
sandboxing, but provides good real-world experience).

<!---
*Simon*: I am not very sure about the formulation of 'our university' - shouldn't
we rather say 'The Charles University in Prague' instead?
-->
@ -94,15 +94,16 @@ at our university.
## Assignment

The major goal of this project is to create a grading application that will be
used for programming classes at the Faculty of Mathematics and Physics of the
Charles University in Prague. However, the application should be designed in a
modular fashion so that it can be easily extended or modified to make other ways
of using it possible.

The project has a great starting point -- there is an old grading system
currently used at the university (CodEx), so its flaws and weaknesses can be
addressed. Furthermore, many teachers are willing to use and test the new
system. The following requirements were collected both from our personal
experience with CodEx and from teachers' requests.

### Basic grading system requirements:
@ -110,35 +111,38 @@ These are the features which are necessary for any system for evaluation of
coding assignments used in any university programming course:

- students can use an intuitive user interface for interaction with the system,
  mainly for viewing assigned exercises, uploading their own solutions to the
  assignments, and viewing the results of the solutions after an automatic
  evaluation is finished
- teachers can create exercises including a textual description, sample inputs
  and correct reference outputs (for example "sum all numbers from a given file
  and write the result to the standard output")
- teachers can assign an existing exercise to their class with some specific
  properties set (deadlines, etc.)
- teachers can specify their scale of points which will be awarded to the
  students depending on the correctness of their solutions (expressed in
  percentage points)
- teachers can view all of the solutions their students submitted and also the
  results of the evaluations, and they can manually override the automatically
  assigned points
- teachers can see the statistics of their classes and of individual students in
  these classes
- administrators can depend on a safe environment in which the students'
  solutions will be executed
- administrators can manage users with support of roles (at least two --
  _student_ and _supervisor_)
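The point scale requirement above can be illustrated with a minimal sketch. It assumes that a teacher's scale is a mapping from a minimum correctness percentage to the points awarded from that percentage upwards; the function name `points_for` and the example scale are hypothetical and are not taken from CodEx or ReCodEx.

```python
def points_for(correctness, scale):
    """Return the points of the highest threshold reached by `correctness`.

    `correctness` is a percentage from 0 to 100. `scale` maps a minimum
    percentage to the points awarded from that percentage upwards
    (an assumed representation, chosen only for this sketch).
    """
    reached = [threshold for threshold in scale if correctness >= threshold]
    if not reached:
        return 0  # no threshold reached, so no points are awarded
    return scale[max(reached)]


# Hypothetical scale: half of the tests passed earns 5 points, and so on.
example_scale = {50: 5, 80: 8, 100: 10}
```

A threshold table like this is only one possible design; a linear scale (points proportional to correctness) would satisfy the same requirement.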

CodEx satisfies all these requirements and a few more that originate from the
way courses are organized at our university -- for example, users have roles
(_student_, _supervisor_ and _administrator_) that determine their capabilities
in the system and students are divided into groups that correspond to lab
groups.

However, further requirements arose during the ten year long lifetime of the old
system. There are not many ways to improve it from the perspective of a student,
but a lot of feature requests came from administrators and supervisors. The
ideas were mostly gathered from meetings with faculty staff involved with the
current system.

### Requested features for the new system:
@ -173,8 +177,8 @@ short survey at universities, programming contests, and other available tools.
## Related work

This is not a complete list of available evaluators, but only a few projects
which are used these days and can be an inspiration for our project. Each
project from the list has a brief description and some key features mentioned.

### CodEx
@ -220,10 +224,11 @@ several drawbacks. The main ones are:
### Progtest

[Progtest](https://progtest.fit.cvut.cz/) is a private project of [FIT
ČVUT](https://fit.cvut.cz) in Prague. As far as we know, it is used for C/C++,
Bash programming and knowledge-based quizzes. There are several bonus points
and penalties and also a few hints about what is failing in the submitted
solution. It is very strict on source code quality, using for example the
`-pedantic` option of GCC, Valgrind for memory leaks, or array boundary checks
via the `mudflap` library.

### Codility
@ -268,13 +273,14 @@ recruiters and also some universities.
## ReCodEx goals

None of the existing systems we came across is capable of all the required
features of the new system. There is no grading system which is designed to
support a complicated evaluation pipeline, so this part is an unexplored field
and has to be designed with caution. Also, no project is modern and extensible
enough to be used as a base for ReCodEx. After considering all these facts,
it was clear that a new system has to be written from scratch. This implies
that only a subset of all the features will be implemented in the first version,
and the others in the following ones.

Gathered features are categorized based on priorities for the whole system. The
highest priority is the main functionality, similar to the current CodEx. It is a base
@ -291,25 +297,25 @@ side) and command-line submit tool. Plagiarism detection is not likely to be
part of any release in the near future unless someone else makes the engine. The
detection problem is too hard to be solved as part of this project.

We named the project **ReCodEx -- ReCodEx Code Examiner**. The name should
point to the old CodEx, but also reflect the new approach to solving issues.
**Re** as part of the name means redesigned, rewritten, renewed, or restarted.

At this point there is a clear idea of how the new system will be used and what
the major enhancements for future releases are. With this in mind, the overall
architecture can be sketched. From the previous research, we set up several
goals which the new system should meet. They mostly reflect drawbacks of the
current version of CodEx and some reasonable wishes of university users. The
most notable features are the following:

- modern HTML5 web frontend written in JavaScript using a suitable framework
- REST API implemented in PHP, communicating with database, evaluation backend
  and a file server
- evaluation backend implemented as a distributed system on top of a message
  queue framework (ZeroMQ) with master-worker architecture <!-- @todo: WTF is
  worker??? The concept has not been introduced yet! -->
- worker with basic support of the Windows environment (without sandbox, no
  general purpose suitable tool available yet)
- evaluation procedure configured in a YAML file, composed of small tasks
  connected into an arbitrary oriented acyclic graph
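
The last goal, small tasks connected into an oriented acyclic graph, can be sketched as follows. This is only an illustration under stated assumptions: the task names are hypothetical, the graph is written as a plain Python dictionary instead of the YAML file, and the ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or newer), which is not necessarily what ReCodEx uses.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task graph: each key is a task name, each value is the set of
# tasks that must finish before that task can start.
tasks = {
    "compile": set(),
    "run_test_1": {"compile"},
    "run_test_2": {"compile"},
    "judge_test_1": {"run_test_1"},
    "judge_test_2": {"run_test_2"},
}


def execution_order(graph):
    """Return the tasks in an order that respects every dependency.

    Raises graphlib.CycleError when the graph is not acyclic.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(tasks)
```

Any order returned this way runs `compile` first and each judging task after the run it depends on; a configuration with a dependency cycle is rejected instead of being evaluated.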
@ -319,11 +325,11 @@ The whole system is intended to help both teachers (supervisors) and students.
To achieve this, it is crucial to keep in mind typical usage scenarios of the
system and try to make these typical tasks as simple as possible.

The system has a database of users. Each user has a role assigned, which
corresponds to his/her privileges. A user can log in via email and password
or using the university system. There are groups of users, which correspond to
the lectured courses. Groups can be hierarchically ordered to reflect additional
metadata such as the academic year. For example, a reasonable group hierarchy
can look like this:

```
@ -337,67 +343,63 @@ Summer term 2016
```

In this example, students are members of the leaf groups; the higher level
groups are just for keeping the related groups together. The hierarchy tree can
be modified and altered to fit specific needs of the university or any other
organization; even a flat structure (i.e., no hierarchy) is possible.
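
A group hierarchy of this kind could be modelled as a simple tree. The sketch below is an assumption made for illustration: the `Group` class, its fields, and all group names except "Summer term 2016" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    children: list = field(default_factory=list)  # nested Group objects
    members: set = field(default_factory=set)     # user names, leaf groups only

    def leaves(self):
        """Return the leaf groups, i.e. the groups that students belong to."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


# Hypothetical hierarchy: a term, one course, and two lab groups.
term = Group("Summer term 2016", children=[
    Group("Example course", children=[
        Group("Lab group A", members={"alice"}),
        Group("Lab group B", members={"bob"}),
    ]),
])

# A flat structure is just a group without children.
flat = Group("Single group", members={"carol"})
```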

One user can be part of multiple groups and one group can of course have
multiple users. Each user in a group also has a specific role for the given
group. A privileged user (supervisor) can assign a new exercise in his/her
group, change assignment details, view results of other users and manually
change them. A normal user (student) can join a group, get the list of assigned
exercises, view assignment details, submit his/her solution and view the results
of the evaluation.

The database of exercises (algorithmic problems) is another part of the project.
Each exercise consists of a text in multiple language variants, an evaluation
configuration and a set of inputs and reference outputs. Exercises are created
by instructed privileged users. Assigning an exercise to a group means
choosing one of the available exercises and specifying additional properties. An
assignment has a deadline (optionally a second deadline), a maximum amount of
points, a configuration for calculating the final score, a maximum number of
submissions, and a list of supported runtime environments (e.g., programming
languages) including specific time and memory limits for the sandboxed tasks.
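
The assignment properties listed above can be sketched as a small data class. The class and field names are hypothetical, only some of the listed properties are modelled, and the rule that the second deadline extends the submission window is an assumption, because the text does not define how the two deadlines interact.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Assignment:
    first_deadline: datetime
    max_points: int
    max_submissions: int
    second_deadline: Optional[datetime] = None

    def accepts(self, submitted_at, previous_submissions):
        """Check the submission count and the (assumed) deadline rule."""
        if previous_submissions >= self.max_submissions:
            return False
        # Assumption: a second deadline, when set, extends the window.
        last_deadline = self.second_deadline or self.first_deadline
        return submitted_at <= last_deadline


assignment = Assignment(
    first_deadline=datetime(2016, 5, 1),
    max_points=10,
    max_submissions=2,
    second_deadline=datetime(2016, 5, 8),
)
```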

#### Exercise evaluation chain

The most important part of the system is the evaluation of the solutions
submitted by the users for their assigned exercises. The consecutive steps from
the source code of a solution to its results are described on an architecture
with two layers -- presentation (frontend) and executive (backend).

The first thing users have to do is to submit their solutions to the _frontend_,
which provides an interface to upload files and then submit them. It checks the
assignment invariants (deadlines, count of submissions, ...) and stores the
submitted files. The runtime environment is automatically detected based on the
input files and a suitable exercise configuration variant is chosen (one
exercise can have multiple variants, for example for the C and Java languages).
The matching exercise configuration is then sent to the _backend_ alongside the
solution source files.

The _backend_ can have multiple engines to allow processing more jobs in
parallel and a load balancer, which tracks the states of incoming jobs and
schedules them. The scheduling decision is made based on the capabilities of
each engine and on the job requirements. When a match is found, the job is held
until the particular engine is idle and can receive an evaluation request.

Job processing itself starts with obtaining the source files and the job
configuration. The configuration is parsed into small tasks, each with a simple
piece of work. The evaluation then proceeds in the order of the tasks. It is
crucial to keep the executing computer secure and stable, so an isolated
sandboxed environment is used when dealing with unknown source code. When the
execution is finished, the results are uploaded back to the _frontend_.

The _frontend_ is immediately notified about the finished job. The outcomes are
parsed and the results of important tasks (comparing actual and expected
results) are saved into storage. Also, points are calculated depending on the
solution correctness and the assignment configuration. The data presented back
to the users include an overview of which parts succeeded and which failed
(optionally with a reason like "memory limit exceeded") and the amount of
awarded points.
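
The scheduling step of the backend can be sketched roughly as follows. This is a simplified illustration, not the ReCodEx implementation: the `Engine` and `Job` classes, the representation of capabilities and requirements as dictionaries, and the first-fit matching rule are all assumptions.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Engine:
    name: str
    capabilities: dict  # hypothetical, e.g. {"env": "c"}
    busy: bool = False


@dataclass
class Job:
    job_id: str
    requirements: dict  # must be a subset of an engine's capabilities


def find_engine(job, engines):
    """Return the first idle engine that satisfies the job, or None."""
    for engine in engines:
        if engine.busy:
            continue
        if all(engine.capabilities.get(key) == value
               for key, value in job.requirements.items()):
            return engine
    return None


def dispatch(pending, engines):
    """Assign every job that has a free matching engine; hold the rest."""
    assigned, held = [], deque()
    while pending:
        job = pending.popleft()
        engine = find_engine(job, engines)
        if engine is None:
            held.append(job)  # stays queued until an engine is free
        else:
            engine.busy = True
            assigned.append((job.job_id, engine.name))
    pending.extend(held)  # held jobs keep their original order
    return assigned
```

With one C engine and one Java engine, two pending C jobs and one Java job lead to one C job being held until the C engine finishes its current work.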
# Analysis # Analysis
