@@ -665,33 +665,34 @@ state.
 
 There are lot of things which deserves discussion concerning results of
 evaluation, how they should be displayed, what should be visible or not and also
-some what kind of reward for users solutions should be chosen.
+what kind of reward for users solutions should be chosen.
 
 At first let us focus on all kinds of outputs from executed programs within job.
 Out of discussion is that supervisors should be able to view almost all outputs
 from solutions if they choose them to be visible and recorded. This feature is
 critical in debugging either whole exercises or users solutions. But should it
 be default behaviour to record every output? Absolutely not, supervisor should
-have choice to turn it on but defaults has to be not record it. Even without
-this functionality can be file base around whole ReCodEx system quite large and
-on top of that outputs from executed programs can be sometimes very extensive.
-Storing this amount of data is purely nonsense to every solution. But if
-requested by supervisor then this feature should be available.
-
-Question what should regular users see from execution of their solution is more
-interesting. Simple answer is of course that they should not see anything which
-is partly true. Outputs from their programs can be anything and users can
-somehow analyse inputs or even redirect them to output. So outputs from
+have a choice to turn it on, but discarding the outputs has to be the default
+option. Even without this functionality a file base around whole ReCodEx system
+can become quite large and on top of that outputs from executed programs can be
+sometimes very extensive. Storing this amount of data is inefficient and
+unnecessary to most of the solutions. However, on supervisor request this
+feature should be available.
+
+More interesting question is what should regular users see from execution of
+their solution. Simple answer is of course that they should not see anything
+which is partly true. Outputs from their programs can be anything and users can
+somehow analyze inputs or even redirect them to output. So outputs from
 execution should not be visible at all or under very special circumstances. But
-that is not so straighforward for compilation or other kinds of initiation.
-Well, this is another story, it really depends on the particular case. But
-generally it is quite harmless to display user some kind of compilation error
-which can help a lot during troubleshooting. Of course again this kind of
-functionality should be configurable by supervisors and disabled by default.
-There is also the last kind of tasks which can output some information which is
-evaluation tasks. Output of these tasks is somehow important to whole system and
-again can contain some information about inputs or reference outputs. So outputs
-of evaluation tasks should not be visible to regular users too.
+that is not so straightforward for compilation or other kinds of initiation,
+where it really depends on the particular case. Generally it is quite harmless
+to display user some kind of compilation error which can help a lot during
+troubleshooting. Of course again this kind of functionality should be
+configurable by supervisors and disabled by default. There is also the last kind
+of tasks which can output some information which is evaluation tasks. Output of
+these tasks is somehow important to whole system and again can contain some
+information about inputs or reference outputs. So outputs of evaluation tasks
+should not be visible to regular users too.
 
 The overall concept of grading solutions was presented earlier. To briefly
 remind that, backend returns only exact measured values (used time and memory,
@@ -702,19 +703,20 @@ implemented and any sort of magic can return the final value.
 
 We found out several computational possibilities. There is basic arithmetic,
 weighted arithmetic, geometric and harmonic mean of results of each test (the
-result is boolean succeeded/failed, optionaly has weight), some kind of
+result is logical value succeeded/failed, optionally with weight), some kind of
 interpolation of used amount of time for each test, the same with used memory
 amount and surely many others. To keep the project simple, we decided to design
-apropriate interface and implement only weighted arithmetic mean computation,
-which is used in about 90% of all assignments. Of course, diferent scheme can be
-chosen for every assignment and also configured -- for example test weights can
-be specified for implemented weighted arithmetic mean. Advanced ways of
-computation can be implemented on demand when is a real demand for them.
+appropriate interface and implement only weighted arithmetic mean computation,
+which is used in about 90% of all assignments. Of course, different scheme can
+be chosen for every assignment and also can be configured -- for example
+specifying test weights for implemented weighted arithmetic mean. Advanced ways
+of computation can be implemented on demand when there is a real demand for
+them.
 
 To avoid assigning points for insufficient solutions (like only printing "File
 error" which is the valid answer in two tests), a minimal point threshold can be
-specified. It he solution is to get less points than specified, it will get zero
-points instead. This functionality can be empedded into grading computation
+specified. It the solution is to get less points than specified, it will get
+zero points instead. This functionality can be embedded into grading computation
 algoritm itself, but it would have to be present in each implementation
 separately, which is a bit ugly. So, this feature is separated from point
 computation.
@@ -722,14 +724,14 @@ computation.
 Automatic grading cannot reflect all aspects of submitted code. For example,
 structuring the code, number and quality of comments and so on. To allow
 supervisors bring these manually checked things into grading, there is a concept
-of bonus points. They can be positive or negative. Generaly the solution with
-the most assigned points is marked for grading that particular solution.
+of bonus points. They can be positive or negative. Generally the solution with
+the most assigned points is marked for grading that particular assignment.
 However, if supervisor is not satisfied with student solution (really bad code,
-cheating, ...) he/she assigns the student negative bonus points. But to prevent
-chosing another solution with more points by the system or even submitting the
-same code again which is worth more points by students, supervisor can mark a
-particular solution as marked and used for grading instead of solution with the
-most points.
+cheating, ...) he/she assigns the student negative bonus points. To prevent
+overriding this decision by system choosing another solution with more points or
+even student submitting the same code again which evaluates to more points,
+supervisor can mark a particular solution as marked and used for grading instead
+of solution with the most points.
 
 ### Persistence
 
@@ -752,19 +754,19 @@ databases or formatted plain files (CSV for example) fits best for them.
 However, the data often have to support find operation, so they have to be
 sorted and allow random access for resolving cross references. Also, addition a
 deletion of entries should take reasonable time (at most logaritmic time
-complexity to number of saved values). This practicaly excludes plain files, so
+complexity to number of saved values). This practically excludes plain files, so
 relational database is used instead.
 
 On the other hand, there are some data with no such great structure and much
 larger size. These can be evaluation logs, sample input files for exercises or
-submited sources by students. Saving this kind of data into relational database
+submitted sources by students. Saving this kind of data into relational database
 is not suitable, but it is better to keep them as ordinary files or store them
 into some kind of NoSQL database. Since they are already files and does not need
 to be backed up in multiple copies, it is easier to keep them as ordinary files
 in filesystem. Also, this solution is more lightweight and does not require
 additional dependencies on third-party software. File can be identified using
 its filesystem path or unique index stored as value in relational database. Both
-approaches are equaly good, final decission depends on actual case.
+approaches are equally good, final decision depends on actual case.
 
 
 ## Structure of the project
@@ -856,10 +858,10 @@ also possible to develop their own frontend with their own user management
 system for their specific needs and use the possibilities of the backend without
 any changes, as was mentioned in the previous paragraphs.
 
-One possible configuration of ReCodEx system is ilustrated on following picture,
-where thete is one shared backend with three workers and two separate instances
-of whole frontend. This configuration may be suitable for MFF UK -- basic
-programming course and KSP competition. But maybe even sharing web API and
+One possible configuration of ReCodEx system is illustrated on following
+picture, where there is one shared backend with three workers and two separate
+instances of whole frontend. This configuration may be suitable for MFF UK --
+basic programming course and KSP competition. But maybe even sharing web API and
 fileserver with only custom instances of client (web app or own implementation)
 is more likely to be used. Note, that connections between components are not
 fully accurate.
@@ -1068,7 +1070,7 @@ as network communication which is quite handy and solves problems with threads
 synchronization and such.
 
 At this point we have worker with two internal parts listening one and execution
-one. Implementation of first one is quite straighforward and clear. So lets
+one. Implementation of first one is quite straightforward and clear. So lets
 discuss what should be happening in execution subsystem. Jobs as work units can
 quite vary and do completely different things, that means configuration and
 worker has to be prepared for this kind of generality. Configuration and its
@@ -1342,7 +1344,7 @@ development.
 Users want to view real time evaluation progress of their solution. It can be
 easily done with established double-sided connection stream, but it is hard to
 achieve with web technologies. HTTP protocol works differently on separate
-requests basis with no longterm connection. However, there is widely used
+requests basis with no long term connection. However, there is widely used
 technology to solve this problem, WebSocket protocol.
 
 Working with WebSocket protocol from the backend is possible, but not ideal from
@@ -1376,7 +1378,7 @@ following picture.
 ![Message flow inside montior](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
 
 The message channel inputing the monitor uses ZeroMQ as main message framework
-used by backend. This decision keeps rest of backend avare of used
+used by backend. This decision keeps rest of backend aware of used
 communication protocol and related libraries. Output channel is WebSocket as a
 protocol for sending messages to web browsers. In Python, there are several
 WebSocket libraries. The most popular one is `websockets` in cooperation with
@@ -1403,8 +1405,8 @@ We considered several technologies which could be used:
 the features we need when some additional extensions are installed (to support
 LDAP or ZeroMQ).
 - ASP.NET (C#), JSP (Java) -- these technologies are very robust and are used to
-create server technologies in many big enterpises. Both can run on Windows and
-Linux servers (ASP.NET using the .NET Core).
+create server technologies in many big enterprises. Both can run on Windows
+and Linux servers (ASP.NET using the .NET Core).
 - JavaScript (Node.js) -- it is a quite new technology and it is being used to
 create REST APIs lately. Applications running on Node.js are quite performant
 and the number of open-source libraries available on the Internet is very