Structure

master
Petr Stefan 8 years ago
parent 3b520d0e35
commit b5e2750e15

@ -85,10 +85,10 @@ outputs ones; provides good real world experience, but requires extensive
security measures).
This project focuses on the machine-controlled part of source code evaluation.
First, general concepts of grading systems are observed, new requirements are
specified and project with similar functionality are examined. Also, problems of
the software previously used at Charles University in Prague are briefly
discussed. With acquired knowledge from such projects in production, we set up
First, general concepts of grading systems are observed and problems of the
software previously used at Charles University in Prague are briefly discussed.
Then new requirements are specified and projects with similar functionality are
examined. With acquired knowledge from such projects in production, we set up
goals for the new evaluation system, designed the architecture and implemented a
fully operational solution based on dynamic evaluation. The system is now ready
for production testing at the university.
@ -110,7 +110,10 @@ consists of following basic steps:
4. compare program outputs with predefined values
5. award the code with a numeric score
The project has a great starting point -- there is an old grading system
The whole system is intended to help both teachers (supervisors) and students.
To achieve this, it is crucial to keep in mind the typical usage scenarios of
the system and to try to make these tasks as simple as possible. To fulfil this
task, the project has a great starting point -- there is an old grading system
currently used at the university (CodEx), so its flaws and weaknesses can be
addressed. Furthermore, many teachers desire to use and test the new system and
they are willing to consult ideas or problems during development with us.
@ -131,10 +134,6 @@ window to submit their solutions. Each solution is compiled and run in sandbox
and memory limits. It supports programs written in C, C++, C#, Java, Pascal,
Python and Haskell.
The whole system is intended to help both teachers (supervisors) and students.
To achieve this, it is crucial to keep in mind the typical usage scenarios of
the system and to try to make these tasks as simple as possible.
The system has a database of users. Each user is assigned a role, which
corresponds to his/her privileges. There are user groups reflecting the
structure of lectured courses.
@ -155,18 +154,58 @@ Typical use cases for supported user roles are following:
- **student**
- join a group
- get assignments in group
- submit solution to assignment
- view solution results
- submit solution to assignment -- upload one source file and trigger
evaluation process
- view solution results -- which parts succeeded and failed, total number of
acquired points, bonus points
- **supervisor**
- create exercise
- assign exercise to group, modify assignment
- create exercise -- create description text and evaluation configuration
(for each programming environment), upload testing inputs and outputs
- assign exercise to group -- choose exercise and set deadlines, number of
allowed submissions, weights of all testing cases and amount of points for
correct solutions
- modify assignment
- view all results in group
- alter automatic solution grading
- check automatic solution grading -- view submitted source and optionally
set bonus points
- **administrator**
- create groups
- alter user privileges
- alter user privileges -- make supervisor accounts
- check system logs, upgrades and other management
### Exercise evaluation chain
The most important part of the system is the evaluation of solutions submitted by
students. The consecutive steps from source code to final results are described
in more detail below to give readers a solid overview of what has to happen
during the evaluation process.
The first thing users have to do is submit their solutions through the web user
interface. The system checks assignment invariants (deadlines, count of
submissions, ...) and stores the submitted file. The runtime environment is
automatically detected based on the input file and a suitable evaluation
configuration variant is chosen (one exercise can have multiple variants, for
example C and Java languages). This exercise configuration then drives the
evaluation process.
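As a minimal sketch of the variant selection described above, the following Python snippet picks a configuration variant from the submitted file name. The text does not say how the runtime environment is detected; matching on the file extension, the `EXTENSION_TO_ENVIRONMENT` table and the `choose_variant` helper are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical mapping from file extension to runtime environment name.
EXTENSION_TO_ENVIRONMENT = {".c": "c", ".java": "java"}


def choose_variant(filename, variants):
    """Return the evaluation configuration variant for the submitted file.

    `variants` maps an environment name to that exercise's configuration.
    """
    suffix = Path(filename).suffix.lower()
    environment = EXTENSION_TO_ENVIRONMENT.get(suffix)
    if environment is None or environment not in variants:
        raise ValueError(f"no evaluation configuration for {filename!r}")
    return variants[environment]
```

An exercise with C and Java variants would accept `solution.c` and reject a file in an unsupported language before any job is created.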
There is a pool of uniform worker engines dedicated to evaluation jobs. Incoming
jobs are kept in a queue until a free worker picks them up. Each worker
evaluates jobs sequentially, one at a time.
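The queueing behaviour can be illustrated with a small Python sketch. It uses threads in a single process as a stand-in for separate worker engines; `run_pool`, `worker_loop` and the `evaluate` callback are hypothetical names, and this is not meant to describe how the real system distributes jobs.

```python
import queue
import threading


def worker_loop(jobs, results, evaluate):
    """Take jobs from the shared queue and evaluate them one at a time."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more jobs for this worker
            break
        results.append((job, evaluate(job)))


def run_pool(job_list, evaluate, worker_count=2):
    """Evaluate all jobs using a pool of identical workers."""
    jobs = queue.Queue()
    results = []
    workers = [
        threading.Thread(target=worker_loop, args=(jobs, results, evaluate))
        for _ in range(worker_count)
    ]
    for worker in workers:
        worker.start()
    for job in job_list:
        jobs.put(job)  # a job waits here until a free worker picks it up
    for _ in workers:
        jobs.put(None)  # one sentinel per worker
    for worker in workers:
        worker.join()
    return results
```

Because all workers are uniform, any free worker may take the next job, so the order of results is not guaranteed.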
The worker obtains the solution and its evaluation configuration, parses it and
starts executing the contained instructions. It is crucial to keep the worker
computer secure and stable, so a sandboxed environment is used for dealing with
unknown source code. When the execution is finished, results are saved and the
submitter is notified.
The output of the worker contains data about the evaluation, such as time and
memory spent on running the program for each test input and whether its output
was correct. The system then calculates a numeric score from this data, which is
presented to the student. If the solution is wrong (incorrect output, excessive
memory use, ...), error messages are also displayed to the submitter.
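One way the score calculation could look is sketched below in Python. The text only says that a numeric score is computed from the worker output; the weighted-fraction formula, the `CaseResult` record and the limit parameters are assumptions chosen to match the test weights and point amounts a supervisor sets on an assignment.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Hypothetical per-test record produced by a worker."""
    name: str
    weight: float
    output_correct: bool
    time_seconds: float
    memory_kib: int


def score(results, max_points, time_limit_seconds, memory_limit_kib):
    """Return (points, error_messages) for one evaluated solution."""
    total_weight = sum(r.weight for r in results)
    earned_weight = 0.0
    errors = []
    for r in results:
        if not r.output_correct:
            errors.append(f"{r.name}: incorrect output")
        elif r.time_seconds > time_limit_seconds:
            errors.append(f"{r.name}: time limit exceeded")
        elif r.memory_kib > memory_limit_kib:
            errors.append(f"{r.name}: memory limit exceeded")
        else:
            earned_weight += r.weight
    if total_weight == 0:
        return 0.0, errors
    # Assumed formula: points scale with the weighted share of passed tests.
    return max_points * earned_weight / total_weight, errors
```

With this formula a test case with a higher weight contributes proportionally more points, and every failed case yields one message for the submitter.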
### Weaknesses
The current system is old, but robust. There were no major security incidents
during its production usage. However, from today's perspective there are
several drawbacks. The main ones are:
@ -193,37 +232,6 @@ several drawbacks. The main ones are:
which have a more difficult evaluation chain than simple
compilation/execution/evaluation provided by CodEx.
### Exercise evaluation chain
The most important part of the system is evaluation of solutions submitted by
students. Concepts of consecutive steps from source code to final results
is described in more detail below to give readers solid overview of what have to
happen during evaluation process.
First thing users have to do is to submit their solutions through some user
interface. Then, the system checks assignment invariants (deadlines, count of
submissions, ...) and stores submitted files. The runtime environment is
automatically detected based on input files and a suitable evaluation
configuration variant is chosen (one exercise can have multiple variants, for
example C and Java languages). This exercise configuration is then used for
taking care of evaluation process.
There is a pool of worker computers dedicated to evaluation jobs. Each one of
them can support different environments and programming languages to allow
testing programs for as many platforms as possible. Incoming jobs are scheduled
to a worker that is capable of running the job.
The worker obtains the solution and its evaluation configuration, parses it and
starts executing the contained instructions. It is crucial to keep the worker
computer secure and stable, so a sandboxed environment is used for dealing with
unknown source code. When the execution is finished, results are saved and the
submitter is notified.
The output of the worker contains data about the evaluation, such as time and
memory spent on running the program for each test input and whether its output
was correct. The system then calculates a numeric score from this data, which is
presented to the student. If the solution is wrong (incorrect output, uses too
much memory,..), error messages are also displayed to the submitter.
## Requirements
@ -292,9 +300,9 @@ addons (mostly administrative features).
another tool and perform additional tests
- use of modern technologies with state-of-the-art compilers
### Nonfunctional requirements
### Non-functional requirements
Nonfunctional requirements are requirements of technical character with no
Non-functional requirements are requirements of technical character with no
direct mapping to visible parts of the system. In an ideal world, users should not know
about these if they work properly, but would be at least annoyed if these
requirements were not met. Most notably they are these ones:
@ -317,14 +325,14 @@ extendable, so everyone can develop their own feature. This also means that
widely used programming languages and techniques should be used, so users can
quickly understand the code and make changes.
## Related work
To find out the current state of automatic grading systems, we did a short
market survey of tools used at universities, programming contests, and possibly
other places where similar tools are available.
## Related work
This is not a complete list of available evaluators, but only a few projects
which are used these days and can be an inspiration for our project. Each
project from the list has a brief description and some key features mentioned.
