introduction and intended usage polishing

master
Teyras 8 years ago
parent fbcfe96261
commit 543794bd49

@ -61,22 +61,28 @@ knowledge are more suitable for this practical type of learning than others, and
fortunately, programming is one of them. fortunately, programming is one of them.
University education system is one of the areas where this knowledge can be University education system is one of the areas where this knowledge can be
applied. In computer programming, there are several requirements such as the applied. In computer programming, there are several requirements a program
code being syntactically correct, efficient and easy to read, maintain and should satisfy, such as the code being syntactically correct, efficient and easy
extend. Correctness and efficiency can be tested automatically to help teachers to read, maintain and extend.
save time for their research, but reviewing bad design, bad coding habits and
logical mistakes is really hard to automate and requires manpower. Checking programs written by students takes time and requires a lot of
mechanical, repetitive work -- reviewing source codes, compiling them and
Checking programs written by students takes a lot of time and requires a lot of running them through testing scenarios. It is therefore desirable to automate as
mechanical, repetitive work. The first idea of an automatic evaluation system much of this work as possible. The first idea of an automatic evaluation system
comes from Stanford University professors in 1965. They implemented a system comes from Stanford University professors in 1965. They implemented a system
which evaluated code in Algol submitted on punch cards. In following years, many which evaluated code in Algol submitted on punch cards. In following years, many
similar products were written. similar products were written.
There are two basic ways of automatically evaluating code -- statically (check In today's world, properties like correctness and efficiency can be tested
the code without running it; safe, but not very precise) or dynamically (run the automatically to a large extent. This fact should be exploited to help teachers
code on testing inputs with checking the outputs against reference ones; needs save time for tasks such as examining bad design, bad coding habits and logical
sandboxing, but provides good real world experience). mistakes, which are difficult to perform automatically.
There are two basic ways of automatically evaluating code -- statically
(checking the source code without running it; safe, but not very precise) or
dynamically (running the code on test inputs and checking the correctness of
the outputs; provides good real world experience, but requires extensive
security measures).
This project focuses on the machine-controlled part of source code evaluation. This project focuses on the machine-controlled part of source code evaluation.
First, general concepts of grading systems are observed, new requirements are First, general concepts of grading systems are observed, new requirements are
@ -84,8 +90,8 @@ specified and project with similar functionality are examined. Also, problems of
the software previously used at Charles University in Prague are briefly the software previously used at Charles University in Prague are briefly
discussed. With acquired knowledge from such projects in production, we set up discussed. With acquired knowledge from such projects in production, we set up
goals for the new evaluation system, designed the architecture and implemented a goals for the new evaluation system, designed the architecture and implemented a
fully operational solution. The system is now ready for production testing at fully operational solution based on dynamic evaluation. The system is now ready
the university. for production testing at the university.
## Assignment ## Assignment
@ -95,13 +101,14 @@ Charles University in Prague. However, the application should be designed in a
modular fashion to be easily extended or even modified to make other ways of modular fashion to be easily extended or even modified to make other ways of
usage possible. usage possible.
The system should be capable of dynamic analysis of programming code. It means, The system should be capable of dynamic analysis of submitted source codes. This
that following four basic steps have to be supported: consists of the following basic steps:
1. compile the code and check for compilation errors 1. compile the code and check for compilation errors
2. run compiled binary in a sandbox with predefined inputs 2. run compiled binary in a sandbox with predefined inputs
3. check constraints on used amount of memory and time 3. check constraints on used amount of memory and time
4. compare program outputs with predefined values 4. compare program outputs with predefined values
5. award the code with a numeric score
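
The five steps above can be illustrated with a minimal sketch. It is not the project's implementation: the function name `evaluate`, the use of Python scripts as "solutions", and the scoring rule (fraction of passed tests) are assumptions made for illustration. The sketch does not sandbox the program and checks only the time limit, not memory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def evaluate(source: str, tests: list[tuple[str, str]], time_limit: float = 2.0) -> float:
    """Return the fraction of tests passed (0.0 if the code does not compile)."""
    # Step 1: "compile" the code and check for errors (here: a Python syntax check).
    try:
        compile(source, "<solution>", "exec")
    except SyntaxError:
        return 0.0

    passed = 0
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "solution.py"
        script.write_text(source)
        for stdin_data, expected in tests:
            try:
                # Step 2: run on a predefined input (NOT sandboxed in this sketch).
                # Step 3: enforce the time limit (a memory limit is omitted here).
                result = subprocess.run(
                    [sys.executable, str(script)],
                    input=stdin_data,
                    capture_output=True,
                    text=True,
                    timeout=time_limit,
                    cwd=workdir,
                )
            except subprocess.TimeoutExpired:
                continue  # time limit exceeded, the test fails
            # Step 4: compare the program output with the reference value.
            if result.returncode == 0 and result.stdout.strip() == expected.strip():
                passed += 1
    # Step 5: award a numeric score.
    return passed / len(tests) if tests else 0.0
```

A real deployment would replace the plain `subprocess.run` call with a sandboxed execution, since the submitted code is untrusted.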
The project has a great starting point -- there is an old grading system The project has a great starting point -- there is an old grading system
currently used at the university (CodEx), so its flaws and weaknesses can be currently used at the university (CodEx), so its flaws and weaknesses can be
@ -111,14 +118,14 @@ they are willing to consult ideas or problems during development with us.
### Intended usage ### Intended usage
The whole system is intended to help both teachers (supervisors) and students. The whole system is intended to help both teachers (supervisors) and students.
To achieve this, it is crucial to keep in mind typical usage scenarios of the To achieve this, it is crucial to keep in mind the typical usage scenarios of
system and try to make these tasks as simple as possible. the system and to try to make these tasks as simple as possible.
The system has a database of users. Each user has assigned a role, which The system has a database of users. Each user is assigned a role, which
corresponds to his/her privileges. There are user groups reflecting structure of corresponds to his/her privileges. There are user groups reflecting the
lectured courses. Groups can be hierarchically ordered to reflect additional structure of lectured courses. Groups can be hierarchically ordered to reflect
metadata such as the academic year. For example, a reasonable group hierarchy additional metadata such as the academic year. For example, a reasonable group
can look like this: hierarchy could look like this:
``` ```
Summer term 2016 Summer term 2016
@ -130,22 +137,22 @@ Summer term 2016
... ...
``` ```
In this example, students are members of the leaf groups, the higher level In this example, students are members of the leaf groups and the higher level
entities are just for keeping the related groups together. The hierarchy nodes are just for keeping related groups together. The structure can be
structure can be modified and altered to fit specific needs of the university or modified and altered to fit specific needs of the university or any other
any other organization, even the flat structure (i.e., no hierarchy) is organization, even a flat structure is possible. One user can be a member of
possible. One user can be part of multiple groups and on the other hand one multiple groups and have a different role in each of them (a student can attend
group can have multiple users. Each user can have a specific role for every labs for several courses while also teaching one).
group in which is a member, overriding his/her default role in this context.
A database of exercises (algorithmic problems) is another part of the project.
Database of exercises (algorithmic problems) is another part of the project. Each exercise consists of a text describing the problem in multiple language
Each exercise consists of a text in multiple language variants, an evaluation variants, an evaluation configuration (machine-readable instructions on how to
configuration and a set of inputs and reference outputs. Exercises are created evaluate solutions to the exercise) and a set of inputs and reference outputs.
by instructed privileged users. Assigning an exercise to a group means to Exercises are created by instructed privileged users. Assigning an exercise to a
choose one of the available exercises and specifying additional properties. An group means choosing one of the available exercises and specifying additional
assignment has a deadline (optionally a second deadline), a maximum amount of properties: a deadline (optionally a second deadline), a maximum amount of
points, a configuration for calculating the final score, a maximum number of points, a configuration for calculating the score, a maximum number of
submissions, and a list of supported runtime environments (e.g., programming submissions, and a list of supported runtime environments (e.g. programming
languages) including specific time and memory limits for each one. languages) including specific time and memory limits for each one.
Typical use cases for supported user roles are illustrated on following UML Typical use cases for supported user roles are illustrated on following UML
@ -161,32 +168,29 @@ is described in more detail below to give readers solid overview of what have to
happen during evaluation process. happen during evaluation process.
First thing users have to do is to submit their solutions through some user First thing users have to do is to submit their solutions through some user
interface. Then, the system checks assignment invariants (deadlines, count interface. Then, the system checks assignment invariants (deadlines, count of
of submissions, ...) and stores submitted files. The runtime environment is submissions, ...) and stores submitted files. The runtime environment is
automatically detected based on input files and suitable exercise configuration automatically detected based on input files and a suitable evaluation
variant is chosen (one exercise can have multiple variants, for example C and configuration variant is chosen (one exercise can have multiple variants, for
Java languages). Matching exercise configuration is then used for taking care of example C and Java languages). This exercise configuration is then used for
evaluation process. taking care of evaluation process.
There is a pool of worker computers dedicated to processing jobs. Some of them There is a pool of worker computers dedicated to evaluation jobs. Each one of
may have different environment to allow testing programs in more conditions. them can support different environments and programming languages to allow
Incoming jobs are scheduled to particular worker depending on its capabilities testing programs for as many platforms as possible. Incoming jobs are scheduled
and job requirements. to a worker that is capable of running the job.
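
Matching jobs to capable workers can be sketched as follows. The `Worker` class, the `schedule` function and the "shortest queue wins" tie-breaking rule are hypothetical; the text only states that a job goes to a worker capable of running it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Worker:
    name: str
    environments: set[str]                      # runtime environments this worker supports
    queue: list[str] = field(default_factory=list)


def schedule(job_id: str, environment: str, workers: list[Worker]) -> Optional[Worker]:
    """Assign the job to a capable worker, or return None if there is none."""
    capable = [w for w in workers if environment in w.environments]
    if not capable:
        return None
    # Assumption: prefer the capable worker with the fewest queued jobs.
    chosen = min(capable, key=lambda w: len(w.queue))
    chosen.queue.append(job_id)
    return chosen
```

Returning `None` leaves it to the caller to decide whether an unschedulable job is rejected or kept waiting until a suitable worker appears.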
Job processing itself starts with obtaining source files and job configuration. The worker obtains the solution and its evaluation configuration, parses it and
The configuration is parsed into small tasks with simple piece of work. starts executing the contained instructions. It is crucial to keep the worker
Evaluation itself goes in direction of tasks ordering. It is crucial to keep computer secure and stable, so a sandboxed environment is used for dealing with
executive computer secure and stable, so isolated sandboxed environment is used unknown source code. When the execution is finished, results are saved and the
when dealing with unknown source code. When the execution is finished, results submitter is notified.
are saved.
The output of the worker contains data about the evaluation, such as time and
Results from worker contains only output data from processed tasks (this could memory spent on running the program for each test input and whether its output
be return value, consumed time, ...). On top of that, one value is calculated to was correct. The system then calculates a numeric score from this data, which is
express overall quality of the tested job. It is used as points for final presented to the student. If the solution is wrong (incorrect output, uses too
student grading. Calculation method of this value may be different for each much memory, ...), error messages are also displayed to the submitter.
assignment. Data presented back to users include overview of job parts (which
succeeded and which failed, optionally with reason like "memory limit exceeded")
and achieved score (amount of awarded points).
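
As a rough illustration of turning per-test results into a score and error messages, consider the sketch below. The names `CaseResult`, `failure_reason` and `score`, and the rule "points proportional to the number of passed tests, rounded down", are assumptions; the text says only that the calculation method may differ per assignment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseResult:
    name: str
    output_correct: bool
    time_s: float
    memory_kb: int


def failure_reason(r: CaseResult, time_limit_s: float, memory_limit_kb: int) -> Optional[str]:
    """Return a human-readable reason if the test failed, otherwise None."""
    if r.time_s > time_limit_s:
        return "time limit exceeded"
    if r.memory_kb > memory_limit_kb:
        return "memory limit exceeded"
    if not r.output_correct:
        return "incorrect output"
    return None


def score(results: list[CaseResult], max_points: int,
          time_limit_s: float, memory_limit_kb: int) -> tuple[int, list[str]]:
    """Return the awarded points and the messages for failed tests."""
    if not results:
        return 0, []
    messages = []
    passed = 0
    for r in results:
        reason = failure_reason(r, time_limit_s, memory_limit_kb)
        if reason is None:
            passed += 1
        else:
            messages.append(f"{r.name}: {reason}")
    # Assumed rule: points proportional to passed tests, rounded down.
    return max_points * passed // len(results), messages
```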
## Requirements ## Requirements
