|
|
@@ -1,10 +1,10 @@
|
|
|
|
# Introduction
|
|
|
|
# Introduction
|
|
|
|
|
|
|
|
|
|
|
|
Generally, there are many different ways and opinions on how to teach people
|
|
|
|
In general, there are many different ways and opinions on how to teach people
|
|
|
|
something new. However, most people agree that a hands-on experience is one of
|
|
|
|
something new. However, most people agree that a hands-on experience is one of
|
|
|
|
the best ways to make the human brain remember a new skill. Learning must be
|
|
|
|
the best ways to make the human brain remember a new skill. Learning must be
|
|
|
|
entertaining and interactive, with fast and frequent feedback. Some kinds of
|
|
|
|
entertaining and interactive, with fast and frequent feedback. Some areas
|
|
|
|
knowledge are more suitable for this practical type of learning than others, and
|
|
|
|
are more suitable for this practical way of learning than others, and
|
|
|
|
fortunately, programming is one of them.
|
|
|
|
fortunately, programming is one of them.
|
|
|
|
|
|
|
|
|
|
|
|
University education system is one of the areas where this knowledge can be
|
|
|
|
University education system is one of the areas where this knowledge can be
|
|
|
@@ -12,58 +12,62 @@ applied. In computer programming, there are several requirements a program
|
|
|
|
should satisfy, such as the code being syntactically correct, efficient and easy
|
|
|
|
should satisfy, such as the code being syntactically correct, efficient and easy
|
|
|
|
to read, maintain and extend.
|
|
|
|
to read, maintain and extend.
|
|
|
|
|
|
|
|
|
|
|
|
Checking programs written by students takes time and requires a lot of
|
|
|
|
Checking programs written by students by hand takes time and requires a lot of
|
|
|
|
mechanical, repetitive work -- reviewing source codes, compiling them and
|
|
|
|
repetitive work -- reviewing source codes, compiling them and
|
|
|
|
running them through testing scenarios. It is therefore desirable to automate as
|
|
|
|
running them through test scenarios. It is therefore desirable to automate as
|
|
|
|
much of this work as possible. The first idea of an automatic evaluation system
|
|
|
|
much of this process as possible.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The first idea of an automatic evaluation system
|
|
|
|
comes from Stanford University professors in 1965. They implemented a system
|
|
|
|
comes from Stanford University professors in 1965. They implemented a system
|
|
|
|
which evaluated code in Algol submitted on punch cards. In following years, many
|
|
|
|
which evaluated code in Algol submitted on punch cards. In following years, many
|
|
|
|
similar products were written.
|
|
|
|
similar products were written.
|
|
|
|
|
|
|
|
|
|
|
|
In the world of today, properties like correctness and efficiency can be tested
|
|
|
|
Nowadays, properties like correctness and efficiency can be tested
|
|
|
|
automatically to a large extent. This fact should be exploited to help teachers
|
|
|
|
to a large extent automatically. This fact should be exploited to help teachers
|
|
|
|
save time for tasks such as examining bad design, bad coding habits and logical
|
|
|
|
save time for tasks such as examining bad design, bad coding habits, or logical
|
|
|
|
mistakes, which are difficult to perform automatically.
|
|
|
|
mistakes, which are difficult to perform automatically.
|
|
|
|
|
|
|
|
|
|
|
|
There are two basic ways of automatically evaluating code -- statically
|
|
|
|
There are two basic ways of automatically evaluating code:
|
|
|
|
(checking the source code without running it; safe, but not very precise) or
|
|
|
|
|
|
|
|
dynamically (running the code on test inputs and checking the correctness of
|
|
|
|
- **statically** -- by checking the source code without running it.
|
|
|
|
outputs ones; provides good real world experience, but requires extensive
|
|
|
|
This is safe, but not very precise.
|
|
|
|
security measures).
|
|
|
|
- **dynamically** -- by running the code on test inputs and checking the correctness of
|
|
|
|
|
|
|
|
the outputs. This provides good real-world experience, but requires extensive
|
|
|
|
|
|
|
|
security measures.
|
|
|
|
|
|
|
|
|
|
|
|
This project focuses on the machine-controlled part of source code evaluation.
|
|
|
|
This project focuses on the machine-controlled part of source code evaluation.
|
|
|
|
First, general concepts of grading systems are observed and problems of the
|
|
|
|
First, we observed the general concepts of grading systems and discussed the problems of the
|
|
|
|
software previously used at Charles University in Prague are briefly discussed.
|
|
|
|
software previously used at the Charles University in Prague.
|
|
|
|
Then new requirements are specified and projects with similar functionality are
|
|
|
|
Then the new requirements were specified and we examined projects with similar functionality.
|
|
|
|
examined. With acquired knowledge from such projects in production, we set up
|
|
|
|
With the acquired knowledge from these projects, we set up
|
|
|
|
goals for the new evaluation system, designed the architecture and implemented a
|
|
|
|
goals for the new evaluation system, designed the architecture and implemented a
|
|
|
|
fully operational solution based on dynamic evaluation. The system is now ready
|
|
|
|
fully operational solution based on dynamic evaluation. The system is now ready
|
|
|
|
for production testing at the university.
|
|
|
|
for production testing at the university.
|
|
|
|
|
|
|
|
|
|
|
|
## Assignment
|
|
|
|
## Assignment
|
|
|
|
|
|
|
|
|
|
|
|
The major goal of this project is to create a grading application that will be
|
|
|
|
The major goal of this project is to create a grading application which will be
|
|
|
|
used for programming classes at the Faculty of Mathematics and Physics of the
|
|
|
|
used for programming classes at the Faculty of Mathematics and Physics of the
|
|
|
|
Charles University in Prague. However, the application should be designed in a
|
|
|
|
Charles University in Prague. However, the application should be designed in a
|
|
|
|
modular fashion to be easily extended or even modified to make other ways of
|
|
|
|
modular fashion to be easily extended or even modified to make other ways of
|
|
|
|
usage possible.
|
|
|
|
usage possible.
|
|
|
|
|
|
|
|
|
|
|
|
The system should be capable of dynamic analysis of submitted source codes. This
|
|
|
|
The system should be capable of performing dynamic analysis of the submitted source
|
|
|
|
consists of following basic steps:
|
|
|
|
code. This consists of the following basic steps:
|
|
|
|
|
|
|
|
|
|
|
|
1. compile the code and check for compilation errors
|
|
|
|
1. compile the code and check for compilation errors
|
|
|
|
2. run compiled binary in a sandbox with predefined inputs
|
|
|
|
2. run the compiled program in a sandbox with predefined inputs
|
|
|
|
3. check constraints on used amount of memory and time
|
|
|
|
3. check the constraints on the amount of memory and time used
|
|
|
|
4. compare program outputs with predefined values
|
|
|
|
4. compare the outputs of the program with the defined expected outputs
|
|
|
|
5. award the code with a numeric score
|
|
|
|
5. award the solution with a numeric score
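
The five steps above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the implementation described in this project: the names `TestCase`, `run_case` and `evaluate` are hypothetical, the program is run as a plain subprocess rather than in a real sandbox, only the time limit is enforced (a memory limit would need operating system support), and the score is assumed to be the weighted fraction of passed test cases.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class TestCase:
    """Hypothetical test case: a predefined input and its expected output."""
    stdin: str
    expected_stdout: str
    weight: float = 1.0


def run_case(command, case, time_limit_s):
    """Run one test case and report whether it passed.

    NOTE: this is NOT a sandbox; the program runs with the caller's privileges.
    """
    try:
        result = subprocess.run(
            command,
            input=case.stdin,
            capture_output=True,
            text=True,
            timeout=time_limit_s,  # step 3: time constraint only
        )
    except subprocess.TimeoutExpired:
        return False
    if result.returncode != 0:
        return False
    # Step 4: compare the program output with the expected output.
    return result.stdout.strip() == case.expected_stdout.strip()


def evaluate(compile_command, run_command, cases, time_limit_s=2.0):
    """Return a score in [0, 1] as the weighted fraction of passed cases."""
    # Step 1: compile and check for compilation errors (skipped if None).
    if compile_command is not None:
        compiled = subprocess.run(compile_command, capture_output=True, text=True)
        if compiled.returncode != 0:
            return 0.0
    total = sum(case.weight for case in cases)
    # Step 2: run the program on each predefined input.
    passed = sum(
        case.weight for case in cases if run_case(run_command, case, time_limit_s)
    )
    # Step 5: award a numeric score.
    return passed / total if total else 0.0


if __name__ == "__main__":
    # Illustrative "solution": a one-line Python program that sums its input.
    program = [sys.executable, "-c", "print(sum(map(int, input().split())))"]
    cases = [TestCase("2 3\n", "5"), TestCase("1 1\n", "3")]
    print(evaluate(None, program, cases, time_limit_s=10.0))
```

In this sketch a compiled language would pass its compiler invocation as `compile_command` and the resulting binary as `run_command`; an interpreted solution can pass `None` for the compile step.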
|
|
|
|
|
|
|
|
|
|
|
|
The whole system is intended to help both teachers (supervisors) and students.
|
|
|
|
The whole system is intended to help both the teachers (supervisors) and the students.
|
|
|
|
To achieve this, it is crucial to keep in mind the typical usage scenarios of
|
|
|
|
To achieve this, it is crucial for us to keep in mind the typical usage scenarios of
|
|
|
|
the system and to try to make these tasks as simple as possible. To fulfil this
|
|
|
|
the system and to try to make these tasks as simple as possible. To fulfill this
|
|
|
|
task, the project has a great starting point -- there is an old grading system
|
|
|
|
task, the project has a great starting point -- there is an old grading system
|
|
|
|
currently used at the university (CodEx), so its flaws and weaknesses can be
|
|
|
|
currently used at the university (CodEx), so its flaws and weaknesses can be
|
|
|
|
addressed. Furthermore, many teachers desire to use and test the new system and
|
|
|
|
addressed. Furthermore, many teachers desire to use and test the new system and
|
|
|
|
they are willing to consult ideas or problems during development with us.
|
|
|
|
they are willing to discuss ideas and problems with us during the development.
|
|
|
|
|
|
|
|
|
|
|
|
## Current System
|
|
|
|
## Current System
|
|
|
|
|
|
|
|
|
|
|
@@ -71,8 +75,8 @@ The grading solution currently used at the Faculty of Mathematics and Physics of
|
|
|
|
the Charles University in Prague was implemented in 2006 by a group of students.
|
|
|
|
the Charles University in Prague was implemented in 2006 by a group of students.
|
|
|
|
It is called [CodEx -- The Code Examiner](http://codex.ms.mff.cuni.cz/project/)
|
|
|
|
It is called [CodEx -- The Code Examiner](http://codex.ms.mff.cuni.cz/project/)
|
|
|
|
and it has been used with some improvements since then. The original plan was to
|
|
|
|
and it has been used with some improvements since then. The original plan was to
|
|
|
|
use the system only for basic programming courses, but there was a demand for
|
|
|
|
use the system only for the basic programming courses, but there was a demand for
|
|
|
|
adapting it for many different subjects.
|
|
|
|
adapting it for several different courses.
|
|
|
|
|
|
|
|
|
|
|
|
CodEx is based on dynamic analysis. It features a web-based interface, where
|
|
|
|
CodEx is based on dynamic analysis. It features a web-based interface, where
|
|
|
|
supervisors can assign exercises to their students and the students have a time
|
|
|
|
supervisors can assign exercises to their students and the students have a time
|
|
|
@@ -86,8 +90,8 @@ corresponds to his/her privileges. There are user groups reflecting the
|
|
|
|
structure of lectured courses.
|
|
|
|
structure of lectured courses.
|
|
|
|
|
|
|
|
|
|
|
|
A database of exercises (algorithmic problems) is another part of the project.
|
|
|
|
A database of exercises (algorithmic problems) is another part of the project.
|
|
|
|
Each exercise consists of a text describing the problem, an evaluation
|
|
|
|
Each exercise consists of a text describing the problem, a configuration of the
|
|
|
|
configuration (machine-readable instructions on how to evaluate solutions to the
|
|
|
|
evaluation (machine-readable instructions on how to evaluate solutions to the
|
|
|
|
exercise), time and memory limits for all supported runtimes (e.g. programming
|
|
|
|
exercise), time and memory limits for all supported runtimes (e.g. programming
|
|
|
|
languages), a configuration for calculating the final score and a set of inputs
|
|
|
|
languages), a configuration for calculating the final score and a set of inputs
|
|
|
|
and reference outputs. Exercises are created by instructed privileged users.
|
|
|
|
and reference outputs. Exercises are created by instructed privileged users.
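
The parts of an exercise listed above could be modelled roughly as follows. This is a hedged sketch of the data involved, not the CodEx schema: the class and field names (`Exercise`, `Limits`, `evaluation_config`, `score_config`, `tests`) and the units are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Limits:
    """Hypothetical per-runtime limits; the units are assumed."""
    time_s: float
    memory_kib: int


@dataclass
class Exercise:
    """Hypothetical model of one exercise in the database."""
    description: str                  # text describing the problem
    evaluation_config: dict           # machine-readable evaluation instructions
    limits: Dict[str, Limits]         # time and memory limits keyed by runtime
    score_config: dict                # how the final score is calculated
    tests: List[Tuple[str, str]] = field(default_factory=list)  # (input, reference output)

    def limits_for(self, runtime: str) -> Optional[Limits]:
        """Return the limits for a runtime, or None if it is not supported."""
        return self.limits.get(runtime)


exercise = Exercise(
    description="Read two integers and print their sum.",
    evaluation_config={"steps": ["compile", "run", "judge"]},
    limits={"c": Limits(time_s=1.0, memory_kib=65536)},
    score_config={"type": "weighted"},
    tests=[("1 2\n", "3\n")],
)
```

Keeping the limits keyed by runtime mirrors the statement that limits are defined for all supported runtimes; a lookup for an unsupported runtime simply returns `None` in this sketch.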
|
|
|
@@ -96,30 +100,30 @@ and specifying additional properties: a deadline (optionally a second deadline),
|
|
|
|
a maximum amount of points, a maximum number of submissions and a list of
|
|
|
|
a maximum amount of points, a maximum number of submissions and a list of
|
|
|
|
supported runtime environments.
|
|
|
|
supported runtime environments.
|
|
|
|
|
|
|
|
|
|
|
|
Typical use cases for supported user roles are following:
|
|
|
|
The typical use cases for the user roles are as follows:
|
|
|
|
|
|
|
|
|
|
|
|
- **student**
|
|
|
|
- **student**
|
|
|
|
- create new user account via registration form
|
|
|
|
- create a new user account via a registration form
|
|
|
|
- join a group
|
|
|
|
- join groups (e.g., the courses they attend)
|
|
|
|
- get assignments in group
|
|
|
|
- get assignments in the groups
|
|
|
|
- submit solution to assignment -- upload one source file and trigger
|
|
|
|
- submit a solution to an assignment -- upload one source file and start the
|
|
|
|
evaluation process
|
|
|
|
evaluation process
|
|
|
|
- view solution results -- which parts succeeded and failed, total number of
|
|
|
|
- view the results of the solution -- which parts succeeded and failed, the total
|
|
|
|
acquired points, bonus points
|
|
|
|
number of acquired points, bonus points
|
|
|
|
- **supervisor** (similar to CodEx **operator**)
|
|
|
|
- **supervisor** (similar to CodEx *operator*)
|
|
|
|
- create exercise -- create description text and evaluation configuration
|
|
|
|
- create a new exercise -- create description text and evaluation configuration
|
|
|
|
(for each programming environment), upload testing inputs and outputs
|
|
|
|
(for each programming environment), upload testing inputs and outputs
|
|
|
|
- assign exercise to group -- choose exercise and set deadlines, number of
|
|
|
|
- assign an exercise to a group -- choose an exercise and set the deadlines,
|
|
|
|
allowed submissions, weights of all testing cases and amount of points for
|
|
|
|
the number of allowed submissions, the weights of all test cases and the amount
|
|
|
|
correct solutions
|
|
|
|
of points for the correct solutions
|
|
|
|
- modify assignment
|
|
|
|
- modify an assignment
|
|
|
|
- view all results in group
|
|
|
|
- view all of the results of the students in a group
|
|
|
|
- check automatic solution grading -- view submitted source and optionally
|
|
|
|
- review the automatic solution evaluation -- view the submitted source files
|
|
|
|
set bonus points
|
|
|
|
and optionally set bonus points (including negative points)
|
|
|
|
- **administrator**
|
|
|
|
- **administrator**
|
|
|
|
- create groups
|
|
|
|
- create groups
|
|
|
|
- alter user privileges -- make supervisor accounts
|
|
|
|
- alter user privileges -- make supervisor accounts
|
|
|
|
- check system logs, upgrades and other management
|
|
|
|
- check system logs
|
|
|
|
|
|
|
|
|
|
|
|
### Exercise Evaluation Chain
|
|
|
|
### Exercise Evaluation Chain
|
|
|
|
|
|
|
|
|
|
|
|