From 4c9dac022fd63c001529148598efa02b56ce71ec Mon Sep 17 00:00:00 2001 From: Petr Stefan Date: Thu, 5 Jan 2017 23:46:57 +0100 Subject: [PATCH] Intro update --- Rewritten-docs.md | 241 +++++++++++++++++++++++----------------------- 1 file changed, 122 insertions(+), 119 deletions(-) diff --git a/Rewritten-docs.md b/Rewritten-docs.md index bacd44f..55e5d7c 100644 --- a/Rewritten-docs.md +++ b/Rewritten-docs.md @@ -79,41 +79,130 @@ code on testing inputs with checking the outputs against reference ones; needs sandboxing, but provides good real world experience). This project focuses on the machine-controlled part of source code evaluation. -First, the problems of the software used at Charles University in Prague -previously were discussed and similar projects at other educational institutions -were examined. With acquired knowledge from such projects in production, we set -up goals for the new evaluation system, designed the architecture and -implemented a fully operational solution. The system is now ready for production -testing at our university. +First, general concepts of grading systems are reviewed, new requirements are +specified and projects with similar functionality are examined. Also, problems of +the software previously used at Charles University in Prague are briefly +discussed. With knowledge acquired from such projects in production, we set up +goals for the new evaluation system, designed the architecture and implemented a +fully operational solution. The system is now ready for production testing at +the university. ## Assignment The major goal of this project is to create a grading application that will be used for programming classes at the Faculty of Mathematics and Physics of the Charles University in Prague. However, the application should be designed in a -modular fashion so that it can be easily extended or modified to make other ways -of using it possible.
+modular fashion to be easily extended or even modified to make other ways of +usage possible. + +The system should be capable of dynamic analysis of source code. This means +that the following four basic steps have to be supported: + +1. compile the code and check for compilation errors +2. run the compiled binary in a sandbox with predefined inputs +3. check constraints on the amount of memory and time used +4. compare program outputs with predefined values The project has a great starting point -- there is an old grading system currently used at the university (CodEx), so its flaws and weaknesses can be -addressed. Furthermore, many teachers are willing to use and test the new -system. Following requirements were collected both from our personal experience -with CodEx and from teachers' requests. +addressed. Furthermore, many teachers want to use and test the new system and +they are willing to discuss ideas or problems during development. + +### Intended usage + +The whole system is intended to help both teachers (supervisors) and students. +To achieve this, it is crucial to keep in mind typical usage scenarios of the +system and try to make these tasks as simple as possible. + +The system has a database of users. Each user is assigned one role, which +corresponds to his/her privileges. There are user groups reflecting the structure of +lectured courses. Groups can be hierarchically ordered to reflect additional +metadata such as the academic year. For example, a reasonable group hierarchy +can look like this: + +``` +Summer term 2016 +|-- Language C# and .NET platform +|   |-- Labs Monday 10:30 +|   `-- Labs Thursday 9:00 +|-- Programming I +|   |-- Labs Monday 14:00 + ... +``` + +In this example, students are members of the leaf groups; the higher level +entities are just for keeping the related groups together.
The hierarchy +structure can be modified and altered to fit specific needs of the university or +any other organization; even a flat structure (i.e., no hierarchy) is +possible. One user can be a member of multiple groups and, conversely, one +group can have multiple users. Each user has a specific role for every group of +which he/she is a member. + +A database of exercises (algorithmic problems) is another part of the project. +Each exercise consists of a text in multiple language variants, an evaluation +configuration and a set of inputs and reference outputs. Exercises are created +by instructed privileged users. Assigning an exercise to a group means +choosing one of the available exercises and specifying additional properties. An +assignment has a deadline (optionally a second deadline), a maximum amount of +points, a configuration for calculating the final score, a maximum number of +submissions, and a list of supported runtime environments (e.g., programming +languages) including specific time and memory limits for the sandboxed tasks. + +Typical use cases for supported user roles are illustrated in the following picture: + +@todo: UML use case diagram (and improve or delete following paragraph) + +A privileged user (supervisor) can create an exercise, assign it to his/her group, +change assignment details, view the results of his/her students and manually alter +them. A normal user (student) can join a group, get a list of assigned exercises, +view assignment details, submit his/her solution and view the results of the +evaluation. + +#### Exercise evaluation chain + +The most important part of the system is the evaluation of solutions submitted by +students. The consecutive steps from source code to final results +are described in more detail below to give readers a solid overview of what has to +happen during the evaluation process. + +The first thing users have to do is submit their solutions through some user +interface.
Then, the system checks the assignment invariants (deadlines, count +of submissions, ...) and stores the submitted files. The runtime environment is +automatically detected based on the input files and a suitable exercise configuration +variant is chosen (one exercise can have multiple variants, for example for the C and +Java languages). The matching exercise configuration then drives the +evaluation process. + +There is a pool of worker computers dedicated to processing jobs. Some of them +may have a different environment to allow testing programs under different +conditions. Incoming jobs are scheduled to a particular worker depending on its +capabilities and the job requirements. + +Job processing itself starts with obtaining the source files and the job configuration. +The configuration is parsed into small tasks, each performing a simple piece of work. +Evaluation then proceeds in the order of the tasks. It is crucial to keep +the executing computer secure and stable, so an isolated sandboxed environment is used +when dealing with unknown source code. When the execution is finished, the results +are saved. + +Results from a worker contain only output data from the processed tasks (this could +be a return value, consumed time, ...). On top of that, one value is calculated to +express the overall quality of the tested job. It is used as points for final +student grading. The calculation method of this value may be different for each +assignment. Data presented back to users include an overview of job parts (which +succeeded and which failed, optionally with a reason like "memory limit exceeded") +and the achieved score (amount of awarded points). ## Requirements There are bunch of different requirements for the system. Some of them are -features which are necessary for any system for evaluation of programming coding -assignments. Some of them are specific for university deployment and some are -wishes for new features collected for period of CodEx operation.
- -CodEx satisfies all the basic requirements and a few more that originate from -the way courses are organized at university environment -- for example students -are divided into groups that correspond to lab groups. New wishes arose during -the ten year long lifetime of the old system. There are not many ways to improve -it from the perspective of a student, but a lot of feature requests came from -administrators and supervisors. The ideas were mostly gathered from meetings -with faculty staff involved with the current system. +necessary for any system for source code evaluation. Some of them are specific +for university deployment and some of them arose during the ten year long +lifetime of the old system. There are not many ways to improve the CodEx +experience from the perspective of a student, but a lot of feature requests +came from administrators and supervisors. The ideas were gathered mostly from our +personal experience with the system and from meetings with faculty staff +involved with the current system. For clear arragement all the requirements and wishes are presented grouped by categories. @@ -136,9 +225,9 @@ They describe the evaluation system in general and also university addons specific properties set (deadlines, etc.)
- there is a list of submitted solutions for each assignment with corresponding results -- teachers can specify scale of points which will be awarted to the students - depending on the correctness of his/her solution for each assignment extra - (expressed in percentage points) +- teachers can specify the way grading points are computed; the points are awarded + to the students depending on the quality of their solution, separately for each + assignment - teachers can view detailed data about their students (users of a their groups) including all submitted solutions; also, each of the solution can be manually reviewed, commented and assigned additional points (positive or negative) @@ -158,13 +247,11 @@ They describe the evaluation system in general and also university addons mainly for viewing assigned exercises, uploading their own solutions to the assignments, and viewing the results of the solutions after an automatic evaluation is finished; wanted two interfaces are web and command-line based -- administrators can manage users with support of roles (at least two -- - _student_ and _supervisor_) +- user privilege separation (at least two roles -- _student_ and _supervisor_) - logging in through a university authentication system (e.g. LDAP) - SIS (university information system) integration for fetching personal user data -- administrators can depend on a safe environment in which the students' - solutions will be executed +- safe environment in which the students' solutions are executed - support for multiple programming environments at once to avoid unacceptable workload for administrator (maintain separate installations for many courses) and high hardware occupation @@ -263,12 +350,12 @@ Valgrind for memory leaks or array boundaries checks via `mudflap` library. ### Codility [Codility](https://codility.com/) is a web based solution primary targeted to -company recruiters. It is a commercial product available as a SaaS and it supports 16 -programming languages.
The +company recruiters. It is a commercial product available as a SaaS and it +supports 16 programming languages. The [UI](http://1.bp.blogspot.com/-_isqWtuEvvY/U8_SbkUMP-I/AAAAAAAAAL0/Hup_amNYU2s/s1600/cui.png) -of Codility is [opensource](https://github.com/Codility/cui), the rest of -source code is not available. One interesting feature is 'task timeline' -- -captured progress of writing code for each user. +of Codility is [opensource](https://github.com/Codility/cui), the rest of source +code is not available. One interesting feature is 'task timeline' -- captured +progress of writing code for each user. ### CMS @@ -300,90 +387,6 @@ exercises. Kattis is primarily used by programming contest organizators, company recruiters and also some universities. - -### Intended usage - -The whole system is intended to help both teachers (supervisors) and students. -To achieve this, it is crucial to keep in mind typical usage scenarios of the -system and try to make these typical tasks as simple as possible. - -The system has a database of users. Each user has a role assigned, which -correspond to his/her privileges. User can be logged in via email and password -or using the university system. There are groups of users, which corresponds to -the lectured courses. Groups can be hierarchically ordered to reflect additional -metadata such as the academic year. For example, a reasonable group hierarchy -can look like this: - -``` -Summer term 2016 -|-- Language C# and .NET platform -|   |-- Labs Monday 10:30 -|   `-- Labs Thursday 9:00 -|-- Programming I -|   |-- Labs Monday 14:00 - ... - -``` - -In this example, students are members of the leaf groups, the higher level -groups are just for keeping the related groups together. The hierarchy tree can -be modified and altered to fit specific needs of the university or any other -organization, even the flat structure (i.e., no hierarchy) is possible. 
- -One user can be part of multiple groups and also one group can of course have -multiple users. Each user in a group has also a specific role for the given -group. Priviledged user (supervisor) can assign a new exercise in his/her -group, change assignment details, view results of other users and manually -change them. Normal user (student) can join a group, get list of assigned -exercises, view assignment detail, submit his/her solution and view the results -of the evaluation. - -Database of exercises (algorithmic problems) is another part of the project. -Each exercise consists of a text in multiple language variants, an evaluation -configuration and a set of inputs and reference outputs. Exercises are created -by instructed priviledged users. Assigning an exercise to a group means to -choose one of the available exercises and specifying additional properties. An -assignment has a deadline (optionally a second deadline), a maximum amount of -points, a configuration for calculating the final score, a maximum number of -submissions, and a list of supported runtime environemnts (e.g., programming -languages) including specific time and memory limits for the sandboxed tasks. - -#### Exercise evaluation chain - -The most important part of the system is the evaluation of the solutions -submitted by the users for their assigned exercises. Concepts of consecutive -steps from source code of solution to results is described on architecture with -two layer -- presentation (_frontend_) and executive (_backend_). - -First thing users have to do is to submit their solutions to _frontend_ which -provides interface to upload files and then submit them. It checks the -assignment invariants (deadlines, count of submissions, ...) and stores -submitted files. The runtime environment is automatically detected based on -input files and suitable exercise configuration variant is chosen (one exercise -can have multiple variants, for example C and Java languages). 
Matching exercise -configuration is then send to _backend_ alongside solution source files. - -_Backend_ can have multiple engines to allow processing more jobs in parallel -and a loadbalancer, which tracks states of incoming jobs and performs scheduling -of them. The decission is made based on capabilities of each engine and also job -requirements. When a match is found, the job is held until the particular engine -is jobless and can receive an evaluation request. - -Job processing itself stars with obtaining source files and job configuration. -The configuration is parsed into small tasks with simple piece of work. -Evaluation itself goes in direction of tasks ordering. It is crucial to keep -executive computer secure and stable, so isolated sandboxed environment is used -when dealing with unknown source code. When the execution is finished, results -are uploaded back to _frontend_. - -The _frontend_ is immediately notified about finished job. The outcomes are -parsed and results of important tasks (comparing actual and expected results) -saved into storage. Also, points are calculated depending on solution -correctness and assignment configuration. Data presented back to users includes -overview which part succeeded and which failed (optionally with reason like -"memory limit exceeded") and amount of awarded points. - - # Analysis ## ReCodEx goals @@ -1671,6 +1674,6 @@ used. chdir: ${EVAL_DIR} ```