diff --git a/.~lock.Rewritten-docs.md# b/.~lock.Rewritten-docs.md# deleted file mode 100644 index b2d7617..0000000 --- a/.~lock.Rewritten-docs.md# +++ /dev/null @@ -1 +0,0 @@ -,petr,felicity,18.01.2017 13:51,file:///home/petr/.config/libreoffice/4; \ No newline at end of file diff --git a/Rewritten-docs.md b/Rewritten-docs.md index 69692bc..4dfdc6f 100644 --- a/Rewritten-docs.md +++ b/Rewritten-docs.md @@ -139,19 +139,21 @@ corresponds to his/her privileges. There are user groups reflecting the structure of lectured courses. A database of exercises (algorithmic problems) is another part of the project. -Each exercise consists of a text describing the problem in multiple language -variants, an evaluation configuration (machine-readable instructions on how to -evaluate solutions to the exercise) and a set of inputs and reference outputs. -Exercises are created by instructed privileged users. Assigning an exercise to a -group means choosing one of the available exercises and specifying additional -properties: a deadline (optionally a second deadline), a maximum amount of -points, a configuration for calculating the score, a maximum number of -submissions, and a list of supported runtime environments (e.g. programming -languages) including specific time and memory limits for each one. +Each exercise consists of a text describing the problem (optionally in two +language variants -- Czech and English), an evaluation configuration +(machine-readable instructions on how to evaluate solutions to the exercise) and +a set of inputs and reference outputs. Exercises are created by instructed +privileged users. Assigning an exercise to a group means choosing one of the +available exercises and specifying additional properties: a deadline (optionally +a second deadline), a maximum amount of points, a configuration for calculating +the score, a maximum number of submissions, and a list of supported runtime +environments (e.g. 
programming languages) including specific time and memory +limits for each one. Typical use cases for supported user roles are following: - **student** + - create new user account via registration form - join a group - get assignments in group - submit solution to assignment -- upload one source file and trigger @@ -180,10 +182,10 @@ students. Concepts of consecutive steps from source code to final results is described in more detail below to give readers solid overview of what have to happen during evaluation process. -First thing users have to do is to submit their solutions through web user +First thing students have to do is to submit their solutions through web user interface. The system checks assignment invariants (deadlines, count of -submissions, ...) and stores the submitted file. The runtime environment is -automatically detected based on input file and a suitable evaluation +submissions, ...) and stores the submitted code. The runtime environment is +automatically detected based on input file extension and a suitable evaluation configuration variant is chosen (one exercise can have multiple variants, for example C and Java languages). This exercise configuration is then used for taking care of evaluation process. @@ -449,20 +451,18 @@ restarted. At this point there is a clear idea how the new system will be used and what are the major enhancements for future releases. With this in mind, the overall -architecture can be sketched. From the previous research, several goals are set -up for the new project. They mostly reflect drawbacks of the current version of -CodEx and some reasonable wishes of university users. Most notable features are -following: +architecture can be sketched. To sum up, here is a list of key features of the +new system. They come from previous research of current system's drawbacks, +reasonable wishes of university users and our major design choices. 
- modern HTML5 web frontend written in JavaScript using a suitable framework -- REST API implemented in PHP, communicating with database, evaluation backend - and a file server +- REST API communicating with database, evaluation backend and a file server - evaluation backend implemented as a distributed system on top of a message - queue framework (ZeroMQ) with master-worker architecture + queue framework with master-worker architecture - multi-platform worker supporting Linux and Windows environment (latter without sandbox, no general purpose suitable tool available yet) -- evaluation procedure configured in a YAML file, compound of small tasks - connected into an arbitrary oriented acyclic graph +- evaluation procedure configured in a human readable text file, compound of + small tasks connected into an arbitrary oriented acyclic graph The reasons supporting these decisions are explained in the rest of analysis chapter. Also a lot of smaller design choices are mentioned including possible @@ -532,18 +532,18 @@ is implemented. The relative value is set in percents and is called threshold. Our university has a few partner grammar schools. There were an idea, that they could use CodEx for teaching informatics classes. To make the setup simple for -them, all the software and hardware would be provided by university and hosted -in their datacentre. However, CodEx were not prepared to support this kind of -usage and no one had time to manage a separate instance. With ReCodEx it is -possible to offer hosted environment as a service to other subjects. The concept -we figured out is based on user and group separation inside the system. There -are multiple _instances_ in the system, which means unit of separation. Each -instance has own set of users and groups, exercises can be optionally shared. -Evaluation backend is common for all instances. 
To keep track of active -instances and paying customers, each instance must have a valid _licence_ to -allow users submit their solutions. licence is granted for defined period of -time and can be revoked in advance if the subject do not keep approved terms and -conditions. +them, all the software and hardware would be provided by the university as a +completely ready-to-use remote service. However, CodEx were not prepared to +support this kind of usage and no one had time to manage a separate instance. +With ReCodEx it is possible to offer hosted environment as a service to other +subjects. The concept we figured out is based on user and group separation +inside the system. There are multiple _instances_ in the system, which means +unit of separation. Each instance has own set of users and groups, exercises can +be optionally shared. Evaluation backend is common for all instances. To keep +track of active instances and paying customers, each instance must have a valid +_licence_ to allow users submit their solutions. licence is granted for defined +period of time and can be revoked in advance if the subject do not keep approved +terms and conditions. The main work for the system is to evaluate programming exercises. The exercise is quite similar to homework assignment during school labs. When a homework is @@ -561,36 +561,6 @@ for every assignment of the same exercise. This separation is natural for all users, in CodEx it is implemented in similar way and no other considerable solution was found. -### Forgotten password - -With authentication and some sort of dealing with passwords is related a problem -with forgotten credentials, especially passwords. People easily forget them and -there has to be some kind of mechanism to retrieve a new password or change the -old one. Problem is that it cannot be done in totally secure way, but we can at -least come quite close to it. 
First, there are absolutely not secure and -recommendable ways how to handle that, for example sending the old password -through email. A better, but still not secure solution is to generate a new one -and again send it through email. This solution was provided in CodEx, users had -to write an email to administrator, who generated a new password and sent it -back to the sender. This simple solution could be also automated, but -administrator had quite a big control over whole process. This might come in -handy if there could be some additional checkups for example, but on the other -hand it can be quite time consuming. - -Probably the best solution which is often used and is fairly secure is -following. Let us consider only case in which all users have to fill their -email addresses into the system and these addresses are safely in the hands of -the right users. When user finds out that he/she does not remember a password, -he/she requests a password reset and fill in his/her unique identifier; it might -be email or unique nickname. Based on matched user account the system generates -unique access token and sends it to user via email address. This token should be -time limited and usable only once, so it cannot be misused. User then takes the -token or URL address which is provided in the email and go to the system's -appropriate section, where new password can be set. After that user can sign in -with his/her new password. As previously stated, this solution is quite safe and -user can handle it on its own, so administrator does not have to worry about it. -That is the main reason why this approach was chosen to be used. - ### Evaluation unit executed by ReCodEx One of the bigger requests for the new system is to support a complex @@ -623,14 +593,23 @@ so no sandbox needs to be used as in external tasks case. For a job evaluation, the tasks needs to be executed sequentially in a specified order. 
The idea of running independent tasks in parallel is bad because exact time measurement needs controlled environment on target computer with -minimization of interrupts by other processes. It seems that connecting tasks -into directed acyclic graph (DAG) can handle all possible problem cases. None of -the authors, supervisors and involved faculty staff can think of a problem that -cannot be decomposed into tasks connected in a DAG. The goal of evaluation is -to satisfy as many tasks as possible. During execution there are sometimes -multiple choices of next task. To control that, each task can have a priority, -which is used as a secondary ordering criterion. For better understanding, here -is a small example. +minimization of interrupts by other processes. It would be possible to run tasks +which do not need exact time measurement in parallel, but in this case a +synchronization mechanism has to be developed to exclude parallelism for +measured tasks. Usually, there are about four times more unmeasured tasks than +tasks with time measurement, but measured tasks tend to be much longer. With +[Amdahl's law](https://en.wikipedia.org/wiki/Amdahl's_law) in mind, the +parallelism seems not to provide a huge benefit in overall execution speed and +brings troubles with synchronization. However, if there are speed issues, +this approach could be reconsidered. + +It seems that connecting tasks into directed acyclic graph (DAG) can handle all +possible problem cases. None of the authors, supervisors and involved faculty +staff can think of a problem that cannot be decomposed into tasks connected in a +DAG. The goal of evaluation is to satisfy as many tasks as possible. During +execution there are sometimes multiple choices of next task. To control that, +each task can have a priority, which is used as a secondary ordering criterion. +For better understanding, here is a small example. 
![Task serialization](https://github.com/ReCodEx/wiki/raw/master/images/Assignment_overview.png) @@ -639,20 +618,34 @@ _CompileA_ task is finished, the _RunAA_ task is started (or _RunAB_, but should be deterministic by position in configuration file -- tasks stated earlier should be executed earlier). The task priorities guaranties, that after _CompileA_ task all dependent tasks are executed before _CompileB_ task (they -have higher priority number). For example this is useful to control which files -are present in a working directory at every moment. To sum up, there are 3 -ordering criteria: dependencies, then priorities and finally position of task in -configuration. Together, they define a unambiguous linear ordering of all tasks. +have higher priority number). To sum up, connection of tasks represents +dependencies and priorities can be used to order unrelated tasks and with this +provide a total ordering of them. For well written jobs the priorities may not +be so useful, but they can help control execution order, for example to avoid +a situation where each test of the job generates a large temporary file and there +is a valid execution order which keeps all the temporary files for later +processing at one time. A better approach is to finish execution of one test, +clean the big temporary file and proceed with the following test. If there is an +ambiguity in task ordering at this point, they are executed in order of input +task configuration. + +The total linear ordering of tasks can be done easier with just executing them +in order of input configuration. But this structure cannot handle well cases +when a task fails. There is no easy and nice way to tell which task +should be executed next. However, this issue can be solved with graph structured +dependencies of the tasks. In graph structure, it is clear that all dependent +tasks have to be skipped and execution continues with an unrelated task. 
This is +the main reason, why the tasks are connected in a DAG. For grading there are several important tasks. First, tasks executing submitted code need to be checked for time and memory limits. Second, outputs of judging tasks need to be checked for correctness (represented by return value or by data -on standard output) and should not fail on time or memory limits. This division -can be transparent for backend, each task is executed the same way. But frontend -must know which tasks from whole job are important and what is their kind. It is -reasonable, to keep this piece of information alongside the tasks in job -configuration, so each task can have a label about its purpose. Unlabeled tasks -have an internal type _inner_. There are four categories of tasks: +on standard output) and should not fail. This division can be transparent for +backend, each task is executed the same way. But frontend must know which tasks +from whole job are important and what is their kind. It is reasonable, to keep +this piece of information alongside the tasks in job configuration, so each task +can have a label about its purpose. Unlabeled tasks have an internal type +_inner_. There are four categories of tasks: - _initiation_ -- setting up the environment, compiling code, etc.; for users failure means error in their sources which are not compatible with running it @@ -724,29 +717,21 @@ what kind of reward for users solutions should be chosen. At first let us focus on all kinds of outputs from executed programs within job. Out of discussion is that supervisors should be able to view almost all outputs from solutions if they choose them to be visible and recorded. This feature is -critical in debugging either whole exercises or users solutions. But should it -be default behaviour to record every output? Absolutely not, supervisor should -have a choice to turn it on, but discarding the outputs has to be the default -option. 
Even without this functionality a file base around whole ReCodEx system -can become quite large and on top of that outputs from executed programs can be -sometimes very extensive. Storing this amount of data is inefficient and -unnecessary to most of the solutions. However, on supervisor request this -feature should be available. - -More interesting question is what should regular users see from execution of -their solution. Simple answer is of course that they should not see anything -which is partly true. Outputs from their programs can be anything and users can -somehow analyze inputs or even redirect them to output. So outputs from -execution should not be visible at all or under very special circumstances. But -that is not so straightforward for compilation or other kinds of initiation, -where it really depends on the particular case. Generally it is quite harmless -to display user some kind of compilation error which can help a lot during -troubleshooting. Of course again this kind of functionality should be -configurable by supervisors and disabled by default. There is also the last kind -of tasks which can output some information which is evaluation tasks. Output of -these tasks is somehow important to whole system and again can contain some -information about inputs or reference outputs. So outputs of evaluation tasks -should not be visible to regular users too. +critical in debugging either whole exercises or users solutions. Supervisor +should have a choice to turn on preserving the data while the default behaviour +is to discard them to keep a file base around whole ReCodEx system in sensible +limits. + +More interesting question is if students should see the logs from execution of +their solution. Usual approach is to keep this information private because of +possibility of leaking input data. Such a leak may lead students to hack their +solutions to pass just the ReCodEx testing cases instead of properly solving the +assigned problem. 
Martin Mareš strongly recommended to use this strategy of +hiding sensitive data too, so ReCodEx does. One exception are compilation +outputs which can help students a lot during troubleshooting. These logs shall +be visible unless the supervisor decides otherwise. Note, that due to lack of +frontend developers, this feature was not implemented in the very first release +of ReCodEx, but will be definitely available in the future. The overall concept of grading solutions was presented earlier. To briefly remind that, backend returns only exact measured values (used time and memory, @@ -799,7 +784,7 @@ factor. There are several ways how to save structured data: - relational database Another important factor is amount and size of stored data. Our guess is about -1000 users, 100 exercises, 200 assignments per year and 400000 unique solutions +1000 users, 100 exercises, 200 assignments per year and 200000 unique solutions per year. The data are mostly structured and there are a lot of them with the same format. For example, there is a thousand of users and each one has the same values -- name, email, age, etc. These kind of data are relatively small, name @@ -1449,8 +1434,8 @@ of connection with no message loss. ### API server The API server must handle HTTP requests and manage the state of the application -in some kind of a database. It must also be able to communicate with the -backend over ZeroMQ. +in some kind of a database. It must also be able to communicate with the backend +over ZeroMQ. We considered several technologies which could be used: @@ -1566,6 +1551,36 @@ including generating the signature and signature verification is done through a widely used third-party library which lowers the risk of having a bug in the implementation of this critical security feature. +#### Forgotten password + +With authentication and some sort of dealing with passwords is related a problem +with forgotten credentials, especially passwords. 
People easily forget them and +there has to be some kind of mechanism to retrieve a new password or change the +old one. Problem is that it cannot be done in totally secure way, but we can at +least come quite close to it. First, there are absolutely not secure and +recommendable ways how to handle that, for example sending the old password +through email. A better, but still not secure solution is to generate a new one +and again send it through email. This solution was provided in CodEx, users had +to write an email to administrator, who generated a new password and sent it +back to the sender. This simple solution could be also automated, but +administrator had quite a big control over whole process. This might come in +handy if there could be some additional checkups for example, but on the other +hand it can be quite time consuming. + +Probably the best solution which is often used and is fairly secure is +following. Let us consider only case in which all users have to fill their +email addresses into the system and these addresses are safely in the hands of +the right users. When user finds out that he/she does not remember a password, +he/she requests a password reset and fill in his/her unique identifier; it might +be email or unique nickname. Based on matched user account the system generates +unique access token and sends it to user via email address. This token should be +time limited and usable only once, so it cannot be misused. User then takes the +token or URL address which is provided in the email and go to the system's +appropriate section, where new password can be set. After that user can sign in +with his/her new password. As previously stated, this solution is quite safe and +user can handle it on its own, so administrator does not have to worry about it. +That is the main reason why this approach was chosen to be used. + #### Uploading files There are two cases when users need to upload files using the API -- submitting