diff --git a/Rewritten-docs.md b/Rewritten-docs.md
index 19d3591..6be98eb 100644
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -487,7 +487,7 @@ everybody seems satisfied with this fact. There are other
 communicating channels most programmers use, such as e-mail or git, but they
 are inappropriate for designing user interfaces on top of them.
 
-The application interacts with users. From the project assignment it is clear,
+The application interacts with users. From the project assignment it is clear
 that the system has to keep personalized data about users and adapt presented
 content according to this knowledge. User data cannot be publicly visible,
 which implies necessity of user authentication. The application also has to
 support
@@ -542,16 +542,16 @@ ReCodEx it is possible to offer hosted environment as a
 service to other subjects.
 
 The concept we came up with is based on user and group separation inside the
-system. There are multiple _instances_ in the system, which means unit of
-separation. Each instance has own set of users and groups, exercises can be
-optionally shared. Evaluation backend is common for all instances. To keep track
-of active instances and paying customers, each instance must have a valid
-_licence_ to allow users submit their solutions. licence is granted for defined
-period of time and can be revoked in advance if the subject do not keep approved
-terms and conditions.
-
-The primary task of the system is to evaluate programming exercises. The
-exercise is quite similar to homework assignment during school labs. When a
+system. It is divided into multiple separated units called _instances_.
+Each instance has its own set of users and groups. Exercises can be optionally
+shared. The rest of the system (API server and evaluation backend) is shared
+between the instances. To keep track of active instances and paying customers,
+each instance must have a valid _licence_ to allow its users to submit their
+solutions. A licence is granted for a definite period of time and can be
+revoked in advance if the subject does not conform with the approved terms and
+conditions.
+
+The primary task of the system is to evaluate programming exercises. An
+exercise is quite similar to a homework assignment during school labs. When a
 homework is assigned, two things are important to know for users:
 
 - description of the problem
@@ -560,11 +560,11 @@ homework is assigned, two things are important to know for users:
 
 To reflect this idea teachers and students are already familiar with, we
 decided to keep separation between problem itself (_exercise_) and its
 _assignment_. Exercise only describes one problem and provides testing data
 with description
-of how to evaluate it. In fact, it is template for assignments. Assignment then
-contains data from its exercise and additional metadata, which can be different
-for every assignment of the same exercise. This separation is natural for all
-users, in CodEx it is implemented in similar way and no other considerable
-solution was found.
+of how to evaluate it. In fact, it is a template for assignments. Assignment
+then contains data from its exercise and additional metadata, which can be
+different for every assignment of the same exercise. This separation is natural
+for all users; in CodEx it is implemented in a similar way and no other
+considerable solution was found.
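+
+To illustrate the separation, here is a minimal sketch of the two entities
+(the field names are only illustrative and do not mirror the actual schema;
+the deadline and point limit are hypothetical examples of per-assignment
+metadata):
+
+```cpp
+#include <string>
+#include <vector>
+
+// An exercise is the reusable template: the problem description plus the
+// testing data and a description of how to evaluate them.
+struct Exercise {
+    std::string description;
+    std::vector<std::string> testFiles;
+};
+
+// An assignment instantiates an exercise for one group and adds metadata
+// which can differ between assignments of the same exercise.
+struct Assignment {
+    const Exercise* exercise; // the shared template
+    std::string group;        // whom the exercise was assigned to
+    std::string deadline;     // hypothetical metadata
+    int maxPoints;            // hypothetical metadata
+};
+```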
 
 ### Evaluation unit executed by ReCodEx
 
@@ -577,36 +577,41 @@ scratch is needed.
 There are two main approaches to design a complex execution configuration.
 It can be composed of a small amount of relatively big components or many more small
-tasks. Big components are easy to write and whole configuration is reasonably
-small. The components are designed for current problems, so it is not scalable
-enough for pleasant future usage. This can be solved by introducing small set of
-single-purposed tasks which can be composed together. The whole configuration is
-then quite bigger, but with great adaptation ability for new conditions and also
-less amount of work programming them. For better user experience, configuration
-generators for some common cases can be introduced.
-
-ReCodEx target is to be continuously developed and used for many years, so the
-smaller tasks are the right choice. Observation of CodEx system shows that
-only a few tasks are needed. In extreme case, only one task is enough -- execute
-a binary. However, for better portability of configurations along different
-systems it is better to implement reasonable subset of operations directly
-without calling system provided binaries. These operations are copy file, create
-new directory, extract archive and so on, altogether called internal tasks.
-Another benefit from custom implementation of these tasks is guarantied safety,
-so no sandbox needs to be used as in external tasks case.
-
-For a job evaluation, the tasks needs to be executed sequentially in a specified
-order. The idea of running independent tasks in parallel is bad because exact
-time measurement needs controlled environment on target computer with
-minimization of interrupts by other processes. It would be possible to run tasks
-which does not need exact time measuremet in parallel, but in this case a
+tasks. Big components are easy to write and help keep the configuration
+reasonably small. However, these components are designed for current problems
+and they might not hold well against future requirements. This can be solved by
+introducing a small set of single-purpose tasks which can be composed together.
+The whole configuration becomes bigger, but more flexible for new conditions.
+Moreover, they will not require as much programming effort as bigger evaluation
+units. For better user experience, configuration generators for some common
+cases can be introduced.
+
+ReCodEx is intended to be continuously developed and used for many years.
+Therefore, we chose to use smaller tasks, because this approach is better for
+future extensibility. Observation of the CodEx system shows that only a few
+tasks are needed. In an extreme case, only one task is enough -- execute a
+binary. However, for better portability of configurations between different
+systems it is better to implement a reasonable subset of operations ourselves
+without calling binaries provided by the system directly. These operations are
+copy file, create new directory, extract archive and so on, altogether called
+internal tasks. Another benefit of a custom implementation of these tasks is
+guaranteed safety, so no sandbox needs to be used as in the case of external
+tasks.
+
+For a job evaluation, the tasks need to be executed sequentially in a specified
+order. Running independent tasks in parallel is possible, but there are
+complications -- exact time measurement requires a controlled environment with
+as few interruptions as possible from other processes. It would be possible to
+run tasks that do not need exact time measurement in parallel, but in this case a
 synchronization mechanism has to be developed to exclude parallelism for
 measured tasks. Usually, there are about four times more unmeasured tasks than
-tasks with time measurement, but measured tasks tends to be much longer. With
+tasks with time measurement, but measured tasks tend to be much longer. With
 [Amdahl's law](https://en.wikipedia.org/wiki/Amdahl's_law) in mind, the
-parallelism seems not to provide a huge benefit in overall execution speed and
-brings troubles with synchronization. However, it there will be speed issues,
-this approach could be reconsiderred.
+parallelism does not seem to provide a notable benefit in overall execution
+speed and brings trouble with synchronization. Moreover, most of the internal
+tasks are also limited by IO speed (most notably copying and downloading files
+and reading archives). However, if there are performance issues, this approach
+could be reconsidered, along with using a RAM disk for storing supplementary
+files.
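+
+For illustration, suppose the measured tasks account for about 80 % of the
+execution time of a job (a made-up figure, consistent with the observation
+above that measured tasks are fewer but much longer). Amdahl's law with a
+parallelizable fraction $p = 0.2$ then bounds the overall speedup on $N$ cores:
+
+$$ S(N) = \frac{1}{(1 - p) + p/N} \le \frac{1}{1 - p} = \frac{1}{0.8} = 1.25 $$
+
+Even with unlimited parallelism, such a job would run at most 25 % faster.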
 
 It seems that connecting tasks into a directed acyclic graph (DAG) can handle
 all possible problem cases. None of the authors, supervisors and involved
 faculty
@@ -618,7 +623,7 @@ For better understanding, here is a small example.
 
 ![Task serialization](https://github.com/ReCodEx/wiki/raw/master/images/Assignment_overview.png)
 
-The _job root_ task is imaginary single starting point of each job. When the
+The _job root_ task is an imaginary single starting point of each job. When the
 _CompileA_ task is finished, the _RunAA_ task is started (or _RunAB_, but the
 order should be deterministic, given by position in the configuration file --
 tasks stated earlier should be executed earlier). The task priorities guarantee
 that after
@@ -634,13 +639,13 @@ clean the big temporary file and proceed with following
 test. If there is an ambiguity in task ordering at this point, they are
 executed in order of input task configuration.
 
-The total linear ordering of tasks can be done easier with just executing them
-in order of input configuration. But this structure cannot handle well cases,
-when a task fails. There is not a easy and nice way how to tell which task
-should be executed next. However, this issue can be solved with graph structured
+The total linear ordering of tasks could be made even easier by just executing
+them in the order of the input configuration. But this structure cannot handle
+task failures well. There is no easy way of telling which task should be
+executed next. However, this issue can be solved with graph structured
 dependencies of the tasks. In a graph structure, it is clear that all dependent
-tasks has to be skipped and continue execution with a non related task. This is
-the main reason, why the tasks are connected in a DAG.
+tasks have to be skipped and execution must be resumed with an unrelated task.
+This is the main reason why the tasks are connected in a DAG.
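+
+The following sketch illustrates how a DAG makes failure handling
+straightforward (a hypothetical illustration, not the actual worker code):
+tasks are stored in configuration order, a task runs only when all of its
+dependencies succeeded, and a failure transitively skips all dependents while
+unrelated tasks continue.
+
+```cpp
+#include <cstdio>
+#include <string>
+#include <vector>
+
+struct Task {
+    std::string name;
+    std::vector<int> deps; // indices of prerequisite tasks (stated earlier)
+    bool (*run)();         // returns false when the task fails
+};
+
+// Tasks are listed in the order of the job configuration, which is assumed
+// to be a valid topological order of the DAG (dependencies come first).
+void executeJob(std::vector<Task>& tasks) {
+    enum State { PENDING, DONE, FAILED, SKIPPED };
+    std::vector<State> state(tasks.size(), PENDING);
+
+    for (size_t i = 0; i < tasks.size(); ++i) { // configuration order
+        bool ready = true;
+        for (int d : tasks[i].deps)
+            if (state[d] != DONE) ready = false; // dep failed or was skipped
+
+        if (!ready) {
+            state[i] = SKIPPED; // skip transitively, resume with other tasks
+            std::printf("skipping %s\n", tasks[i].name.c_str());
+        } else {
+            state[i] = tasks[i].run() ? DONE : FAILED;
+        }
+    }
+}
+```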
 
 For grading there are several important tasks. First, tasks executing submitted
 code need to be checked for time and memory limits. Second, outputs of judging
@@ -719,11 +724,11 @@ them.
 
 To avoid assigning points for insufficient solutions (like only printing "File
 error" which is the valid answer in two tests), a minimal point threshold can be
-specified. It the solution is to get less points than specified, it will get
+specified. If the solution would get fewer points than the threshold, it will get
 zero points instead. This functionality can be embedded into the grading
 computation algorithm itself, but it would have to be present in each
 implementation
-separately, which is not maintainable. Because of this the the threshold feature
-is separated from point computation.
+separately, which is not maintainable. Because of this, the threshold feature is
+separated from score computation.
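+
+A minimal sketch of this separation (the names are illustrative): whatever
+scoring algorithm is configured produces the points first, and the threshold
+is applied uniformly afterwards, so no scoring algorithm has to reimplement
+it.
+
+```cpp
+#include <functional>
+
+// Any scoring algorithm can be plugged in; the threshold check is applied
+// on top of its result and therefore lives outside of all the algorithms.
+double gradeSolution(const std::function<double()>& computePoints,
+                     double minPointThreshold) {
+    double points = computePoints(); // configured scoring algorithm
+    return points < minPointThreshold ? 0.0 : points;
+}
+```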
 
 Automatic grading cannot reflect all aspects of submitted code. For example,
 structuring the code, number and quality of comments and so on. To allow
@@ -744,16 +749,16 @@ previous chapter, there are also text or binary outputs
 of the executed tasks. Knowing them helps users identify and solve their
 potential issues, but on the other hand this can lead to possibility of leaking
 input data. This may lead students to hack their solutions to pass just the
 ReCodEx testing cases instead
-of properly solving the assigned problem. The usual approach is to keep these
-information private and so does strongly recommended Martin Mareš, who has
+of properly solving the assigned problem. The usual approach is to keep this
+information private. This was also strongly recommended by Martin Mareš, who has
 experience with several programming contests.
 
-The only one exception of hiding the logs are compilation outputs, which can
-help students a lot during troubleshooting and there is only small possibility
+The only exception from hiding the logs are the compilation outputs, which can
+help students a lot during troubleshooting and there is only a small possibility
 of input data leakage. The supervisors have access to all of the logs and they
 can decide if students are allowed to see the compilation outputs.
 
-Note, that due to lack of frontend developers, showing compilation logs to the
+Note that due to the lack of frontend developers, showing compilation logs to the
 students is not implemented in the very first release of ReCodEx.
 
 ### Persistence
 
@@ -768,29 +773,29 @@ factor.
 There are several ways how to save structured data:
 
 - relational database
 
 Another important factor is amount and size of stored data. Our guess is about
-1000 users, 100 exercises, 200 assignments per year and 200000 unique solutions
+1000 users, 100 exercises, 200 assignments per year and 20000 unique solutions
 per year. The data are mostly structured and there are a lot of them with the
 same format. For example, there are a thousand users and each one has the same
-values -- name, email, age, etc. These kind of data are relatively small, name
+values -- name, email, age, etc. These data items are relatively small, name
 and email are short strings, age is an integer. Considering this, relational
-databases or formatted plain files (CSV for example) fits best for them.
-However, the data often have to support find operation, so they have to be
-sorted and allow random access for resolving cross references. Also, addition a
-deletion of entries should take reasonable time (at most logarithmic time
+databases or formatted plain files (CSV for example) fit best for them.
+However, the data often have to support searching, so they have to be
+sorted and allow random access for resolving cross references. Also, addition
+and deletion of entries should take reasonable time (at most logarithmic time
 complexity in the number of saved values). This practically excludes plain files, so
-relational database is used instead.
-
-On the other hand, there are some data with no such great structure and much
-larger size. These can be evaluation logs, sample input files for exercises or
-submitted sources by students. Saving this kind of data into relational database
-is not suitable, but it is better to keep them as ordinary files or store them
-into some kind of NoSQL database. Since they are already files and does not need
-to be backed up in multiple copies, it is easier to keep them as ordinary files
-in filesystem. Also, this solution is more lightweight and does not require
-additional dependencies on third-party software. File can be identified using
-its filesystem path or unique index stored as value in relational database. Both
-approaches are equally good, final decision depends on actual case.
-
+we decided to use a relational database.
+
+On the other hand, there is data with basically no structure and much larger
+size. These can be evaluation logs, sample input files for exercises or sources
+submitted by students. Saving this kind of data into a relational database is
+not appropriate. It is better to keep them as ordinary files or store them in
+some kind of NoSQL database. Since they are already files and do not need to be
+backed up in multiple copies, it is easier to keep them as ordinary files in the
+filesystem. Also, this solution is more lightweight and does not require
+additional dependencies on third-party software. Files can be identified using
+their filesystem paths or a unique index stored as a value in a relational
+database. Both approaches are equally good; the final decision depends on the
+actual implementation.
 
 ## Structure of the project
 
@@ -804,7 +809,7 @@ working as expected.
 
 ### Backend
 
-Backend is the part which is responsible solely for the process of evaluation
+Backend is the part which is responsible solely for the process of evaluating
 a solution of an exercise. Each evaluation of a solution is referred to as a
 *job*. For each job, the system expects a configuration document of the job,
 supplementary files for the exercise (e.g., test inputs, expected outputs,
@@ -814,40 +819,46 @@ job, such as a specific runtime environment, specific
 version of a compiler or the job must be evaluated on a processor with a
 specific number of cores. The backend infrastructure decides whether it will
 accept a job or decline it based on the specified requirements. In case it
 accepts the job, it will be placed in
-a queue and it will be processed as soon as possible. The backend publishes the
-progress of processing of the queued jobs and the results of the evaluations can
-be queried after the job processing is finished. The backend produces a log of
-the evaluation which can be used for further score calculation or debugging.
+a queue and it will be processed as soon as possible.
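+
+The acceptance decision can be imagined as a simple matching of the job
+requirements against the properties a worker advertises (a hypothetical
+sketch; the real property names may differ):
+
+```cpp
+#include <map>
+#include <string>
+
+using Properties = std::map<std::string, std::string>;
+
+// A job is accepted only when every requirement matches a property
+// advertised by a worker (shown here for a single worker).
+bool canAccept(const Properties& worker, const Properties& job) {
+    for (const auto& [key, value] : job) {
+        auto it = worker.find(key);
+        if (it == worker.end() || it->second != value)
+            return false; // requirement not satisfied -- decline the job
+    }
+    return true;          // all requirements satisfied -- enqueue the job
+}
+
+// Example: a job requiring {{"environment", "c-gcc"}, {"cores", "4"}} is
+// dispatched only to workers advertising at least those properties.
+```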
+
+The backend publishes the progress of processing of the queued jobs and the
+results of the evaluations can be queried after the job processing is finished.
+The backend produces a log of the evaluation which can be used for further
+score calculation or debugging.
 
 To make the backend scalable, there are two necessary components -- the one
 which will execute jobs and the other which will distribute jobs to the
 instances of the first one. This ensures scalability in the manner of parallel
-execution of numerous jobs. Implementation of these services are called
-**broker** and **worker**, the first one handles distribution, the latter one
-execution. These components could handle the whole evaluation process, but for
-cleaner design and better communication gateways with frontend two other
-components were added, **fileserver** and **monitor**. Fileserver is simple
-component whose purpose is to store files which are exchanged between frontend
-and backend. Monitor is a simple service which is able to serve job progress
-state from worker to web application. These two additional components are on
-the edge of frontend and backend (like gateways) but logically they are more
-connected with backend, so it is considered they belong there.
+execution of numerous jobs which is exactly what is needed. Implementations of
+these services are called **broker** and **worker**; the first one handles
+distribution, the other one handles execution.
+
+These components should be enough to fulfill all the tasks mentioned above, but
+for the sake of simplicity and better communication with the frontend, two
+other components were added -- **fileserver** and **monitor**. Fileserver is a
+simple component whose purpose is to store files which are exchanged between
+frontend and backend. Monitor is also quite a simple service which is able to
+forward job progress data from the worker to the web application. These two
+additional services are at the border between frontend and backend (like
+gateways) but logically they are more connected with the backend, so we
+consider them to belong there.
 
 ### Frontend
 
-Frontend on the other hand is responsible for providing users a convenient
+Frontend on the other hand is responsible for providing users with convenient
 access to the backend infrastructure and interpreting raw data from backend
-evaluation. There are two main purposes of frontend -- holding the state of
-whole system (database of users, exercises, solutions, points, etc.) and
-presenting the state to users through some kind of an user interface (e.g., a
-web application, mobile application, or a command-line tool). According to
-contemporary trends in development of frontend parts of applications, we
-decided to split the frontend in two logical parts -- a server side and a
-client side. The server side is responsible for managing the state and the
-client side gives instructions to the server side based on the inputs from the
-user. This decoupling gives us the ability to create multiple client side tools
-which may address different needs of the users with preserving single server
-side component.
+evaluation.
+
+There are two main purposes of the frontend -- holding the state of the whole
+system (database of users, exercises, solutions, points, etc.) and presenting
+the state to users through some kind of a user interface (e.g., a web
+application, mobile application, or a command-line tool). According to
+contemporary trends in development of frontend parts of applications, we decided
+to split the frontend in two logical parts -- a server side and a client side.
+The server side is responsible for managing the state and the client side gives
+instructions to the server side based on the inputs from the user. This
+decoupling gives us the ability to create multiple client side tools which may
+address different needs of the users.
 
 The frontend developed as part of this project is a web application created with
 the needs of the Faculty of Mathematics and Physics of the Charles University in
@@ -870,7 +881,7 @@ fully accurate.
 
 ![Overall architecture](https://github.com/ReCodEx/wiki/blob/master/images/Overall_Architecture.png)
 
-In the latter parts of the documentation, both of the backend and frontend parts
+In the following parts of the documentation, both the backend and frontend parts
The communication protocol between these two logical parts will be described as well. @@ -935,15 +946,16 @@ However, all of the three options would have been possible to use. ### File transfers -There has to be a way to access files stored on the fileserver from both worker -and frontend server machines. The protocol used for this should handle large -files efficiently and be resilient to network failures. Security features are -not a primary concern, because all communication with the fileserver will happen -in an internal network. However, a basic form of authentication can be useful to -ensure correct configuration (if a development fileserver uses different -credentials than production, production workers will not be able to use it by -accident). Lastly, the protocol must have a client library for platforms -(languages) used in the backend. We will present some of the possible options: +There has to be a way to access files stored on the fileserver (and also upload +them )from both worker and frontend server machines. The protocol used for this +should handle large files efficiently and be resilient to network failures. +Security features are not a primary concern, because all communication with the +fileserver will happen in an internal network. However, a basic form of +authentication can be useful to ensure correct configuration (if a development +fileserver uses different credentials than production, production workers will +not be able to use it by accident). Lastly, the protocol must have a client +library for platforms (languages) used in the backend. We will present some of +the possible options: - HTTP(S) -- a de-facto standard for web communication that has far more features than just file transfers. Thanks to being used on the web, a large @@ -1110,15 +1122,15 @@ services, for example via HTTP. ### Worker -Worker is component which is supposed to execute incoming jobs from broker. As +Worker is a component which is supposed to execute incoming jobs from broker. As such worker should work and support wide range of different infrastructures and maybe even platforms/operating systems. Support of at least two main operating systems is desirable and should be implemented. -Worker as a service does not have to be much complicated, but a bit of complex +Worker as a service does not have to be very complicated, but a bit of complex behaviour is needed. Mentioned complexity is almost exclusively concerned about robust communication with broker which has to be regularly checked. Ping -mechanism is usually used for this in all kind of projects. This means that +mechanism is usually used for this in all kind of projects. This means that the worker should be able to send ping messages even during execution. So worker has to be divided into two separate parts, the one which will handle communication with broker and the another which will execute jobs. @@ -1126,9 +1138,9 @@ with broker and the another which will execute jobs. The easiest solution is to have these parts in separate threads which somehow tightly communicates with each other. For inter process communication there can be used numerous technologies, from shared memory to condition variables or some -kind of in-process messages. Already used library ZeroMQ is possible to provide -in-process messages working on the same principles as network communication -which is quite handy and solves problems with threads synchronization and such. +kind of in-process messages. 
 
 #### Evaluation
 
@@ -1629,13 +1641,13 @@ implemented in some of the next releases.
 
 ### The WebApp
 
-The web application ("WebApp") is one of the possible client applications of the ReCodEx
-system. Creating a web application as the first client application has several advantages:
+The web application ("WebApp") is one of the possible client applications of the
+ReCodEx system. Creating a web application as the first client application has
+several advantages:
 
 - no installation or setup is required on the user's device
 - works on all platforms including mobile devices
-- when a new version is rolled out all the clients will use this version without
-any need for manula instalation of the update
+- when a new version is released, all the clients will use this version without
+  any need for manual installation of the update
 
 One of the downsides is the large number of different web browsers (including
 the older versions of a specific browser) and their different interpretation