diff --git a/Rewritten-docs.md b/Rewritten-docs.md index 7650f25..34c36be 100644 --- a/Rewritten-docs.md +++ b/Rewritten-docs.md @@ -1,56 +1,3 @@ - - # Introduction Generally, there are many different ways and opinions on how to teach people @@ -139,16 +86,15 @@ corresponds to his/her privileges. There are user groups reflecting the structure of lectured courses. A database of exercises (algorithmic problems) is another part of the project. -Each exercise consists of a text describing the problem (optionally in two -language variants -- Czech and English), an evaluation configuration -(machine-readable instructions on how to evaluate solutions to the exercise) and -a set of inputs and reference outputs. Exercises are created by instructed -privileged users. Assigning an exercise to a group means choosing one of the -available exercises and specifying additional properties: a deadline (optionally -a second deadline), a maximum amount of points, a configuration for calculating -the score, a maximum number of submissions, and a list of supported runtime -environments (e.g. programming languages) including specific time and memory -limits for each one. +Each exercise consists of a text describing the problem, an evaluation +configuration (machine-readable instructions on how to evaluate solutions to the +exercise), time and memory limits for all supported runtimes (e.g. programming +languages), a configuration for calculating the final score and a set of inputs +and reference outputs. Exercises are created by instructed privileged users. +Assigning an exercise to a group means choosing one of the available exercises +and specifying additional properties: a deadline (optionally a second deadline), +a maximum amount of points, a maximum number of submissions and a list of +supported runtime environments. Typical use cases for supported user roles are following: @@ -160,7 +106,7 @@ Typical use cases for supported user roles are following: evaluation process - view solution results -- which parts succeeded and failed, total number of acquired points, bonus points -- **supervisor** +- **supervisor** (similar to CodEx **operator**) - create exercise -- create description text and evaluation configuration (for each programming environment), upload testing inputs and outputs - assign exercise to group -- choose exercise and set deadlines, number of @@ -195,10 +141,12 @@ Incoming jobs are kept in a queue until a free worker picks them. Workers are capable of sequential evaluation of jobs, one at a time. The worker obtains the solution and its evaluation configuration, parses it and -starts executing the contained instructions. It is crucial to keep the worker -computer secure and stable, so a sandboxed environment is used for dealing with -unknown source code. When the execution is finished, results are saved and the -submitter is notified. +starts executing the contained instructions. Each job should have more testing +cases, which examine wrong inputs, corner values and data of different sizes to +guess the program complexity. It is crucial to keep the worker computer secure +and stable, so a sandboxed environment is used for dealing with unknown source +code. When the execution is finished, results are saved and the submitter is +notified. The output of the worker contains data about the evaluation, such as time and memory spent on running the program for each test input and whether its output @@ -225,10 +173,10 @@ several drawbacks. The main ones are: test multi-threaded applications as well. 
- **instances** -- Different ways of CodEx usage scenarios requires separate installations (Programming I and II, Java, C#, etc.). This configuration is - not user friendly (students have to register in each instance separately) and - burdens administrators with unnecessary work. CodEx architecture does not - allow sharing hardware between instances, which results in an inefficient use - of hardware for evaluation. + not user friendly (students have to register in each installation separately) + and burdens administrators with unnecessary work. CodEx architecture does not + allow sharing workers between installations, which results in an inefficient + use of hardware for evaluation. - **task extensibility** -- There is a need to test and evaluate complicated programs for classes such as Parallel programming or Compiler principles, which have a more difficult evaluation chain than simple @@ -249,17 +197,12 @@ In general, CodEx features should be preserved, so only differences are presented here. For clear arrangement all the requirements and wishes are presented grouped by categories. -### System Features - -System features represents directly accessible functionality to users of the -system. They describe the evaluation system in general and also university -addons (mostly administrative features). - -#### Requirements of The Users +### Requirements of The Users - _group hierarchy_ -- creating an arbitrarily nested tree structure should be supported to allow keeping related groups together, such as in the example - below. A group hierarchy also allows archiving data from past courses. + below. CodEx supported only a flat group structure. A group hierarchy also + allows archiving data from past courses. ``` Summer term 2016 @@ -271,33 +214,31 @@ addons (mostly administrative features). ... ``` -- _a database of exercises_ -- teachers should be able to create exercises - including textual description, sample inputs and correct reference outputs - (for example "sum all numbers from given file and write the result to the - standard output") and to browse this database +- _a database of exercises_ -- teachers should be able to filter viewed + exercises according to several criteria, for example supported runtime + environment or author. 
It should also be possible to link exercises to a group
+  so that group supervisors do not have to browse hundreds of exercises when
+  their group only uses five of them
+- _advanced exercises_ -- the system should support a more advanced evaluation
+  pipeline than the basic compilation/execution/evaluation chain available in
+  CodEx
- _customizable grading system_ -- teachers need to specify the way of
-  computation of the final score, which will be awarded to the submissions of the student depending on their quality
-- _viewing student details_ -- teachers should be able to view the details of
-  their students (members of their groups), including all submitted solutions
-- _awarding additional points_ -- adding (or subtracting) points from the final
-  score of a submission by a supervisor must be supported
+  computation of the final score, which will be awarded to the submissions of
+  the student depending on their quality
- _marking a solution as accepted_ -- the system should allow marking one
  particular solution as accepted (used for grading the assignment) by the
  supervisor
-- _solution resubmission_ -- teachers should be able edit the solutions of the student
-  and privately resubmit them, optionally saving all results (including
+- _solution resubmission_ -- teachers should be able to edit the solutions of
+  the student and privately resubmit them, optionally saving all results (including
  temporary ones); this feature can be used to quickly fix obvious errors in
  the solution and see if it is otherwise viable
- _localization_ -- all texts (UI and exercises) should be translatable
- _formatted exercise texts_ -- Markdown or another lightweight markup language
  should be supported for formatting exercise texts
-- _exercise tags_ -- the system should support tagging exercises searching by
-  these tags
- _comments_ -- adding both private and public comments to exercises, tests and
  solutions should be supported
- _plagiarism detection_

-#### Administrative Requirements
+### Administrative Requirements

- _pluggable user interface_ -- the system should allow using an alternative
  user interface, such as a command line client; implementation of such clients
@@ -310,10 +251,10 @@ addons (mostly administrative features).
  OAuth, should be supported
- _querying SIS_ -- loading user data from the university information system
  should be supported
-- _sandboxing_ -- there should be a safe environment in which the
-  solutions of the students are executed to prevent system failures due to malicious code being
-  submitted; the sandboxed environment should have the least possible impact on
-  measurement results (most importantly on measured times)
+- _sandboxing_ -- a more advanced sandbox is needed, one which supports the
+  execution of parallel programs and allows easy integration of different
+  programming environments and tools; the sandboxed environment should have the
+  least possible impact on measurement results (most importantly on measured
+  times)
- _heterogeneous worker pool_ -- there must be support for submission
  evaluation in multiple programming environments in a single installation to
  avoid unacceptable workload for the administrator (maintaining a separate
@@ -328,14 +269,10 @@ addons (mostly administrative features).

### Non-functional Requirements

-Non-functional requirements are requirements of technical character with no
-direct mapping to visible parts of the system. In an ideal world, users should
-not know about these features if they work properly, but would be at least
-annoyed if they did not.
-
- _no installation_ -- the primary user interface of the system must be
-  accessible on the computers of the users without the need to install any additional
-  software
+  accessible on the computers of the users without the need to install any
+  additional software except for a web browser (which is installed on the vast
+  majority of personal computers)
- _performance_ -- the system must be ready for at least hundreds of students
  and tens of supervisors using it at once
- _automated deployment_ -- all of the components of the system must be easy to
@@ -1035,6 +972,91 @@ HTTP(S).

![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)

+### Job Configuration File
+
+As discussed previously in 'Evaluation Unit Executed by ReCodEx', the
+evaluation unit has the form of a job containing small tasks, each representing
+one piece of work executed by the worker. This implies that jobs have to be
+handed over from the frontend to the backend in some well-defined form. The
+best option for this is a configuration file describing a particular job. Such
+a configuration file is assembled by the frontend and parsed and executed by
+the backend, namely by the worker.
+
+There are many formats which can be used to represent the configuration. The
+ones worth considering are:
+
+- *XML* -- a broadly used general markup language which can be accompanied by a
+  DTD definition expressing and checking the structure of the file, so the
+  structure does not have to be validated by the application itself. However,
+  XML with its tags tends to be quite 'chatty' and verbose, which is not always
+  desirable, and with all its features it can be a bit heavy-weight.
+- *JSON* -- a notation which was developed to represent JavaScript objects. As
+  such it is quite simple; only key-value structures, arrays and primitive
+  values can be expressed. Structure and hierarchy of the data is denoted by
+  braces and brackets.
+- *INI* -- a very simple configuration format which can represent only
+  key-value pairs grouped into sections, which is not enough to capture a job
+  and the hierarchy of its tasks.
+- *YAML* -- a format very similar to JSON in its capabilities, with the small
+  difference that structure and hierarchy of the configuration are expressed by
+  indentation instead of braces. This makes YAML easily readable by both humans
+  and machines.
+- *specific format* -- a newly created format used just for job configurations.
+  The obvious drawback is that no parsers exist for it and they would have to
+  be written from scratch.
+
+Given the list of formats above, we decided to use YAML. Parsers exist for most
+programming languages and the format is easy enough to learn and understand.
+JSON would be another sensible choice, but in the end YAML seemed to be better.
+
+#### Configuration File Content
+
+@todo: discuss what should be in configuration: limits, dependencies, priorities... whatever
+
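+To give a more concrete idea of the chosen format, the following is a minimal
+sketch of how a job configuration written in YAML might look. The key names
+(`submission`, `tasks`, `priority`, `sandbox`, `limits`, ...) are only
+illustrative assumptions at this point; the authoritative structure is
+described in the 'Job configuration' appendix.
+
+```yaml
+# Illustrative sketch of a job configuration; the exact keys are assumptions,
+# see the 'Job configuration' appendix for the authoritative format.
+submission:
+    job-id: hello-world-job
+    hw-groups:
+        - group_1
+tasks:
+    - task-id: "compile"
+      priority: 1
+      fatal-failure: true
+      cmd:
+          bin: "/usr/bin/gcc"
+          args: ["solution.c", "-o", "solution"]
+    - task-id: "run"
+      priority: 2
+      dependencies: ["compile"]
+      cmd:
+          bin: "solution"
+      sandbox:
+          name: "isolate"
+          limits:
+              - hw-group-id: group_1
+                time: 2        # seconds
+                memory: 65536  # kilobytes
+```
+
+Even this small example shows the properties discussed above: the hierarchy of
+the job and its tasks is expressed purely by indentation, and the file remains
+reasonably readable for both the author and the parser.
+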
+#### Supplementary Files
+
+An interesting problem arises with supplementary files (e.g., inputs, sample
+outputs). There are two possible approaches: supplementary files can be
+downloaded either at the start of the execution or on demand during execution.
+
+If the files are downloaded at the beginning, execution has not really started
+at that point, so if there are network problems, the worker discovers them
+right away and can abort the execution without running a single task. A slight
+complication arises if some of the files need to have a particular name (e.g.
+the solution assumes that its input is named `input.txt`); in this scenario the
+downloaded files cannot be renamed at the beginning but only during execution,
+which is somewhat impractical and not easily visible to the authors of job
+configurations.
+
+The second solution, downloading the files on the fly, has the opposite
+problem: if there are network problems, the worker discovers them only during
+execution, possibly when almost the whole execution is done, which is not ideal
+if we care about wasted hardware resources. On the other hand, with this
+approach the users have quite fine-grained control of the execution flow and
+know exactly which files are available during execution, which is probably more
+appealing from the user perspective than the first solution. Based on that,
+downloading of supplementary files using 'fetch' tasks during execution was
+chosen and implemented.
+
+#### Job Variables
+
+Considering the fact that jobs can be executed by workers on different machines
+with specific settings, it is handy to have a mechanism in the job
+configuration which hides these worker-specific details, most notably the
+specific directory structure. For this purpose some kind of marks or signs can
+be used, most naturally in the form of broadly used variables.
+
+Variables in general can be used everywhere where configuration values (not
+keys) are expected. This implies that substitution should be done after parsing
+of the job configuration, not before. The only usage of variables considered so
+far is for directories within the worker, but this might be subject to change
+in the future.
+
+The final form of a variable is `${...}` where the triple dot stands for a
+textual description. This format was chosen because of the special dollar sign
+character, which cannot be used within paths of regular filesystems. The braces
+only delimit the textual description of the variable.
+
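+As an illustration of how the 'fetch' tasks and the directory variables fit
+together, the sketch below shows a hypothetical fetch task downloading a
+supplementary input file and an execution task referring to the worker
+directories through variables. The task keys and the variable names
+(`${SOURCE_DIR}`, `${EVAL_DIR}`) are illustrative assumptions, not the
+definitive configuration format.
+
+```yaml
+# Illustrative sketch only -- task keys and variable names are assumptions.
+- task-id: "fetch-input"
+  priority: 3
+  fatal-failure: true
+  cmd:
+      bin: "fetch"
+      args:
+          - "input.sample.txt"          # name of the file on the fileserver
+          - "${SOURCE_DIR}/input.txt"   # name the solution expects
+- task-id: "run-solution"
+  priority: 4
+  dependencies: ["fetch-input"]
+  cmd:
+      bin: "solution"
+  sandbox:
+      name: "isolate"
+      # ${EVAL_DIR} is substituted by the worker after the configuration is
+      # parsed, so the author does not need to know the real directory layout.
+      chdir: "${EVAL_DIR}"
+```
+
+The sketch also shows why substitution after parsing is sufficient: the
+variables only ever appear in places where the parser expects a value, never in
+keys.
+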
### Broker

The broker is responsible for keeping track of available workers and
@@ -1141,44 +1163,48 @@ kind of in-process messages. The ZeroMQ library which we already use provides
in-process messages that work on the same principles as network communication,
which is convenient and solves problems with thread synchronization.

-#### Evaluation
+#### Execution of Jobs

At this point we have worker with two internal parts listening one and
execution one. Implementation of first one is quite straightforward and clear.
So let us discuss what should be happening in execution subsystem.

-After successful arrival of job, worker has to prepare new execution
-environment, then solution archive has to be downloaded from fileserver and
-extracted. Job configuration is located within these files and loaded into
+After the job successfully arrives from the broker at the listening thread, it
+is immediately redirected to the execution thread. There the worker has to
+prepare a new execution environment, then download the solution archive from
+the fileserver and extract it. The job configuration is located within these
+files and is loaded into
internal structures and executed. After that, results are uploaded back to
fileserver. These steps are the basic ones which are really necessary for whole
execution and have to be executed in this precise order.

-#### Job Configuration
-
-Jobs as work units can quite vary and do completely different things, that means
-configuration and worker has to be prepared for this kind of generality.
-Configuration and its solution was already discussed above, implementation in
-worker is then quite also quite straightforward.
+The evaluation unit executed by ReCodEx and the job configuration were already
+discussed above. The conclusion was that jobs composed of small tasks will be
+used. The particular format of the actual job configuration can be found in the
+'Job configuration' appendix. The implementation of parsing these data and
+storing them in the worker is then quite straightforward.

Worker has internal structures to which loads and which stores metadata given
in configuration. Whole job is mapped to job metadata structure and tasks are
mapped to either external ones or internal ones (internal commands has to be
defined within worker), both are different whether they are executed in sandbox
-or as internal worker commands.
+or as internal worker commands.
+
+#### Task Execution Failure

Another division of tasks is by task-type field in configuration. This field can
have four values: initiation, execution, evaluation and inner. All was discussed
-and described above in configuration analysis. What is important to worker is
+and described above in the evaluation unit analysis. What is important to the worker is
how to behave if execution of task with some particular type fails.

There are two possible situations execution fails due to bad user solution or
due to some internal error. If execution fails on internal error solution
cannot be declared overly as failed. User should not be punished for bad configuration
-or some network error. This is where task types are useful. Generally
-initiation, execution and evaluation are tasks which are somehow executing code
+or some network error. This is where task types are useful.
+
+Initiation, execution and evaluation tasks are usually the ones executing code
which was given by users who submitted solution of exercise. If this kinds of
tasks fail it is probably connected with bad user solution and can be evaluated.
+
But if some inner task fails solution should be re-executed, in best case
scenario on different worker. That is why if inner task fails it is sent back to
broker which will reassign job to another worker. More on this subject should be
@@ -1214,45 +1240,6 @@ searching through this system should be easy. In addition if solutions of users
have access only to evaluation directory then they do not have access to
unnecessary files which is better for overall security of whole ReCodEx.

-#### Job Variables
-
-As mentioned above worker has job directories but users who are writing and
-managing job configurations do not know where they are (on some particular
-worker) and how they can be accessed and written into configuration. For this
-kind of task we have to introduce some kind of marks or signs which will
-represent particular folders. Marks or signs can have form broadly used
-variables.
-
-Variables can be used everywhere where filesystem paths are used within
-configuration file. This will solve problem with specific worker environment and
-specific hierarchy of directories. Final form of variables is `${...}` where
-triple dot is textual description. This format was used because of special
-dollar sign character which cannot be used within filesystem path, braces are
-there only to border textual description of variable.
-
-#### Supplementary Files
-
-Interesting problem is with supplementary files (inputs, sample outputs). There
-are two approaches which can be observed.
Supplementary files can be downloaded -either on the start of the execution or during execution. If the files are -downloaded at the beginning, execution does not really started at this point and -if there are problems with network worker will find it right away and can abort -execution without executing single task. Slight problems can arise if some of -the files needs to have same name (e.g. solution assumes that input is -`input.txt`), in this scenario downloaded files cannot be renamed at the -beginning but during execution which is somehow impractical and not easily -observed. - -Second solution of this problem when files are downloaded on the fly has quite -opposite problem, if there are problems with network, worker will find it during -execution when for instance almost whole execution is done, this is also not -ideal solution if we care about burnt hardware resources. On the other hand -using this approach users have quite advanced control of execution flow and know -what files exactly are available during execution which is from users -perspective probably more appealing then the first solution. Based on that, -downloading of supplementary files using 'fetch' tasks during execution was -chosen and implemented. - ### Sandboxing There are numerous ways how to approach sandboxing on different platforms, @@ -1306,10 +1293,12 @@ But designing sandbox only for specific environment is possible, namely for C# and .NET. CLR as a virtual machine and runtime environment has a pretty good security support for restrictions and separation which is also transferred to C#. This makes it quite easy to implement simple sandbox within C# but there are -not any well known general purpose implementations. As said in previous -paragraph implementing our own solution is out of scope of project. But C# -sandbox is quite good topic for another project for example term project for C# -course so it might be written and integrated in future. +not any well known general purpose implementations. + +As mentioned in previous paragraphs implementing our own solution is out of +scope of project. But C# sandbox is quite good topic for another project for +example term project for C# course so it might be written and integrated in +future. ### Fileserver @@ -1658,20 +1647,21 @@ for implementation of a website. There are two basic ways how to create a website these days: -- **server-side approach** - the actions of the user are processed on the server and the - HTML code with the results of the action is generated on the server and sent - back to the web browser of the user. The client does not handle any logic - (apart from rendering of the user interface and some basic user interaction) - and is therefore very simple. The server can use the API server for processing - of the actions so the business logic of the server can be very simple as well. - A disadvantage of this approach is that a lot of redundant data is transferred - across the requests although some parts of the content can be cached (e.g., - CSS files). This results in longer loading times of the website. +- **server-side approach** - the actions of the user are processed on the server + and the HTML code with the results of the action is generated on the server + and sent back to the web browser of the user. The client does not handle any + logic (apart from rendering of the user interface and some basic user + interaction) and is therefore very simple. 
The server can use the API server + for processing of the actions so the business logic of the server can be very + simple as well. A disadvantage of this approach is that a lot of redundant + data is transferred across the requests although some parts of the content can + be cached (e.g., CSS files). This results in longer loading times of the + website. - **server-side rendering with asynchronous updates (AJAX)** - a slightly different approach is to render the page on the server as in the previous case - but then execute the actions of the user asynchronously using the `XMLHttpRequest` - JavaScript functionality. Which creates a HTTP request and transfers only the - part of the website which will be updated. + but then execute the actions of the user asynchronously using the + `XMLHttpRequest` JavaScript functionality. Which creates a HTTP request and + transfers only the part of the website which will be updated. - **client-side approach** - the opposite approach is to transfer the communication with the API server and the rendering of the HTML completely from the server directly to the client. The client runs the code (usually @@ -1698,8 +1688,8 @@ modern web applications. We examined several frameworks which are commonly used to speed up the development of a web application. There are several open source options available with a large number of tools, tutorials, and libraries. From the many -options (Backbone, Ember, Vue, Cycle.js, ...) there are two main frameworks worth -considering: +options (Backbone, Ember, Vue, Cycle.js, ...) there are two main frameworks +worth considering: - **Angular 2** - it is a new framework which was developed by Google. This framework is very complex and provides the developer with many tools which @@ -1899,7 +1889,7 @@ sign. In the second column there is a list of assigned exercises with its deadlines. If you want to quickly get to the groups page you might want to use provided "Show group's detail" button. -### Join Group and Start Solving Assignments +### Join Group To be able to submit solutions you have to be a member of the right group. Each instance has its own group hierarchy, so you can choose only those within your @@ -1916,13 +1906,15 @@ clicking on "See group's page" link following with "Join group" link. in hierarchy and membership cannot be established by students themselves. Management of students in this type of groups is in the hands of supervisors. -On the group detail page there are multiple interesting things for you. The first -one is a brief overview containing the information describing the group, there is a list of -supervisors and also the hierarchy of the subgroups. The most important section -is the "Student's dashboard" section. This section contains the list of assignments and -the list of fellow students. If the supervisors of the group allowed students to see the -statistic of their fellow students then there will also be the number of -points each of the students has gained so far. +On the group detail page there are multiple interesting things for you. The +first one is a brief overview containing the information describing the group, +there is a list of supervisors and also the hierarchy of the subgroups. The most +important section is the "Student's dashboard" section. This section contains +the list of assignments and the list of fellow students. 
If the supervisors of +the group allowed students to see the statistic of their fellow students then +there will also be the number of points each of the students has gained so far. + +### Start Solving Assignments In the "Assignments" box on the group detail page there is a list of assigned exercises which students are supposed to solve. The assignments are displayed @@ -1954,6 +1946,8 @@ your browser which will be displayed in another dialog window. When the whole execution is finished then a "See the results" button will appear and you can look at the results of your solution. +### View Results of Submission + On the results detail page there are a lot of information. Apart from assignment description, which is not connected to your results, there is also the solution submitter name (supervisor can submit a solution on your behalf), further there @@ -2003,10 +1997,10 @@ available only for group administrators. On "Dashboard" page you can find "Groups you supervise" section. Here there are boxes representing your groups with the list of students attending course and -their points. Student names are clickable with redirection to the profile of the user -where further information about his/hers assignments and solution can be found. -To quickly jump onto groups page, use "Show group's detail" button at the bottom -of the matching group box. +their points. Student names are clickable with redirection to the profile of the +user where further information about his/hers assignments and solution can be +found. To quickly jump onto groups page, use "Show group's detail" button at +the bottom of the matching group box. ### Manage Group @@ -2330,8 +2324,8 @@ appear in "Groups hierarchy" box at the top of the page. On the instance details page, there is a box "Licences". On the first line, it shows it this instance has currently valid licence or not. Then, there are -multiple lines with all licences assigned to this instance. Each line consists of -a note, validity status (if it is valid or revoked by superadministrator) and +multiple lines with all licences assigned to this instance. Each line consists +of a note, validity status (if it is valid or revoked by superadministrator) and the last date of licence validity. A box "Add new licence" is used for creating new licences. Required fields are @@ -2472,9 +2466,9 @@ submission: ``` Basically it means, that the job _hello-world-job_ needs to be run on workers -that belong to the `group_1` hardware group . Reference files are downloaded -from the default location configured in API (such as -`http://localhost:9999/exercises`) if not stated explicitly otherwise. Job +that belong to the `group_1` hardware group . Reference files are downloaded +from the default location configured in API (such as +`http://localhost:9999/exercises`) if not stated explicitly otherwise. Job execution log will not be saved to result archive. Next the tasks have to be constructed under _tasks_ section. In this demo job, @@ -2622,9 +2616,9 @@ Broker implementation depends on several open-source C and C++ libraries. YAML format. - **boost-filesystem** -- Boost filesystem is used for managing logging directory (create if necessary) and parsing filesystem paths from strings as - written in the configuration of the broker. Filesystem operations will be included in - future releases of C++ standard, so this dependency may be removed in the - future. + written in the configuration of the broker. 
Filesystem operations will be + included in future releases of C++ standard, so this dependency may be + removed in the future. - **boost-program_options** -- Boost program options is used for parsing of command line positional arguments. It is possible to use POSIX `getopt` C function, but we decided to use boost, which provides nicer API and @@ -2662,9 +2656,9 @@ maintain backward compatibility). Fileserver stores its data in following structure: -- `./submissions//` -- folder that contains files submitted by users - (the solutions to the assignments of the student). `` is an identifier received from - the REST API. +- `./submissions//` -- folder that contains files submitted by users (the + solutions to the assignments of the student). `` is an identifier received + from the REST API. - `./submission_archives/.zip` -- ZIP archives of all submissions. These are created automatically when a submission is uploaded. `` is an identifier of the corresponding submission. @@ -2678,9 +2672,9 @@ Fileserver stores its data in following structure: ## Worker -The job of the worker is to securely execute a job according to its configuration and -upload results back for latter processing. After receiving an evaluation -request, worker has to do following: +The job of the worker is to securely execute a job according to its +configuration and upload results back for latter processing. After receiving an +evaluation request, worker has to do following: - download the archive containing submitted source files and configuration file - download any supplementary files based on the configuration file, such as test @@ -2772,13 +2766,13 @@ separate directory structure which is removed after finishing the job. The files are stored in local filesystem of the worker computer in a configurable location. The job is not restricted to use only specified -directories (tasks can do whatever is allowed on the target system), but it is +directories (tasks can do anything that is allowed by the system), but it is advised not to write outside them. In addition, sandboxed tasks are usually restricted to use only a specific (evaluation) directory. -The following directory structure is used for execution. The working -directory of the worker (root of the following paths) is shared for multiple instances on the -same computer. +The following directory structure is used for execution. The working directory +of the worker (root of the following paths) is shared for multiple instances on +the same computer. - `downloads/${WORKER_ID}/${JOB_ID}` -- place to store the downloaded archive with submitted sources and job configuration @@ -2804,7 +2798,7 @@ for comparison and exit code reflecting if the result is correct (0) of wrong (1). This interface lacks support for returning additional data by the judges, for -example similarity of the two files calculated as the edit distance of Levenshtein. +example similarity of the two files calculated as the Levenshtein edit distance. To allow passing these additional values an extended judge interface can be implemented: @@ -2843,16 +2837,18 @@ them are multi-platform, so both Linux and Windows builds are possible. Actual supported formats depends on installed packages on target system, but at least ZIP and TAR.GZ should be available. - **cppzmq** -- Cppzmq is a simple C++ wrapper for core ZeroMQ C API. It - basicaly contains only one header file, but its API fits into the object architecture of the worker. 
+ basicaly contains only one header file, but its API fits into the object + architecture of the worker. - **spdlog** -- Spdlog is small, fast and modern logging library. It is used for all of the logging, both system and job logs. It is highly customizable and configurable from the configuration of the worker. - **yaml-cpp** -- Yaml-cpp is used for parsing and creating text files in YAML - format. That includes the configuration of the worker, the configuration and the results of a job. + format. That includes the configuration of the worker, the configuration and + the results of a job. - **boost-filesystem** -- Boost filesystem is used for multi-platform manipulation with files and directories. However, these operations will be - included in future releases of C++ standard, so this dependency may be - removed in the future. + included in future releases of C++ standard, so this dependency may be removed + in the future. - **boost-program_options** -- Boost program options is used for multi-platform parsing of command line positional arguments. It is not necessary to use it, similar functionality can be implemented be ourselves, but this well known @@ -2911,15 +2907,15 @@ command (normally FINISHED) is received, then are permanently deleted. This caching mechanism was implemented because early testing shows, that first couple of messages are missed quite often. -Messages from the queue of the client are sent through corresponding WebSocket connection -via main event loop as soon as possible. This approach with separate queue per -connection is easy to implement and guarantees reliability and order of message -delivery. +Messages from the queue of the client are sent through corresponding WebSocket +connection via main event loop as soon as possible. This approach with separate +queue per connection is easy to implement and guarantees reliability and order +of message delivery. ## Cleaner -Cleaner component is tightly bound to the worker. It manages the cache folder of the worker, -mainly deletes outdated files. Every cleaner instance maintains one +Cleaner component is tightly bound to the worker. It manages the cache folder of +the worker, mainly deletes outdated files. Every cleaner instance maintains one cache folder, which can be used by multiple workers. This means on one server there can be numerous instances of workers with the same cache folder, but there should be only one cleaner instance. @@ -3179,8 +3175,8 @@ no empty frames (unles explicitly specified otherwise). Broker acts as server when communicating with worker. Listening IP address and port are configurable, protocol family is TCP. Worker socket is of DEALER type, broker one is ROUTER type. Because of that, very first part of every (multipart) -message from broker to worker must be target the socket identity of the worker (which is -saved on its **init** command). +message from broker to worker must be target the socket identity of the worker +(which is saved on its **init** command). #### Commands from Broker to Worker: @@ -3284,13 +3280,13 @@ capable to send corresponding credentials with each request. #### Worker Side -Workers comunicate with the file server in both directions -- they download -the submissions of the student and then upload evaluation results. Internally, worker is -using libcurl C library with very similar setup. 
In both cases it can verify -HTTPS certificate (on Linux against system cert list, on Windows against -downloaded one from CURL website during installation), support basic HTTP -authentication, offer HTTP/2 with fallback to HTTP/1.1 and fail on error -(returned HTTP status code is >=400). Worker have list of credentials to all +Workers comunicate with the file server in both directions -- they download the +submissions of the student and then upload evaluation results. Internally, +worker is using libcurl C library with very similar setup. In both cases it can +verify HTTPS certificate (on Linux against system cert list, on Windows against +downloaded one from CURL website during installation), support basic HTTP +authentication, offer HTTP/2 with fallback to HTTP/1.1 and fail on error +(returned HTTP status code is >=400). Worker have list of credentials to all available file servers in its config file. - download file -- standard HTTP GET request to given URL expecting file content @@ -3315,20 +3311,20 @@ with proper configuration. Relevant commands for communication with workers: successful upload returns JSON `{ "result": "OK" }` as body of returned page. If not specified otherwise, `zip` format of archives is used. Symbol `/` in API -description is root of the domain of the file server. If the domain is for example -`fs.recodex.org` with SSL support, getting input file for one task could look as -GET request to +description is root of the domain of the file server. If the domain is for +example `fs.recodex.org` with SSL support, getting input file for one task could +look as GET request to `https://fs.recodex.org/tasks/8b31e12787bdae1b5766ebb8534b0adc10a1c34c`. ### Broker - Monitor Communication -Broker communicates with monitor also through ZeroMQ over TCP protocol. Type of -socket is same on both sides, ROUTER. Monitor is set to act as server in this -communication, its IP address and port are configurable in the config of the monitor -file. ZeroMQ socket ID (set on the side of the monitor) is "recodex-monitor" and must be -sent as first frame of every multipart message -- see ZeroMQ ROUTER socket -documentation for more info. +Broker communicates with monitor also through ZeroMQ over TCP protocol. Type of +socket is same on both sides, ROUTER. Monitor is set to act as server in this +communication, its IP address and port are configurable in the config of the +monitor file. ZeroMQ socket ID (set on the side of the monitor) is +"recodex-monitor" and must be sent as first frame of every multipart message -- +see ZeroMQ ROUTER socket documentation for more info. Note that the monitor is designed so that it can receive data both from the broker and workers. The current architecture prefers the broker to do all the @@ -3467,12 +3463,12 @@ Message format: ### Web App - Web API Communication -The provided web application runs as a JavaScript process inside the browser of the user. -It communicates with the REST API on the server through the standard HTTP requests. -Documentation of the main REST API is in a separate -[document](https://recodex.github.io/api/) due to its extensiveness. The results are -returned encoded in JSON which is simply processed by the web application and -presented to the user in an appropriate way. +The provided web application runs as a JavaScript process inside the browser of +the user. It communicates with the REST API on the server through the standard +HTTP requests. 
Documentation of the main REST API is in a separate +[document](https://recodex.github.io/api/) due to its extensiveness. The results +are returned encoded in JSON which is simply processed by the web application +and presented to the user in an appropriate way.