From 9c82a9240fcfd26c22f3ea00d05cabe9ef327ca1 Mon Sep 17 00:00:00 2001
From: Petr Stefan
Date: Wed, 28 Dec 2016 17:31:54 +0100
Subject: [PATCH 1/2] Update

---
 Rewritten-docs.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/Rewritten-docs.md b/Rewritten-docs.md
index fccccc8..1357a53 100644
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -279,6 +279,21 @@
 facts, it is clear that the new system has to be written from scratch. This
 implies, that only subset of all features will be implemented in the first
 version, the others following later.

+The gathered features are categorized by priority for the whole system. The
+highest priority is the core functionality similar to the current CodEx. It is
+a baseline for being usable in a production environment, while the new design
+allows further development with ease. On top of that, most of the ideas from
+the faculty staff belong to the second priority bucket, which will be
+implemented as part of the project. The most advanced tasks in this category
+are an advanced low-level evaluation configuration format, the use of modern
+tools, connections to university systems and merging separate system instances
+into a single one. The remaining tasks are scheduled for the releases
+following a successful project defense. Namely, these are a high-level
+exercise evaluation configuration with a user-friendly interface for common
+exercise types, SIS integration (once an API is available on their side) and a
+command-line submit tool. Plagiarism detection is not likely to be part of any
+release in the near future unless someone else provides the detection engine;
+the problem is too hard to be solved within the scope of this project.
+
 The new project is **ReCodEx -- ReCodEx Code Examiner**. The name should point
 to CodEx, previous evaluation solution, but also reflect new approach to solve
 issues. **Re** as part of the name means redesigned, rewritten, renewed or

From bbecd0e3f5cc67863e2cdd790980d77a9bf36c8b Mon Sep 17 00:00:00 2001
From: Martin Polanka
Date: Wed, 28 Dec 2016 18:08:21 +0100
Subject: [PATCH 2/2] Worker and execution - impl analysis... again?

---
 Rewritten-docs.md | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/Rewritten-docs.md b/Rewritten-docs.md
index 1194598..8353aca 100644
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -568,15 +568,13 @@
 ZeroMQ is able to provide in-process messages working on the same principles
 as network communication, which is quite handy and solves problems with thread
 synchronization and such.

-At this point we have worker with two internal parts listening one and execution
-one. Implementation of first one is quite straighforward and clear. So lets
-discuss what should be happening in execution subsystem...
+At this point we have a worker with two internal parts, a listening one and an executing one. The implementation of the first one is quite straightforward and clear, so let us discuss what should be happening in the execution subsystem.

 Jobs as work units can vary quite a lot and do completely different things, which means that both the configuration and the worker have to be prepared for this kind of generality. The configuration and its design were already discussed above; the implementation in the worker is then quite straightforward. The worker has internal structures into which it loads, and in which it stores, the metadata given in the configuration. The whole job is mapped to a job metadata structure and the tasks are mapped to either external or internal ones (internal commands have to be defined within the worker); the two kinds differ in whether they are executed in a sandbox or as internal worker commands.
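
To make this mapping a little more concrete, here is a minimal C++ sketch of what such internal structures could look like. It is only an illustration under assumed names; `job_metadata`, `task_metadata`, `task_type` and their fields are not taken from the actual worker sources:

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

/// Kind of a task as given by the job configuration: external tasks are run
/// inside a sandbox, internal ones are commands implemented directly by the
/// worker (for example fetching or copying files).
enum class task_type { external_task, internal_command };

/// Metadata of a single task loaded from the job configuration.
struct task_metadata {
    std::string name;                      // unique identifier of the task
    std::vector<std::string> dependencies; // tasks that have to finish first
    task_type type;                        // sandboxed binary or internal command
    std::string binary;                    // executed binary or internal command name
    std::vector<std::string> arguments;    // command line arguments
    std::map<std::string, std::string> limits; // sandbox limits, external tasks only
};

/// Metadata of the whole job; tasks are kept as loaded from the configuration
/// and later ordered according to their dependencies.
struct job_metadata {
    std::string job_id;
    std::vector<std::shared_ptr<task_metadata>> tasks;
};
```

The point of the sketch is only that the configuration is parsed once into plain data structures and the execution part of the worker then works solely with these structures.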

-@todo: complete paragraph above... execution of job on worker, how it is done,
-what steps are necessary and general for all jobs
+@todo: maybe describe folders within execution and what they can be used for?

-@todo: how can inputs and outputs (and supplementary files) be handled (they can
-be downloaded on start of execution, or during...)
+After the successful arrival of a job, the worker has to prepare a new execution environment, then the solution archive has to be downloaded from the fileserver and extracted. The job configuration is located within these files; it is loaded into the internal structures and executed. After that, the results are uploaded back to the fileserver. These steps are the basic ones which are really necessary for the whole execution and they have to be performed in this precise order.
+
+An interesting problem arises with supplementary files (inputs, sample outputs), for which two approaches can be considered: the files can be downloaded either at the start of the execution or during it. If the files are downloaded at the beginning, the execution has not really started yet, so if there are network problems the worker finds out right away and can abort the execution without running a single task. A slight complication arises if several of the files need to have the same name (e.g. the solution assumes that its input is always `input.txt`); in that scenario the downloaded files cannot be given their final names at the beginning, only during the execution, which is somewhat impractical and not easily observable. The second option, downloading the files on the fly, has the opposite problem: network failures are discovered only during the execution, possibly when almost the whole execution is done, which is not ideal either if we care about wasted hardware resources. On the other hand, with this approach users have quite fine-grained control of the execution flow and know exactly which files are available at which point of the execution, which is probably more appealing from the users' perspective than the first solution. Based on that, downloading of supplementary files using 'fetch' tasks during the execution was chosen and implemented. As described in the fileserver section, stored supplementary files have special filenames which reflect the hashes of their content. As such there are no
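
To make the chosen approach more tangible, the following is a minimal C++ sketch of what a 'fetch' step inside the execution flow described above (prepare the environment, download and extract the submission, run the tasks, upload the results) might do. Everything here is an assumption for illustration: the function and parameter names are made up, and the per-worker cache keyed by the content hash is a natural consequence of the hash-based filenames rather than something stated in this section.

```cpp
#include <filesystem>
#include <functional>
#include <string>

namespace fs = std::filesystem;

/// Illustrative sketch of a 'fetch' task: the supplementary file is identified
/// by the hash of its content (which is how the fileserver names it) and is
/// made available in the job directory under the name the solution expects,
/// e.g. "input.txt". Keeping a per-worker cache keyed by the hash means each
/// file needs to be transferred from the fileserver at most once.
void fetch_supplementary_file(
    const std::string &content_hash,
    const fs::path &worker_cache_dir,
    const fs::path &job_dir,
    const std::string &expected_name,
    // How the bytes travel from the fileserver to disk is out of scope for
    // this sketch, so the actual transfer is injected as a callable.
    const std::function<void(const std::string &, const fs::path &)> &download)
{
    const fs::path cached = worker_cache_dir / content_hash; // cache entry keyed by hash
    if (!fs::exists(cached)) {
        download(content_hash, cached); // first use of this file on this worker
    }
    // Copying under the name requested by the task resolves the clash between
    // the hash-based names in the storage and the fixed names (such as
    // "input.txt") that the tested solution expects.
    fs::copy_file(cached, job_dir / expected_name,
                  fs::copy_options::overwrite_existing);
}
```

This reflects the trade-off described above: the download cost is paid lazily during the execution, but the user keeps control over which file appears under which name at which step, and the cached copy itself never has to be renamed.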