# Worker
## Description
The worker's job is to securely execute submitted assignments and possibly
evaluate the results against model solutions provided by the exercise author.
After receiving an evaluation request, the worker has to go through the
following steps (a rough sketch follows the list):
- download the archive containing the submitted source files and the job
configuration file
- download any supplementary files referenced by the configuration, such as test
inputs or helper programs (this is done on demand, using a `fetch` command
in the assignment configuration)
- evaluate the submission according to the job configuration
- send progress messages back to the broker during the evaluation
- upload the results of the evaluation to the fileserver
- notify the broker that the evaluation has finished
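To make the sequence concrete, here is a minimal sketch of handling one
request. Every function name in it is a hypothetical placeholder rather than
the worker's real API, and the URLs are made up; the sketch only marks where
each step from the list above happens.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical placeholders -- not the worker's real components.
std::string download_archive(const std::string &url) { return "/tmp/job-dir"; }
void fetch_supplementary_file(const std::string &hash) {}
void send_progress(const std::string &message) { std::cout << message << "\n"; }
bool run_evaluation(const std::string &job_dir) { return true; }
void upload_results(const std::string &job_dir) {}
void notify_broker_finished(const std::string &job_id) {}

// Rough outline of handling a single evaluation request.
void handle_request(const std::string &job_id, const std::string &archive_url,
                    const std::vector<std::string> &supplementary_hashes)
{
    // 1. download the archive with the sources and the job configuration
    std::string job_dir = download_archive(archive_url);
    // 2. fetch supplementary files (test inputs, helper programs) on demand
    for (const auto &hash : supplementary_hashes)
        fetch_supplementary_file(hash);
    // 3. evaluate according to the job configuration, reporting progress
    send_progress("evaluation started");
    bool success = run_evaluation(job_dir);
    send_progress(success ? "evaluation passed" : "evaluation failed");
    // 4. upload detailed results and tell the broker we are done
    upload_results(job_dir);
    notify_broker_finished(job_id);
}

int main() {
    handle_request("job-1", "http://fileserver/submission-archive.zip", {});
}
```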
This information is sent to the broker on startup using the `init` command.
### Internal communication
The worker is logically divided into three parts:
- **Listener** - communicates with the broker through
[ZeroMQ](http://zeromq.org/). On startup, it introduces itself to the broker.
Then it receives new jobs, passes them to the **evaluator** part and sends
back results and progress reports.
- **Evaluator** - gets jobs from the **listener** part, evaluates them (possibly
in a sandbox) and notifies the other part when the evaluation ends. The
**evaluator** also communicates with the fileserver, downloads supplementary
files and uploads detailed results.
- **Progress callback** - receives information about the progress of an
evaluation from the evaluator and forwards it to the broker.
These parts run in separate threads of the same process and communicate through
ZeroMQ in-process sockets. An alternative approach would be a shared memory
region with exclusive access, but messaging is generally considered safer.
Shared memory has to be used very carefully because of race conditions when
reading and writing concurrently. Also, the messages inside the worker are
small, so copying data between threads adds no significant overhead. This
multi-threaded design allows the worker to keep sending `ping` messages even
when it is processing a job.
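The sketch below is not the worker's actual code; it only illustrates the
pattern of two threads of one process exchanging a small message over a ZeroMQ
`inproc` PAIR socket (the endpoint name and message text are made up).

```cpp
#include <zmq.h>
#include <cstdio>
#include <cstring>
#include <thread>

int main() {
    // One ZeroMQ context shared by both threads; each thread owns its socket.
    void *ctx = zmq_ctx_new();

    // "Listener" side: bind the in-process endpoint before the other thread
    // connects (required for inproc transports in older libzmq versions).
    void *listener = zmq_socket(ctx, ZMQ_PAIR);
    zmq_bind(listener, "inproc://progress");

    // "Evaluator" side: runs in its own thread and pushes a progress update.
    std::thread evaluator([ctx]() {
        void *progress = zmq_socket(ctx, ZMQ_PAIR);
        zmq_connect(progress, "inproc://progress");
        const char *msg = "TASK completed";
        zmq_send(progress, msg, strlen(msg), 0);
        zmq_close(progress);
    });

    // Receive the update; the real worker would forward it to the broker.
    char buf[64] = {0};
    int n = zmq_recv(listener, buf, sizeof(buf) - 1, 0);
    if (n >= 0)
        printf("progress update: %s\n", buf);

    evaluator.join();
    zmq_close(listener);
    zmq_ctx_destroy(ctx);
}
```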
### File management
The messages sent by the broker to assign jobs to workers are rather simple -
they don't contain any files, only a URL of an archive with the job
configuration. When processing the job, it may also be necessary to fetch
supplementary files such as helper scripts or test inputs and outputs.
Supplementary files are addressed using hashes of their content, which allows
for simple caching. Requested files are downloaded into the cache on demand.
This mechanism is hidden from the job evaluator, which depends on a
`file_manager_interface` instance. Because the filesystem cache can be shared
among multiple workers, cleaning functionality is implemented by a separate
Cleaner program that should be set up to run periodically.
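As an illustration of hash-addressed caching (this is not the actual
`file_manager_interface` API; the function names, cache layout and paths below
are assumptions made for the example), a lookup might work roughly like this:

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical downloader -- stands in for the worker's fileserver client.
void download_from_fileserver(const std::string &hash, const fs::path &dest) {
    // Placeholder: the real worker would fetch the file over HTTP.
    std::ofstream(dest) << "placeholder content for " << hash;
}

// Files are addressed by a hash of their content, so the hash can serve
// directly as the cache key and file name.
fs::path get_file(const std::string &hash,
                  const fs::path &cache_dir,
                  const fs::path &job_dir)
{
    fs::path cached = cache_dir / hash;
    if (!fs::exists(cached)) {
        // Cache miss -- fetch from the fileserver and store under the hash.
        download_from_fileserver(hash, cached);
    }
    // Give the job its own copy; the shared cache itself is cleaned
    // independently (by the Cleaner program running periodically).
    fs::path local = job_dir / hash;
    fs::copy_file(cached, local, fs::copy_options::overwrite_existing);
    return local;
}

int main() {
    fs::create_directories("/tmp/worker-cache");
    fs::create_directories("/tmp/job");
    // The hash value is made up for illustration.
    get_file("0123abcd-example-hash", "/tmp/worker-cache", "/tmp/job");
}
```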
### Running student submissions
