diff --git a/Rewritten-docs.md b/Rewritten-docs.md
index efab2a9..6e4687d 100644
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -674,11 +674,73 @@ CORBA, RabbitMQ and why is ZeroMQ great
 
 ### Broker
 
-@todo: assigning of jobs to workers, which are possible algorithms, queues, which one was chosen
-
-@todo: how can jobs be sent over zeromq, mainly mention that files can be transported, but it is not feasible
-
-@todo: making action and reaction over zeromq more general and easily extensible, mention reactor and why is needed and what it solves
+The broker is responsible for keeping track of available workers and
+distributing jobs that it receives from the frontend among them.
+
+#### Worker management
+
+@todo initialization - broker is fixed, workers connect to it
+
+@todo heartbeating - workers send ping, the inverse is possible, too (doesn't
+really matter)
+
+#### Scheduling
+
+Jobs should be scheduled in a way that ensures that they will be processed
+without unnecessary waiting. This depends on the fairness of the scheduling
+algorithm (no worker machine should be overloaded).
+
+The design of such a scheduling algorithm is complicated by the requirements on
+the diversity of workers -- they can differ in operating systems, available
+software, computing power and many other aspects.
+
+We decided to keep the details of connected workers hidden from the frontend,
+which should lead to a better separation of responsibilities and flexibility.
+Therefore, the frontend needs a way of communicating its requirements on the
+machine that processes a job without knowing anything about the available
+workers. A key-value structure is suitable for representing such requirements.
+
+With respect to these constraints, and because the analysis and design of a more
+sophisticated solution was declared out of scope of our project assignment, a
+rather simple scheduling algorithm was chosen. The broker shall maintain a queue
+of available workers. When assigning a job, it traverses this queue and chooses
+the first machine that matches the requirements of the job. This machine is then
+moved to the end of the queue.
+
+The presented algorithm results in a simple round-robin load balancing strategy,
+which should be sufficient for small-scale deployments (such as a single
+university). However, with a large number of jobs, some workers will easily
+become overloaded. The implementation must allow for a simple replacement of the
+load balancing strategy so that this problem can be solved in the near future.
+
+#### Forwarding jobs
+
+Information about a job can be divided into two disjoint parts -- what the worker
+needs to know to process it and what the broker needs to forward it to the
+correct worker. It remains to be decided how this information will be
+transferred to its destination.
+
+It is technically possible to transfer all the data required by the worker at
+once through the broker. This package could contain submitted files, test
+data, requirements on the worker, etc. A drawback of this solution is that
+both submitted files and test data can be rather large. Furthermore, it is
+likely that test data would be transferred many times.
+
+Because of these facts, we decided to store data required by the worker using a
+shared storage space and only send a link to this data through the broker. This
+approach leads to more efficient network and resource utilization (the broker
+doesn't have to process data that it doesn't need), but also makes the job
+submission flow more complicated.
+
+#### Further requirements
+
+The broker can be viewed as a central point of the backend. While it has only
+two primary, closely related responsibilities, other requirements have arisen
+(forwarding messages about job evaluation progress back to the frontend) and
+will arise in the future. To facilitate such requirements, its architecture
+should make it simple to add new communication flows. It should also be as
+asynchronous as possible to enable efficient communication with external
+services, for example via HTTP.
 
 ### Worker
 
@@ -693,7 +755,7 @@ this in all kind of projects. This means that worker should be able to send ping
 messages even during execution. So worker has to be divided into two separate
 parts, the one which will handle communication with broker and the another which
 will execute jobs. The easiest solution is to have these parts in separate
-threads which somehow tightly communicates with each other. For inner process
+threads which somehow tightly communicates with each other. For inter process
 communication there can be used numerous technologies, from shared memory to
 condition variables or some kind of in-process messages. Already used library
 ZeroMQ is possible to provide in-process messages working on the same principles
@@ -763,7 +825,7 @@ represent particular folders. Marks or signs can have form of some kind of
 special strings which can be called variables. These variables then can be used
 everywhere where filesystems paths are used within configuration file. This will
 solve problem with specific worker environment and specific hierarchy of
-directories. Final form of variables is ${...} where triple dot is textual
+directories. Final form of variables is \${...} where triple dot is textual
 description. This format was used because of special dollar sign character which
 cannot be used within filesystem path, braces are there only to border textual
 description of variable.
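
The worker selection described in the *Scheduling* subsection of the Broker hunk (traverse the queue, take the first worker that matches the job's key-value requirements, move it to the end) can be illustrated with a minimal sketch. It is not the project's implementation. The names `Worker`, `Broker`, `headers` and `assign` are hypothetical, and treating "matches" as exact equality on every required key is an assumption, since the text does not define the matching rule.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical worker record; `headers` holds its key-value properties."""
    name: str
    headers: dict = field(default_factory=dict)


def matches(worker, requirements):
    """Assumed rule: every required key must be present with an equal value."""
    return all(worker.headers.get(key) == value
               for key, value in requirements.items())


class Broker:
    """Keeps a queue of available workers and assigns jobs round-robin."""

    def __init__(self, workers=()):
        self.queue = deque(workers)

    def assign(self, requirements):
        """Return the first matching worker and move it to the end of the
        queue, or return None when no connected worker matches."""
        chosen = next((w for w in self.queue if matches(w, requirements)), None)
        if chosen is not None:
            self.queue.remove(chosen)
            self.queue.append(chosen)
        return chosen


if __name__ == "__main__":
    broker = Broker([
        Worker("w1", {"os": "linux"}),
        Worker("w2", {"os": "windows"}),
        Worker("w3", {"os": "linux"}),
    ])
    # Two Linux workers alternate; the Windows worker is skipped.
    for _ in range(3):
        print(broker.assign({"os": "linux"}).name)
```

Keeping the selection inside a single `assign` method is one way to meet the stated requirement that the load balancing strategy be easy to replace: a different strategy would only need to provide the same method.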