@@ -674,11 +674,73 @@ CORBA, RabbitMQ and why is ZeroMQ great
### Broker

@todo: assigning of jobs to workers, which are possible algorithms, queues, which one was chosen

The broker is responsible for keeping track of available workers and
distributing the jobs that it receives from the frontend among them.

@todo: how can jobs be sent over ZeroMQ, mainly mention that files can be transported, but it is not feasible

#### Worker management

@todo: making action and reaction over ZeroMQ more general and easily extensible, mention reactor, why it is needed and what it solves

@todo initialization - broker is fixed, workers connect to it

@todo heartbeating - workers send ping, the inverse is possible, too (doesn't
really matter)

#### Scheduling

Jobs should be scheduled so that they are processed without unnecessary
waiting. This depends on the fairness of the scheduling algorithm (no worker
machine should be overloaded).

The design of such a scheduling algorithm is complicated by the required
diversity of workers: they can differ in operating system, available
software, computing power and many other aspects.

We decided to keep the details of connected workers hidden from the frontend,
which should lead to a better separation of responsibilities and more
flexibility. Therefore, the frontend needs a way to communicate its
requirements on the machine that processes a job without knowing anything about
the available workers. A key-value structure is suitable for representing such
requirements.

Given these constraints, and because the analysis and design of a more
sophisticated solution was declared out of scope of our project assignment, a
rather simple scheduling algorithm was chosen. The broker maintains a queue
of available workers. When assigning a job, it traverses this queue and chooses
the first machine that matches the requirements of the job. This machine is then
moved to the end of the queue.

The presented algorithm results in a simple round-robin load balancing strategy,
which should be sufficient for small-scale deployments (such as a single
university). However, with a large number of jobs, some workers can easily
become overloaded. The implementation must allow the load balancing strategy to
be replaced easily so that this problem can be solved in the near future.

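The queue-based assignment described in this section can be sketched as follows. This is only an illustration under stated assumptions: the names `Worker`, `RoundRobinQueue`, `headers` and `assign` are hypothetical, and matching is assumed to mean exact equality of every required key-value pair.

```python
from collections import deque
from typing import Deque, Dict, Optional


class Worker:
    """A connected worker described by key-value headers (hypothetical structure)."""

    def __init__(self, name: str, headers: Dict[str, str]) -> None:
        self.name = name
        self.headers = headers

    def matches(self, requirements: Dict[str, str]) -> bool:
        # Assumption: a worker is suitable when every required key
        # is present with exactly the required value.
        return all(self.headers.get(key) == value
                   for key, value in requirements.items())


class RoundRobinQueue:
    """Queue of available workers; the chosen worker moves to the end."""

    def __init__(self) -> None:
        self._workers: Deque[Worker] = deque()

    def add(self, worker: Worker) -> None:
        self._workers.append(worker)

    def assign(self, requirements: Dict[str, str]) -> Optional[Worker]:
        # Traverse a snapshot of the queue from the front and take the first match.
        for worker in list(self._workers):
            if worker.matches(requirements):
                self._workers.remove(worker)
                self._workers.append(worker)
                return worker
        return None  # no connected worker satisfies the requirements


queue = RoundRobinQueue()
queue.add(Worker("w1", {"os": "linux"}))
queue.add(Worker("w2", {"os": "linux"}))
queue.add(Worker("w3", {"os": "windows"}))

# Three consecutive jobs that all require a Linux machine.
assigned = [queue.assign({"os": "linux"}).name for _ in range(3)]
unmatched = queue.assign({"os": "freebsd"})
```

Because the frontend only supplies the requirements dictionary, a different load balancing strategy could be swapped in by providing another class with the same `assign` method.
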
#### Forwarding jobs

Information about a job can be divided into two disjoint parts: what the worker
needs to know to process it and what the broker needs to forward it to the
correct worker. It remains to be decided how this information will be
transferred to its destination.

It is technically possible to transfer all the data required by the worker at
once through the broker. This package could contain submitted files, test
data, requirements on the worker, etc. A drawback of this solution is that
both submitted files and test data can be rather large. Furthermore, it is
likely that the same test data would be transferred many times.

For these reasons, we decided to store the data required by the worker in a
shared storage space and to send only a link to this data through the broker.
This approach leads to more efficient use of the network and other resources
(the broker doesn't have to process data that it doesn't need), but it also
makes the job submission flow more complicated.

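The split between routing information and worker information can be illustrated with a small sketch. Everything named here is an assumption made for the example: the `JobRequest` fields, the `eval` command frame and the storage URL are hypothetical and do not describe the actual message format.

```python
from typing import Dict, List, NamedTuple, Tuple


class JobRequest(NamedTuple):
    """A job as submitted by the frontend (hypothetical structure)."""
    job_id: str
    requirements: Dict[str, str]  # read by the broker to pick a worker
    archive_url: str              # link to the job data in shared storage


def split_for_forwarding(request: JobRequest) -> Tuple[Dict[str, str], List[bytes]]:
    """Separate what the broker needs from what is forwarded to the worker."""
    routing_info = dict(request.requirements)
    # Only a short identifier and a link travel through the broker;
    # the worker downloads the actual files from the shared storage.
    worker_frames = [b"eval", request.job_id.encode(), request.archive_url.encode()]
    return routing_info, worker_frames


request = JobRequest(
    job_id="job-1",
    requirements={"os": "linux"},
    archive_url="http://storage.example/job-1.zip",
)
routing_info, worker_frames = split_for_forwarding(request)
```

The forwarded message stays a few dozen bytes long regardless of how large the submitted files or test data are.
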
#### Further requirements

The broker can be viewed as the central point of the backend. While it has only
two primary, closely related responsibilities, other requirements have arisen
(forwarding messages about job evaluation progress back to the frontend) and
more will arise in the future. To accommodate such requirements, its
architecture should make it simple to add new communication flows. It should
also be as asynchronous as possible to enable efficient communication with
external services, for example via HTTP.

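One way to keep new communication flows cheap to add is a table that maps a message type to its handler. The sketch below is only an illustration of that idea; the `MessageDispatcher` class and the `progress` message type are hypothetical names, not part of the described design.

```python
from typing import Callable, Dict, List

Handler = Callable[[List[str]], None]


class MessageDispatcher:
    """Maps a message type (the first frame) to a handler (hypothetical sketch)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, message_type: str, handler: Handler) -> None:
        # Adding a new communication flow means registering one more handler.
        self._handlers[message_type] = handler

    def dispatch(self, frames: List[str]) -> bool:
        handler = self._handlers.get(frames[0]) if frames else None
        if handler is None:
            return False  # unknown or empty messages are ignored in this sketch
        handler(frames[1:])
        return True


progress_log: List[List[str]] = []
dispatcher = MessageDispatcher()
dispatcher.register("progress", progress_log.append)

handled = dispatcher.dispatch(["progress", "job-1", "50%"])
ignored = dispatcher.dispatch(["unknown"])
```
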
### Worker
@@ -693,7 +755,7 @@ this in all kind of projects. This means that worker should be able to send ping
messages even during execution. The worker therefore has to be divided into two
separate parts: one that handles communication with the broker and another that
executes jobs. The easiest solution is to run these parts in separate
threads that communicate closely with each other. Numerous techniques can be
used for communication between these threads, from shared memory to
condition variables or some kind of in-process messages. The ZeroMQ library,
which is already used, can provide in-process messages working on the same principles
@@ -763,7 +825,7 @@ represent particular folders. Marks or signs can have form of some kind of
special strings, which can be called variables. These variables can then be used
everywhere filesystem paths are used within the configuration file. This
solves the problem of worker-specific environments and directory
hierarchies. The final form of a variable is \${...}, where the triple dot stands
for a textual description. This format was chosen because the dollar sign is a
special character that rarely appears in filesystem paths; the braces only
delimit the textual description of the variable.
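Expanding such variables in a configured path could look roughly like the sketch below. It is a minimal illustration only: the function `expand_variables`, the variable name `EVAL_DIR` and the directory used are hypothetical, and unknown variables are assumed to be an error.

```python
import re
from typing import Mapping

# Matches ${NAME}, where NAME is any non-empty text without a closing brace.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def expand_variables(path: str, variables: Mapping[str, str]) -> str:
    """Replace each ${NAME} in path with its value; unknown names raise KeyError."""
    return _VARIABLE.sub(lambda match: variables[match.group(1)], path)


expanded = expand_variables(
    "${EVAL_DIR}/output.txt",
    {"EVAL_DIR": "/tmp/worker/eval"},
)
```

The same configuration file can then be reused on workers with different directory layouts, because only the variable values differ.
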