@@ -866,10 +866,24 @@ distributing jobs that it receives from the frontend between them.
#### Worker management
It is intended for the broker to be a fixed part of the backend infrastructure
to which workers connect at will. Thanks to this design, workers can be added
and removed when necessary (and possibly in an automated fashion) without
changing the configuration of the broker. An alternative solution would be to
configure a list of workers before startup, thus making them passive in the
communication (in the sense that they would just wait for incoming jobs
instead of connecting to the broker). However, this approach comes with a
notable administration overhead -- in addition to starting a worker, the
administrator would have to update the worker list.

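The dynamic registration described above can be sketched as follows. This is a
minimal illustration only; the class, message, and header names are
hypothetical and not taken from the actual implementation:

```python
class Broker:
    """Maintains a dynamic registry of workers -- no list is configured
    up front. Workers are the active party: they connect and announce
    themselves, so they can be added and removed at runtime without
    touching the broker's configuration. (All names are illustrative.)"""

    def __init__(self):
        self.workers = {}  # worker id -> metadata (e.g. supported runtimes)

    def on_init(self, worker_id, headers):
        # An unknown worker becomes eligible for jobs immediately;
        # a known worker re-registering just refreshes its metadata.
        self.workers[worker_id] = headers

    def on_disconnect(self, worker_id):
        # Removal is equally configuration-free.
        self.workers.pop(worker_id, None)


broker = Broker()
broker.on_init("worker-1", {"env": "c-gcc-linux"})
broker.on_init("worker-2", {"env": "python3"})
broker.on_disconnect("worker-1")  # only worker-2 remains registered
```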
Worker management must also take into account the possibility of worker
disconnection, whether caused by a network failure, a software failure, or
termination. A common way to detect such events in distributed systems is to
periodically send short messages to other nodes and expect a response. When
these messages stop arriving, we presume that the other node has encountered a
failure. Either the broker or the workers could be made responsible for
initiating these exchanges; the choice seems to make no practical difference.
We decided that the workers will be the active party that initiates the
exchange.

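This failure-detection scheme can be sketched as a broker-side monitor that
records the time of each worker's last ping and presumes failure after a few
silent intervals. The interval and threshold values below are illustrative
assumptions, not taken from any particular implementation:

```python
class HeartbeatMonitor:
    """Broker-side liveness tracking for worker-initiated pings.

    A worker is presumed to have failed once more than `liveness`
    ping intervals pass without a message from it."""

    def __init__(self, interval=1.0, liveness=3):
        self.timeout = interval * liveness
        self.last_seen = {}  # worker id -> timestamp of last ping

    def on_ping(self, worker_id, now):
        self.last_seen[worker_id] = now

    def expired(self, now):
        # Workers whose pings stopped arriving are presumed failed.
        return [w for w, t in self.last_seen.items() if now - t > self.timeout]


monitor = HeartbeatMonitor(interval=1.0, liveness=3)
monitor.on_ping("worker-1", now=0.0)
monitor.on_ping("worker-2", now=2.5)
monitor.expired(now=4.0)  # -> ["worker-1"]; worker-2 pinged recently enough
```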
#### Scheduling