From abe471833bba3dbce0dcdab2b60706f4ff3df20d Mon Sep 17 00:00:00 2001
From: Martin Polanka <PolankaMartin@gmail.com>
Date: Sun, 29 Jan 2017 16:18:46 +0100
Subject: [PATCH] Copy broker content into new doc

---
 Broker.md         | 72 -----------------------------------------------
 Rewritten-docs.md | 70 +++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 68 insertions(+), 74 deletions(-)
 delete mode 100644 Broker.md

diff --git a/Broker.md b/Broker.md
deleted file mode 100644
index 77b95b2..0000000
--- a/Broker.md
+++ /dev/null
@@ -1,72 +0,0 @@
-# Broker
-
-The broker is a central part of the ReCodEx backend that directs almost all 
-communication. It was designed to properly maintain heavy load of messages
-by making only small actions in main communication thread and asynchronous
-execution of other actions.
-
-## Description
-
-The broker's responsibilites are:
-
-- allowing workers to register themselves and keep track of their capabilities
-- tracking status of each worker and handle cases when they crash
-- accepting assignment evaluation requests from the frontend and forwarding them 
-  to workers
-- receiving job status information from workers and forward it to the frontend 
-  either via monitor or REST API
-- notifying the frontend of errors in the backend
-
-## Architecture
-
-The broker uses our ZeroMQ _reactor_ to bind events on sockets to handler classes. 
-There are currently two handlers -- one that handles the main functionality and 
-another one that sends status reports to the REST API asynchronously so that the 
-broker does not have to wait for HTTP requests which can take a lot of time, 
-especially when some kind of error happens on the server.
-
-### Worker registry
-
-The `worker_registry` class is used to store information about workers, their 
-status and the jobs in their queue. It can look up a worker using the headers 
-received with a request (a worker is considered suitable if and only if it 
-satisfies all the headers). The headers are arbitrary key-value pairs, which 
-are checked for equality by the broker. However, some headers require special 
-handling, namely `threads`, for which we check if the value in the request is 
-lesser than or equal to the value advertised by the worker, and `hwgroup`, for
-which we support requesting one of multiple hardware groups by listing multiple
-names separated with a `|` symbol (e.g. `group_1|group_2|group_3`.
-
-The registry also implements a basic load balancing algorithm -- the 
-workers are contained in a queue and whenever one of them receives a job, it is 
-moved to its end, which makes it less likely to receive another job soon.
-
-When a worker is assigned a job, it will not be assigned another one until we 
-receive a `done` message from it.
-
-### Error handling
-
-**Job failure** -- we recognize two ways a job can fail -- an internally and 
-externally. An internal failure is the worker's fault -- for example when it 
-cannot download a file needed for the evaluation for some reason. An external 
-error is for example when the job configuration is malformed. Note that we do not 
-consider a student entering an incorrect solution a job failure.
-
-Jobs that failed internally are reassigned until a limit on the amount of 
-reassingments (configurable with the `max_request_failures` option) is reached. 
-External failures are reported to the frontend immediately.
-
-**Worker failure** -- when a worker crash is detected, we attempt to reassign its 
-current job and also all the jobs from its queue. Because the current job might 
-be the reason of the crash, its reassignment is also counted towards the 
-`max_request_failures` limit (the counter is shared). If there is no worker that 
-could process a job (i.e. it cannot be reassigned), the job is reported as 
-failed to the frontend via REST API.
-
-**Broker failure** -- when the broker itself crashed and is restarted, workers 
-will reconnect automatically. However, all jobs in their queues are lost. If a 
-worker manages to finish a job and notifies the "new" broker, the report is 
-forwarded to the frontend. The same goes for external failures. Jobs that fail 
-internally cannot be reassigned, because the "new" broker does not know their 
-headers -- they are reported as failed immediately.
-
diff --git a/Rewritten-docs.md b/Rewritten-docs.md
index 75bf83d..ac952bc 100644
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -2656,10 +2656,76 @@ used.
 
 ## Broker
 
-@todo: gets stuff done, single point of failure and center point of ReCodEx universe
+The broker is a central part of the ReCodEx backend that directs almost all
+communication. It was designed to properly maintain heavy load of messages by
+making only small actions in main communication thread and asynchronous
+execution of other actions.
+
+The responsibilites of broker are:
+
+- allowing workers to register themselves and keep track of their capabilities
+- tracking status of each worker and handle cases when they crash
+- accepting assignment evaluation requests from the frontend and forwarding them
+  to workers
+- receiving job status information from workers and forward it to the frontend
+  either via monitor or REST API
+- notifying the frontend of errors in the backend
+
+### Architecture
+
+The broker uses our ZeroMQ _reactor_ to bind events on sockets to handler
+classes.  There are currently two handlers -- one that handles the main
+functionality and another one that sends status reports to the REST API
+asynchronously so that the broker does not have to wait for HTTP requests which
+can take a lot of time, especially when some kind of error happens on the
+server.
+
+#### Worker registry
+
+The `worker_registry` class is used to store information about workers, their
+status and the jobs in their queue. It can look up a worker using the headers
+received with a request (a worker is considered suitable if and only if it
+satisfies all the headers). The headers are arbitrary key-value pairs, which are
+checked for equality by the broker. However, some headers require special
+handling, namely `threads`, for which we check if the value in the request is
+lesser than or equal to the value advertised by the worker, and `hwgroup`, for
+which we support requesting one of multiple hardware groups by listing multiple
+names separated with a `|` symbol (e.g. `group_1|group_2|group_3`.
+
+The registry also implements a basic load balancing algorithm -- the workers are
+contained in a queue and whenever one of them receives a job, it is moved to its
+end, which makes it less likely to receive another job soon.
+
+When a worker is assigned a job, it will not be assigned another one until we
+receive a `done` message from it.
+
+#### Error handling
+
+**Job failure** -- we recognize two ways a job can fail -- an internally and
+ externally. An internal failure is the fault of worker -- for example when it
+ cannot download a file needed for the evaluation for some reason. An external
+ error is for example when the job configuration is malformed. Note that we do
+ not consider a student entering an incorrect solution a job failure.
+
+Jobs that failed internally are reassigned until a limit on the amount of
+reassingments (configurable with the `max_request_failures` option) is reached.
+External failures are reported to the frontend immediately.
+
+**Worker failure** -- when a worker crash is detected, we attempt to reassign
+ its current job and also all the jobs from its queue. Because the current job
+ might be the reason of the crash, its reassignment is also counted towards the
+ `max_request_failures` limit (the counter is shared). If there is no worker
+ that could process a job (i.e. it cannot be reassigned), the job is reported as
+ failed to the frontend via REST API.
+
+**Broker failure** -- when the broker itself crashed and is restarted, workers
+ will reconnect automatically. However, all jobs in their queues are lost. If a
+ worker manages to finish a job and notifies the "new" broker, the report is
+ forwarded to the frontend. The same goes for external failures. Jobs that fail
+ internally cannot be reassigned, because the "new" broker does not know their
+ headers -- they are reported as failed immediately.
 
 @todo: what to mention:
-	- job scheduling, worker queues
 	- API notification using curl, authentication using HTTP Basic Auth
 	- asynchronous resending progress messages