@@ -9,15 +9,24 @@ Red connections are through ZeroMQ sockets, Blue are through WebSockets and Gree
## Internal worker communication
Communication between the two worker threads is split into two separate parts, each one holding dedicated connection line. These internal lines are realized by ZeroMQ inproc PAIR sockets. For this section assume that the thread of the worker which communicates with broker is called _listening thread_ and the other one, which is evaluating incoming jobs is called _job thread_. _Listening thread_ is at both cases server (here is called `bind()` method), but because of ZeroMQ function it's not much important (`connect()` call in clients can precede server `bind()` call with no issue).
Communication between the two worker threads is split into two separate parts,
each with a dedicated connection line. These internal lines are realized by
ZeroMQ inproc PAIR sockets. In this section we assume that the thread of the
worker which communicates with the broker is called _listening thread_ and the
other one, which evaluates incoming jobs, is called _job thread_. _Listening
thread_ is the server in both cases (it is the one that calls the `bind()`
method), but because of how ZeroMQ works, this distinction matters little (a
`connect()` call in a client can precede the server's `bind()` call with no
issue).
### Main communication
Main communication is on `inproc://jobs` sockets. _Listening thread_ is waiting for any messages (from broker, jobs and progress sockets) and handle incoming requests properly.
Main communication is on `inproc://jobs` sockets. _Listening thread_ waits for
messages (from the broker, jobs and progress sockets) and passes incoming
requests to the _job thread_, which handles them.
Commands from _listening thread_ to _job thread_:
- **eval** - evaluate a job. Requires 3 arguments:
- **eval** - evaluate a job. Requires 3 message frames:
- `job_id` - identifier of this job (in ASCII representation -- we avoid endianness issues and also
support alphabetic ids)
- `job_url` - URI location of archive with job configuration and submitted source code
@@ -25,7 +34,7 @@ Commands from _listening thread_ to _job thread_:
Commands from _job thread_ to _listening thread_:
- **done** - notifying of finished job. Requires 2 arguments:
- `result` - response result, one of "OK" and "ERR"
@@ -74,13 +83,25 @@ Commands from worker to broker:
## Broker - Monitor communication
Broker communicates with monitor also through ZeroMQ over TCP protocol. Type of socket is same on both sides, ROUTER. Monitor is set as server in this communication, it's IP address and port are configurable in monitor's config file. ZeroMQ socket ID (set on monitor's side) is "recodex-monitor" and must be sent as first frame of every multipart message - see ZeroMQ ROUTER socket documentation for more info.
Broker also communicates with monitor through ZeroMQ over TCP. The socket type
is the same on both sides, ROUTER. Monitor acts as the server in this
communication; its IP address and port are configurable in monitor's config
file. The ZeroMQ socket ID (set on monitor's side) is "recodex-monitor" and must
be sent as the first frame of every multipart message; see the ZeroMQ ROUTER
socket documentation for more info.
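
The identity requirement can be illustrated without a live socket. The following
is a minimal sketch, assuming that a multipart message is modeled as a list of
byte strings (the form that pyzmq's `send_multipart()` accepts). The helper name
`to_monitor_frames` and the sample payload frames are hypothetical and are not
part of the protocol described here.

```python
MONITOR_IDENTITY = b"recodex-monitor"  # socket ID set on the monitor's side


def to_monitor_frames(*payload):
    """Prepend the monitor identity so a ROUTER socket can route the message.

    Payload items may be str or bytes; str is encoded as UTF-8 (an assumption).
    """
    frames = [MONITOR_IDENTITY]
    for part in payload:
        frames.append(part if isinstance(part, bytes) else str(part).encode("utf-8"))
    return frames


# Hypothetical payload frames, used only to show the resulting layout.
frames = to_monitor_frames("some-command", "some-argument")
```

Keeping the identity in a single helper means that no sender can forget the
leading frame, which a ROUTER socket would otherwise interpret as the
destination of the message.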
Monitor is treated somehow as optional part of whole solution, so no special effort on communication realibility was made.
Note that the monitor is designed so that it can receive data both from the
broker and workers. The current architecture prefers the broker to do all the
communication so that the workers don't have to know too many network services.
Monitor is treated as a somewhat optional part of the whole solution, so no
special effort was made to ensure communication reliability.
Commands from monitor to broker:
There are none commands yet. Any message from monitor to broker is logged and discarded.
Because there is no need for the monitor to communicate with the broker, there
are no commands so far. Any message from monitor to broker is logged and
discarded.
Commands from broker to monitor:
@@ -89,11 +110,13 @@ Commands from broker to monitor:
## Broker - Frontend communication
Broker communicates with frontend through ZeroMQ connection over TCP. Socket type on broker side is ROUTER, on frontend part it's REQ. Broker has server role, his IP address and port is configurable in frontend.
Broker communicates with frontend through a ZeroMQ connection over TCP. The
socket type on the broker side is ROUTER, on the frontend side it is REQ. Broker
acts as a server; its IP address and port are configurable in frontend.
Commands from frontend to broker:
- **eval** - evaluate a job. Requires 3 arguments:
- **eval** - evaluate a job. Requires 4 frames:
- `job_id` - identifier of this job (in ASCII representation -- we avoid endianness issues and also
support alphabetic ids)
- `header` - additional header describing worker capabilities. Format must be `header_name=value`, every header shall be in a separate message frame. There is no maximum limit on number of headers.
@@ -104,8 +127,10 @@ Commands from frontend to broker:
Commands from broker to frontend (all are responses to **eval** command):
- **accept** - broker is capable of routing request to a worker
- **reject** - broker can't handle this job (for example when the requirements specified by the headers
- **reject** - broker can't handle this job (for example when the requirements
cannot be met)
specified by the headers cannot be met). There are (rare) cases when the
broker finds that it cannot handle the job after it's been confirmed. In such
cases it uses the frontend REST API to mark the job as failed.
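
The frame conventions on the frontend side can be sketched in a few lines. This
is an illustration under stated assumptions, not the frontend's actual code: the
function names `encode_job_id`, `encode_headers`, `decode_header` and
`is_accepted` are hypothetical, the sample header names are made up, and the
broker's reply is assumed to carry the command name (`accept` or `reject`) in
its first frame. Only the `job_id` and `header` frames are covered; the
remaining frames of **eval** are left out.

```python
def encode_job_id(job_id):
    """Encode a job id in ASCII, so numeric and alphabetic ids share one format."""
    return str(job_id).encode("ascii")


def encode_headers(headers):
    """Turn a dict into one `header_name=value` frame per header."""
    frames = []
    for name, value in headers.items():
        if "=" in name:
            raise ValueError(f"header name must not contain '=': {name!r}")
        frames.append(f"{name}={value}".encode("utf-8"))
    return frames


def decode_header(frame):
    """Split a `header_name=value` frame at the first '='."""
    name, sep, value = frame.decode("utf-8").partition("=")
    if not sep or not name:
        raise ValueError(f"malformed header frame: {frame!r}")
    return name, value


def is_accepted(reply_frames):
    """Interpret a broker reply; assumes the command is the first frame."""
    command = reply_frames[0].decode("ascii")
    if command not in ("accept", "reject"):
        raise ValueError(f"unexpected reply: {command!r}")
    return command == "accept"


# Hypothetical header names and values, for illustration only.
header_frames = encode_headers({"env": "c", "threads": 2})
```

Because every header travels in its own frame, the receiving side can decode
each one independently, and the number of headers does not need to be announced
in advance.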