Communication
This section gives a detailed overview of communication in the ReCodEx solution. The basic concept is captured in the following image:
Red connections go through ZeroMQ sockets, blue ones through WebSockets and green ones through HTTP. All ZeroMQ messages are sent as multipart messages with one string (command, option) per part and no empty frames (unless explicitly specified otherwise).
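The framing rule above can be illustrated with a minimal sketch. The helper names to_frames and from_frames are hypothetical and not part of ReCodEx, and UTF-8 encoding of the strings is an assumption. With pyzmq, a list like this could be handed to Socket.send_multipart and read back with Socket.recv_multipart.

```python
def to_frames(command, *args):
    """Encode a command and its arguments as one byte frame per string."""
    parts = [command, *args]
    # The protocol forbids empty frames unless stated otherwise.
    if any(part == "" for part in parts):
        raise ValueError("empty frames are not allowed")
    # UTF-8 is an assumption of this sketch, not stated by the protocol.
    return [part.encode("utf-8") for part in parts]


def from_frames(frames):
    """Decode received frames back into (command, list of arguments)."""
    if not frames:
        raise ValueError("message has no frames")
    parts = [frame.decode("utf-8") for frame in frames]
    return parts[0], parts[1:]
```

Keeping every argument in its own frame means the receiver never has to split strings on a delimiter.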
Internal worker communication
Communication between the two worker threads is split into two separate parts, each with its own dedicated connection line. These internal lines are realized by ZeroMQ inproc PAIR sockets. In this section, the worker thread that communicates with the broker is called the listening thread, and the other one, which evaluates incoming jobs, is called the job thread. The listening thread acts as the server in both cases (it calls the bind() method), but with ZeroMQ this matters little: a connect() call in a client can precede the server's bind() call with no issue.
Main communication
Main communication runs over inproc://jobs sockets. The listening thread waits for any messages (from the broker, jobs and progress sockets) and handles incoming requests accordingly.
Commands from listening thread to job thread:
- eval - evaluate a job. Requires 3 arguments:
  - job_id - identifier of this job
  - job_url - URI location of archive with job configuration and submitted source code
  - result_url - remote URI where results will be pushed to
Commands from job thread to listening thread:
- done - notification of a finished job. Requires 2 arguments:
  - job_id - identifier of finished job
  - result - response result, one of "OK" and "ERR"
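The eval and done commands on this line can be sketched as plain frame lists. This is only an illustration: make_eval and parse_done are hypothetical helper names, UTF-8 encoding is an assumption, and the URLs in the usage comment are placeholders.

```python
def make_eval(job_id, job_url, result_url):
    """Build the eval message: the command plus its 3 arguments."""
    parts = ["eval", job_id, job_url, result_url]
    return [part.encode("utf-8") for part in parts]


def parse_done(frames):
    """Validate a done message and return (job_id, result)."""
    if len(frames) != 3 or frames[0] != b"done":
        raise ValueError("expected: done, job_id, result")
    job_id = frames[1].decode("utf-8")
    result = frames[2].decode("utf-8")
    if result not in ("OK", "ERR"):
        raise ValueError("result must be OK or ERR")
    return job_id, result


# Usage with placeholder values (hypothetical URLs):
# make_eval("42", "http://example.com/job.zip", "http://example.com/results")
```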
Progress callback
Progress messages are sent through inproc://progress sockets. This is one-way communication only, from the job thread to the listening thread.
Commands:
- progress - notice about evaluation progress. Requires 2 or 4 arguments:
  - job_id - identifier of current job
  - state - what is happening now. One of "DOWNLOADED" (submission successfully fetched), "UPLOADED" (results are uploaded to fileserver), "STARTED" (evaluation started), "ENDED" (evaluation is finished) and "TASK" (task state changed - see below)
  - task_id - only present for "TASK" state - identifier of task in current job
  - task_state - only present for "TASK" state - result of task evaluation. One of "COMPLETED" and "FAILED".
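The "2 or 4 arguments" rule can be checked on the receiving side roughly as follows. This is a sketch under assumptions: parse_progress is a hypothetical name, UTF-8 encoding is assumed, and returning a dictionary is merely a convenient choice for the example.

```python
STATES = {"DOWNLOADED", "UPLOADED", "STARTED", "ENDED", "TASK"}
TASK_STATES = {"COMPLETED", "FAILED"}


def parse_progress(frames):
    """Validate a progress message and return its fields as a dict."""
    parts = [frame.decode("utf-8") for frame in frames]
    if not parts or parts[0] != "progress":
        raise ValueError("not a progress message")
    args = parts[1:]
    if len(args) not in (2, 4):
        raise ValueError("progress requires 2 or 4 arguments")
    job_id, state = args[0], args[1]
    if state not in STATES:
        raise ValueError("unknown state: " + state)
    if state == "TASK":
        # task_id and task_state are only present for the TASK state.
        if len(args) != 4:
            raise ValueError("TASK state requires task_id and task_state")
        task_id, task_state = args[2], args[3]
        if task_state not in TASK_STATES:
            raise ValueError("unknown task state: " + task_state)
        return {"job_id": job_id, "state": state,
                "task_id": task_id, "task_state": task_state}
    if len(args) != 2:
        raise ValueError("only the TASK state carries task arguments")
    return {"job_id": job_id, "state": state}
```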
Broker - Worker communication
The broker acts as the server when communicating with a worker. The IP address and port are configurable; the protocol is TCP. The worker socket is of DEALER type, the broker socket of ROUTER type.
Commands from broker to worker:
- eval - evaluate a job. See eval command in Communication#main-communication.
- intro - introduce yourself to the broker (with init command)
- pong - reply to ping command, no arguments
Commands from worker to broker:
- init - introduce yourself to the broker. Useful on startup or after reestablishing lost connection. Requires at least two arguments:
  - hwgroup - hardware group of this worker
  - header - additional header describing worker capabilities. Format must be header_name=value, every header shall be in a separate message frame. There is no maximum limit on number of headers.
- done - job evaluation finished, see done command in Communication#main-communication.
- progress - evaluation progress report, see progress command in Communication#progress-callback
- ping - tell broker I'm alive, no arguments
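The init message, with its one-header-per-frame layout, could be built and parsed along these lines. This is a sketch, not the worker's actual code: make_init and parse_init are hypothetical names, UTF-8 encoding is assumed, and the header env=c in the usage comment is an invented example.

```python
def make_init(hwgroup, headers):
    """Build init: the command, the hwgroup, then one frame per header."""
    parts = ["init", hwgroup]
    parts += [name + "=" + value for name, value in headers.items()]
    return [part.encode("utf-8") for part in parts]


def parse_init(frames):
    """Return (hwgroup, headers dict) from an init message."""
    parts = [frame.decode("utf-8") for frame in frames]
    # "At least two arguments" means the hwgroup plus at least one header.
    if len(parts) < 3 or parts[0] != "init":
        raise ValueError("expected: init, hwgroup, header...")
    headers = {}
    for item in parts[2:]:
        name, sep, value = item.partition("=")
        if not sep or not name:
            raise ValueError("header must be header_name=value: " + item)
        headers[name] = value
    return parts[1], headers


# Usage with a hypothetical header:
# make_init("group_1", {"env": "c"}) -> [b"init", b"group_1", b"env=c"]
```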
Worker - File Server communication
Broker - Monitor communication
Broker - Frontend communication
The communication between the frontend and the workers is mediated by a broker that passes jobs to workers capable of processing them.
Assignment evaluation request
The frontend must send a multipart message that contains the following frames:
- The eval command
- The job id (in ASCII representation; this avoids endianness issues and also supports alphabetic ids)
- A frame for each header (e.g. hwgroup=group_1)
- A URL of the archive that contains the submitted files and isoeval configuration
- A URL where the worker should store the result of the evaluation
If the broker is capable of routing the request to a worker, it responds with accept. Otherwise (for example when the requirements specified by the headers cannot be met), it responds with reject.
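The request layout and the accept/reject decision might look roughly like this. Everything here is a sketch under assumptions: the function names are hypothetical, UTF-8 encoding is assumed, and the matching rule (a worker qualifies only if every requested header equals the worker's value) is a guess for illustration, not the broker's documented behaviour.

```python
def make_eval_request(job_id, headers, archive_url, result_url):
    """Frames: eval, job id, one frame per header, archive URL, result URL."""
    parts = ["eval", str(job_id)]
    parts += [name + "=" + value for name, value in headers.items()]
    parts += [archive_url, result_url]
    return [part.encode("utf-8") for part in parts]


def parse_eval_request(frames):
    """Return (job_id, headers, archive_url, result_url)."""
    parts = [frame.decode("utf-8") for frame in frames]
    if len(parts) < 4 or parts[0] != "eval":
        raise ValueError("malformed eval request")
    headers = {}
    # The URLs are always the last two frames, so headers sit in between.
    for item in parts[2:-2]:
        name, sep, value = item.partition("=")
        if not sep or not name:
            raise ValueError("header must be header_name=value: " + item)
        headers[name] = value
    return parts[1], headers, parts[-2], parts[-1]


def route(request_headers, workers):
    """Reply accept if any worker matches all headers (assumed rule)."""
    for worker_headers in workers:
        if all(worker_headers.get(name) == value
               for name, value in request_headers.items()):
            return b"accept"
    return b"reject"
```

Because the two URLs are located by position rather than by content, a URL containing an equals sign is not mistaken for a header.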
Note that we will need to store the job ID and the assignment configuration somewhere close to the submitted files so it's possible to check how a submission was evaluated. The job ID will likely be a part of the submission's path. The configuration could be linked there under some well-known name.
Notifying the frontend about evaluation progress
The script that requested the evaluation will have exited by the time a worker processes the request. This issue remains to be resolved.
File Server - Frontend communication
Monitor - Browser communication
Frontend - Browser communication
TODO: