master
Teyras 8 years ago
parent 30d440fa76
commit 6b0937028c

@ -11,7 +11,11 @@
## Communication
Detailed communication inside the ReCodEx system is captured in the following
image and described in the sections below. Red connections use ZeroMQ sockets,
blue ones use WebSockets and green ones use HTTP(S). All ZeroMQ messages are
sent as multipart with one string (command, option) per part, with no empty
frames (unless explicitly specified otherwise).
![Communication schema](https://github.com/ReCodEx/wiki/raw/master/images/Backend_Connections.png)
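
The multipart convention described above can be illustrated with a minimal Python sketch. It is not taken from the ReCodEx sources: the helper name `build_message` and the option strings are hypothetical, and the check is stricter than the real protocol, which allows empty frames where explicitly specified.

```python
def build_message(command, *options):
    """Build the frames of one multipart message: one string per part.

    Hypothetical helper for illustration only. It rejects empty frames,
    although the protocol permits them where explicitly specified.
    """
    parts = [command, *options]
    for part in parts:
        if not isinstance(part, str) or part == "":
            raise ValueError("each frame must be a non-empty string")
    # ZeroMQ frames are byte strings, so every part is encoded separately.
    return [part.encode("utf-8") for part in parts]


# The option values below are made up for the example.
frames = build_message("init", "group_1", "threads=4")
```

With the pyzmq binding, a list like `frames` could then be passed to `socket.send_multipart(frames)`.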
@ -88,26 +92,40 @@ interval is configurable on both sides -- future releases might let the worker
send its ping interval with the **init** command). Upon receiving a **ping**
command, the broker responds with **pong**.
Whenever an expected heartbeating message doesn't arrive, a counter called
_liveness_ is decreased. When this counter drops to zero, the other side is
considered disconnected. When a message arrives, the liveness counter is set
back to its maximum value, which is configurable for both sides.

When the broker decides that a worker has disconnected, it tries to reschedule
that worker's jobs to other workers.

If a worker thinks the broker has crashed, it tries to reconnect periodically,
with a bounded, exponentially increasing delay.
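
The liveness counter can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the broker or worker implementation; the class name `Liveness` and its method names are assumptions.

```python
class Liveness:
    """Tracks whether the other side is still considered connected.

    Hypothetical sketch; the names are not taken from the ReCodEx code base.
    """

    def __init__(self, max_liveness):
        if max_liveness < 1:
            raise ValueError("max_liveness must be at least 1")
        self.max_liveness = max_liveness
        self.value = max_liveness

    def message_arrived(self):
        # Any message from the peer resets the counter to its maximum.
        self.value = self.max_liveness

    def heartbeat_missed(self):
        # An expected heartbeat did not arrive in time.
        if self.value > 0:
            self.value -= 1

    @property
    def disconnected(self):
        # The peer is considered disconnected once the counter reaches zero.
        return self.value == 0


peer = Liveness(max_liveness=3)  # the maximum is configurable; 3 is arbitrary
peer.heartbeat_missed()
peer.heartbeat_missed()
peer.message_arrived()           # counter is back at 3
```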
This protocol proved very robust in real-world testing. The whole backend is
therefore reliable and can survive short-term connection issues without
problems. Also, the increasing delay of ping messages means the network is not
flooded when there are problems. We have experienced no issues since we started
using this protocol.
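
The bounded, exponentially increasing reconnect delay used by the worker can be sketched as a generator. The initial delay, the growth factor and the upper bound below are assumed values chosen for the example, not the worker's actual configuration.

```python
from itertools import islice


def reconnect_delays(initial=1.0, factor=2.0, maximum=32.0):
    """Yield reconnect delays in seconds: exponential growth up to a bound.

    All three parameters are hypothetical defaults for illustration.
    """
    delay = initial
    while True:
        yield min(delay, maximum)
        # Grow the delay, but never beyond the configured upper bound.
        delay = min(delay * factor, maximum)


first_delays = list(islice(reconnect_delays(), 7))
# first_delays == [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 32.0]
```

A reconnect loop would wait for the next value from the generator between attempts and create a fresh generator after a successful connection.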
### Worker - File Server communication
The worker communicates with the file server only from the _execution thread_.
The supported protocol is HTTP, optionally with SSL encryption
(**recommended**). If supported by the server and by the libcurl version in
use, the HTTP/2 standard is also available. The file server should be set up to
require basic HTTP authentication, and the worker can send the corresponding
credentials with each request.
#### Worker side
Workers communicate with the file server in both directions -- they download
students' submissions and then upload evaluation results. Internally, the
worker uses the libcurl C library with a very similar setup for both
operations. In both cases it can verify the HTTPS certificate (on Linux against
the system certificate list, on Windows against one downloaded from the CURL
website during installation), support basic HTTP authentication, offer HTTP/2
with fallback to HTTP/1.1, and fail on error (returned HTTP status code is
>= 400). The worker has a list of credentials for all available file servers
in its config file.
- download file -- standard HTTP GET request to given URL expecting file content as response
- upload file -- standard HTTP PUT request to given URL with file data as body -- same as command line tool `curl` with option `--upload-file`
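
The two request shapes above can be sketched with Python's standard `urllib.request` module. This is an illustration only: the worker itself uses libcurl, and the URL, credentials and helper names here are hypothetical. The sketch only builds the requests and does not send them.

```python
import base64
import urllib.request


def _basic_auth(username, password):
    # Basic HTTP authentication: base64 of "username:password".
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def download_request(url, username, password):
    """GET request; the response body is expected to be the file content."""
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", _basic_auth(username, password))
    return request


def upload_request(url, data, username, password):
    """PUT request with the file data as the request body."""
    request = urllib.request.Request(url, data=data, method="PUT")
    request.add_header("Authorization", _basic_auth(username, password))
    return request


def is_failure(status_code):
    # Mirrors the rule above: any status code >= 400 counts as an error.
    return status_code >= 400


# Hypothetical file server URL and credentials.
url = "https://files.example.org/results/job_1.zip"
put = upload_request(url, b"archive bytes", "user", "pass")
get = download_request(url, "user", "pass")
```

Passing such a request to `urllib.request.urlopen` would perform the transfer; `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, which is similar in spirit to the fail-on-error behaviour described above.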
