Monitor implementation

master
Petr Stefan 8 years ago
parent 69e3a17429
commit be4790a9da

@ -1,30 +0,0 @@
# Monitor
## Description
Monitor is the part of the ReCodEx solution that reports progress of job evaluation back to the user in real time. It gets progress notifications from the broker and sends them through WebSockets to clients' browsers. For now, it is meant as an optional part of the whole solution, but for the full experience it is recommended to use one.
One monitor is needed per broker, that is, one per separate ReCodEx instance. The monitor also has to be publicly visible (it must have a public IP address or sit behind a public proxy server) and needs a connection to the broker. If the web application uses HTTPS, a proxy must be placed in front of the monitor to provide encryption over WebSockets. Otherwise, users' browsers will block the unencrypted connection and will not show the progress.
## Architecture
Monitor is written in Python; tested versions are 3.4 and 3.5. This language was chosen because it is already among the project requirements (fileserver) and because there are great libraries for ZeroMQ, WebSockets and asynchronous operations. Asynchronous processing saves system resources and allows a high message throughput. Coding in Python was also pretty simple, which saved us time for improving the other parts of ReCodEx.
### Message flow
![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
Monitor runs in 2 threads. _Thread 1_ is the main thread, which initializes all components (the logger, for example), starts the other thread and runs the ZeroMQ part of the application. This thread receives and parses incoming messages from the broker and forwards them to the sending logic in _thread 2_.
_Thread 2_ is responsible for managing all WebSocket connections asynchronously. The whole thread is one big _asyncio_ event loop through which all actions are processed. Custom data types in Python are not thread-safe by themselves, so all events from other threads (actually only the `send_message` method invocation) must be executed within the event loop (via the loop's `call_soon_threadsafe` method). Please note that most Python interpreters use a [Global Interpreter Lock](https://wiki.python.org/moin/GlobalInterpreterLock), so there is actually no parallelism from the performance point of view, but proper synchronization is still required!
### Handling of incoming messages
An incoming ZeroMQ progress message is received and converted to JSON format (the same as our WebSocket communication format). The JSON string is then passed to _thread 2_ for asynchronous sending. Each message carries the identifier of the channel it should be sent to.
There can be multiple receivers on one channel id. Each one has a separate _asyncio.Queue_ instance to which new messages are added. In addition, there is one list of all messages per channel. If a client connects a bit later than the point when the monitor starts to receive messages, it still receives all messages from the beginning. Messages are kept for 5 minutes after the last progress command (normally FINISHED) is received and are then permanently deleted.
Messages from a client's queue are sent through the corresponding WebSocket connection via the main event loop as soon as possible. This approach, with a separate queue per connection, is easy to implement and guarantees reliable, ordered message delivery.

@ -1438,26 +1438,7 @@ callback.
There are several possibilities how to write the component. Notably, the
considered options were the already used languages C++, PHP, JavaScript and
Python. In the end, the Python language was chosen for its simplicity, its great
support for all used technologies, and because there are available Python
developers in our team. Then, the responsibility of this component was
determined. The concept of the message flow is shown in the following picture.
![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
The input channel of the monitor uses ZeroMQ, the main messaging framework used
by the backend. This decision keeps the monitor consistent with the
communication protocol and related libraries that the rest of the backend
already uses. The output channel is WebSocket, a protocol for sending messages
to web browsers. In Python, there are several WebSocket libraries. The most
popular one is `websockets` in cooperation with `asyncio`. This combination is
easy to use and well documented, so it is used in the monitor component too.
For ZeroMQ, there is the `zmq` library with bindings to the framework core in
C++.
Incoming messages are cached for a short period of time. Early testing showed
that the backend can start sending progress messages before the client connects
to the monitor. To solve this, messages for each job are held for 5 minutes
after reception of the last message. At the time of connection, the client gets
all messages received so far, with no message loss.
### API server
@ -2578,7 +2559,6 @@ used.
# Implementation
## Broker
@todo: gets stuff done, single point of failure and center point of ReCodEx universe
@ -2668,7 +2648,54 @@ calling and return values.
## Monitor
@todo: not necessary component which can be omitted, proxy-like service
Monitor is an optional part of the ReCodEx solution for reporting progress of
job evaluation back to users in real time. It is written in Python; tested
versions are 3.4 and 3.5. Just one monitor instance is required per broker. The
monitor also has to be publicly visible (it must have a public IP address or sit
behind a public proxy server) and needs a connection to the broker. If the web
application uses HTTPS, a proxy must be placed in front of the monitor to
provide encryption over WebSockets. Otherwise, users' browsers will block the
unencrypted connection and will not show the progress.
### Message flow
![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
Monitor runs in 2 threads. _Thread 1_ is the main thread, which initializes all
components (the logger, for example), starts the other thread and runs the
ZeroMQ part of the application. This thread receives and parses incoming
messages from the broker and forwards them to the sending logic in _thread 2_.
_Thread 2_ is responsible for managing all WebSocket connections
asynchronously. The whole thread is one big _asyncio_ event loop through which
all actions are processed. Custom data types in Python are not thread-safe by
themselves, so all events from other threads (actually only the `send_message`
method invocation) must be executed within the event loop (via the loop's
`call_soon_threadsafe` method). Please note that most Python interpreters use a
[Global Interpreter Lock](https://wiki.python.org/moin/GlobalInterpreterLock),
so there is actually no parallelism from the performance point of view, but
proper synchronization is still required.
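The hand-off between the two threads can be sketched as follows. This is a minimal illustration rather than the actual monitor code: the `ConnectionManager` class, the thread name and the `demo` function are hypothetical, and only the `send_message` name and the use of `call_soon_threadsafe` come from the description above.

```python
import asyncio
import threading


class ConnectionManager:
    """Hypothetical stand-in for the WebSocket side; it only records messages."""

    def __init__(self):
        self.delivered = []

    def send_message(self, message):
        # Executed inside the event loop thread, so no extra locking is needed.
        self.delivered.append((threading.current_thread().name, message))


def run_loop(loop):
    # Body of "thread 2": the whole thread is one asyncio event loop.
    asyncio.set_event_loop(loop)
    loop.run_forever()


def demo():
    loop = asyncio.new_event_loop()
    manager = ConnectionManager()
    worker = threading.Thread(target=run_loop, args=(loop,), name="thread-2")
    worker.start()

    # "Thread 1" never touches the manager directly; it only asks the loop
    # to run send_message on its own thread.
    for text in ("first", "second"):
        loop.call_soon_threadsafe(manager.send_message, text)

    # Stopping is scheduled the same way, after the two sends.
    loop.call_soon_threadsafe(loop.stop)
    worker.join()
    loop.close()
    return manager.delivered
```

Calling `demo()` returns the recorded messages together with the name of the thread that handled them, which makes it visible that the callbacks ran in the event loop thread and in the order they were scheduled.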
### Handling of incoming messages
An incoming ZeroMQ progress message is received and converted to JSON format
(the same as our WebSocket communication format). The JSON string is then passed
to _thread 2_ for asynchronous sending. Each message carries the identifier of
the channel it should be sent to.
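As an illustration of this conversion step, the sketch below turns a list of byte frames (such as the list returned by `recv_multipart()` on a `zmq` socket) into a channel id and a JSON string. The frame layout `[channel_id, command, key, value, ...]`, the field names and the `frames_to_json` helper are assumptions made for the example, not the actual ReCodEx protocol.

```python
import json


def frames_to_json(frames):
    """Convert byte frames to (channel_id, JSON string).

    Assumed (hypothetical) layout: [channel_id, command, key, value, ...].
    """
    if len(frames) < 2:
        raise ValueError("expected at least a channel id and a command")
    decoded = [frame.decode("utf-8") for frame in frames]
    channel_id, command, *rest = decoded
    if len(rest) % 2 != 0:
        raise ValueError("key/value frames must come in pairs")
    payload = {"command": command}
    # Pair up the remaining frames as key -> value.
    payload.update(zip(rest[0::2], rest[1::2]))
    return channel_id, json.dumps(payload, sort_keys=True)


channel, text = frames_to_json(
    [b"42", b"TASK", b"task_id", b"compile", b"task_state", b"COMPLETED"]
)
```

Here `channel` identifies where the message should go and `text` is the string that would be handed to _thread 2_.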
There can be multiple receivers on one channel id. Each one has a separate
_asyncio.Queue_ instance to which new messages are added. In addition, there is
one list of all messages per channel. If a client connects a bit later than the
point when the monitor starts to receive messages, it still receives all
messages from the beginning. Messages are kept for 5 minutes after the last
progress command (normally FINISHED) is received and are then permanently
deleted. This caching mechanism was implemented because early testing showed
that the first few messages were missed quite often.
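One possible shape of this bookkeeping is sketched below. The `ChannelRegistry` class, its method names and the `ttl` parameter are hypothetical, and the sketch assumes a recent Python version (it uses `asyncio.get_running_loop`, which is newer than the tested versions mentioned above). Restarting the countdown on every message is also an assumption about how "after the last message" could be implemented.

```python
import asyncio


class ChannelRegistry:
    """Hypothetical per-channel history plus one queue per connected client."""

    def __init__(self, ttl=300.0):
        self.ttl = ttl        # seconds to keep history after the last message
        self.history = {}     # channel id -> list of all messages
        self.queues = {}      # channel id -> list of per-client queues
        self.timers = {}      # channel id -> pending expiry timer

    def subscribe(self, channel_id):
        queue = asyncio.Queue()
        # A late client first gets everything already seen on the channel.
        for message in self.history.get(channel_id, []):
            queue.put_nowait(message)
        self.queues.setdefault(channel_id, []).append(queue)
        return queue

    def publish(self, channel_id, message):
        self.history.setdefault(channel_id, []).append(message)
        for queue in self.queues.get(channel_id, []):
            queue.put_nowait(message)
        # Restart the countdown so history expires ttl seconds after the
        # most recent message.
        timer = self.timers.pop(channel_id, None)
        if timer is not None:
            timer.cancel()
        loop = asyncio.get_running_loop()
        self.timers[channel_id] = loop.call_later(self.ttl, self._expire, channel_id)

    def _expire(self, channel_id):
        self.history.pop(channel_id, None)
        self.timers.pop(channel_id, None)


async def demo():
    registry = ChannelRegistry(ttl=0.05)   # short ttl only for the demo
    early = registry.subscribe("job-1")
    registry.publish("job-1", "TASK 1")
    late = registry.subscribe("job-1")     # connects after the first message
    registry.publish("job-1", "FINISHED")
    received = (
        [early.get_nowait(), early.get_nowait()],
        [late.get_nowait(), late.get_nowait()],
    )
    await asyncio.sleep(0.2)               # let the history expire
    return received, "job-1" in registry.history
```

Running `asyncio.run(demo())` shows that the late subscriber sees the same two messages as the early one and that the history is gone once the timeout has passed.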
Messages from a client's queue are sent through the corresponding WebSocket
connection via the main event loop as soon as possible. This approach, with a
separate queue per connection, is easy to implement and guarantees reliable,
ordered message delivery.
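The per-connection sending loop might look roughly like this. `FakeWebSocket` is a hypothetical stand-in for a real connection object (in the `websockets` library, `send` is likewise a coroutine), and the `None` sentinel that ends the loop is an assumption made for the example.

```python
import asyncio


class FakeWebSocket:
    """Hypothetical connection object that just records what was sent."""

    def __init__(self):
        self.sent = []

    async def send(self, message):
        self.sent.append(message)


async def forward(queue, websocket):
    # One such coroutine per client: messages leave in the order they were queued.
    while True:
        message = await queue.get()
        if message is None:   # assumed sentinel meaning "no more messages"
            break
        await websocket.send(message)


async def demo():
    queue = asyncio.Queue()
    socket = FakeWebSocket()
    task = asyncio.ensure_future(forward(queue, socket))
    for item in ('{"command": "TASK"}', '{"command": "FINISHED"}', None):
        queue.put_nowait(item)
    await task
    return socket.sent
```

Because each client has its own queue and its own `forward` coroutine, a slow connection delays only its own messages.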
## Cleaner
