From be4790a9daf2c3ef4f45b77d2a57a5661028a9be Mon Sep 17 00:00:00 2001
From: Petr Stefan
Date: Sun, 22 Jan 2017 14:30:30 +0100
Subject: [PATCH] Monitor implementation

---
 Monitor.md        | 30 --------------------
 Rewritten-docs.md | 71 ++++++++++++++++++++++++++++++++---------------
 2 files changed, 49 insertions(+), 52 deletions(-)
 delete mode 100644 Monitor.md

diff --git a/Monitor.md b/Monitor.md
deleted file mode 100644
index 58f3e37..0000000
--- a/Monitor.md
+++ /dev/null
@@ -1,30 +0,0 @@
-# Monitor
-
-## Description
-
-Monitor is part of ReCodEx solution for reporting progress of job evaluation back to user in the real time. It gets progress notifications from broker and sends them through WebSockets to clients' browsers. For now, it is meant as an optional part of whole solution, but for full experince it is recommended to use one.
-
-Monitor is needed one per broker, that is one per separate ReCodEx instance. Also, monitor has to be publicly visible (has to have public IP address or be behind public proxy server) and also needs a connection to the broker. If the web application is using HTTPS, it is required to use a proxy for monitor to provide encryption over WebSockets. If this is not done, browsers of the users will block unencrypted connection and will not show the progress to the users.
-
-
-## Architecture
-
-Monitor is written in Python, tested versions are 3.4 and 3.5. This language was chosen because it is already in project requirements (fileserver) and there are great libraries for ZeroMQ, WebSockets and asynchronous operations. This library saves system resources and provides us great amount of processed messages. Also, coding in Python was pretty simple and saves us time for improving the other parts of ReCodEx.
-
-
-### Message flow
-
-![Message flow inside montior](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
-
-Monitor runs in 2 threads. _Thread 1_ is the main thread, which initializes all components (logger for example), starts the other thread and runs the ZeroMQ part of the application. This thread receives and parses incomming messages from broker and forwards them to _thread 2_ sending logic.
-
-_Thread 2_ is responsible for managing all of WebSocket connections asynchronously. Whole thread is one big _asyncio_ event loop through which all actions are processed. None of custom data types in Python are thread-safe, so all events from other threads (actually only `send_message` method invocation) must be called within the event loop (via `asyncio.loop.call_soon_threadsafe` function). Please note, that most of the Python interpreters use [Global Interpreter Lock](https://wiki.python.org/moin/GlobalInterpreterLock), so there is actualy no parallelism in the performance point of view, but proper synchronization is still required!
-
-### Handling of incomming messages
-
-Incomming ZeroMQ progress message is received and parsed to JSON format (same as our WebSocket communication format). JSON string is then passed to _thread 2_ for asynchronous sending. Each message has an identifier of channel where to send it to.
-
-There can be multiple receivers to one channel id. Each one has separate _asyncio.Queue_ instance where new messages are added. In addition to that, there is one list of all messages per channel. If a client connects a bit later than the point when monitor starts to receive messages, it will receive all messages from the beginning. Messages are stored 5 minutes after last progress command (normally FINISHED) is received, then are permanently deleted.
-
-Messages from client's queue are sent through corresponding WebSocket connection via main event loop as soon as possible. This approach with separate queue per connection is easy to implement and guarantees reliability and order of message delivery.
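The two-thread hand-off described in the text above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual monitor code: the names (`send_message`, `websocket_thread`, the `"job-42"` channel id) are made up, and a plain list stands in for real WebSocket delivery. The point it shows is that thread 1 never touches thread 2's state directly; it only schedules calls on thread 2's event loop via `call_soon_threadsafe`.

```python
import asyncio
import threading

# Event loop that will run in thread 2; created here so thread 1 can reach it.
loop = asyncio.new_event_loop()
received = []  # stand-in for actual WebSocket sends

def send_message(channel_id, message):
    # Always runs inside the event loop (thread 2), so shared state is safe.
    received.append((channel_id, message))

def websocket_thread():
    # Thread 2: one big asyncio event loop through which all actions pass.
    asyncio.set_event_loop(loop)
    loop.run_forever()

thread2 = threading.Thread(target=websocket_thread, daemon=True)
thread2.start()

# Thread 1 (main thread): pretend these messages just arrived over ZeroMQ and
# hand them to thread 2's sending logic through the event loop.
for i in range(3):
    loop.call_soon_threadsafe(send_message, "job-42", "progress %d" % i)

# Callbacks scheduled with call_soon_threadsafe run in FIFO order, so stopping
# the loop the same way guarantees all three messages are delivered first.
loop.call_soon_threadsafe(loop.stop)
thread2.join()
loop.close()
```

The FIFO ordering of `call_soon_threadsafe` callbacks is also what lets this design preserve message order per channel without any explicit locking.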
-
diff --git a/Rewritten-docs.md b/Rewritten-docs.md
index 4080ac0..2e38170 100644
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -1438,26 +1438,7 @@ callback.
 There are several possibilities how to write the component. Notably,
 considered options were already used languages C++, PHP, JavaScript and
 Python. At the end, the Python language was chosen for its simplicity, great
 support for all used
-technologies and also there are free Python developers in out team. Then,
-responsibility of this component is determined. Concept of message flow is on
-following picture.
-
-![Message flow inside montior](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
-
-The message channel inputing the monitor uses ZeroMQ as main message framework
-used by backend. This decision keeps rest of backend aware of used
-communication protocol and related libraries. Output channel is WebSocket as a
-protocol for sending messages to web browsers. In Python, there are several
-WebSocket libraries. The most popular one is `websockets` in cooperation with
-`asyncio`. This combination is easy to use and well documented, so it is used in
-monitor component too. For ZeroMQ, there is `zmq` library with binding to
-framework core in C++.
-
-Incoming messages are cached for short period of time. Early testing shows,
-that backend can start sending progress messages sooner than client connects to
-the monitor. To solve this, messages for each job are hold 5 minutes after
-reception of last message. The client gets all already received messages at time
-of connection with no message loss.
+technologies and also there are free Python developers in our team.

 ### API server

@@ -2578,7 +2559,6 @@ used.

 # Implementation

-
 ## Broker

 @todo: gets stuff done, single point of failure and center point of ReCodEx
 universe

@@ -2668,7 +2648,54 @@ calling and return values.
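The five-minute retention rule described above (messages for each job are held for 5 minutes after the last message arrives, so a late-connecting client misses nothing) can be sketched like this. The sketch is illustrative rather than the actual ReCodEx code; the class and method names are made up, and the clock is injectable so the expiry can be exercised without waiting.

```python
import time

RETENTION = 5 * 60  # seconds a job's messages are kept after its last message

class MessageCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._messages = {}   # job id -> list of messages received so far
        self._last_seen = {}  # job id -> timestamp of the last message

    def add(self, job_id, message):
        self._messages.setdefault(job_id, []).append(message)
        self._last_seen[job_id] = self._clock()

    def get(self, job_id):
        # Everything a freshly connected client would be sent first.
        return list(self._messages.get(job_id, []))

    def cleanup(self):
        # Drop jobs whose last message is older than the retention window.
        now = self._clock()
        expired = [j for j, t in self._last_seen.items()
                   if now - t > RETENTION]
        for job_id in expired:
            del self._messages[job_id]
            del self._last_seen[job_id]

# Simulated time lets us demonstrate expiry deterministically.
fake_now = [0.0]
cache = MessageCache(clock=lambda: fake_now[0])
cache.add("job-1", "TASK")
cache.add("job-1", "FINISHED")
fake_now[0] = 301.0   # just over five minutes later
cache.cleanup()       # job-1 history is now gone
cache.add("job-2", "TASK")
```

In the real monitor the cleanup would be triggered periodically (or on message reception) rather than called by hand as above.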
 ## Monitor

-@todo: not necessary component which can be omitted, proxy-like service
+Monitor is an optional part of the ReCodEx solution for reporting the progress
+of job evaluation back to users in real time. It is written in Python (tested
+with versions 3.4 and 3.5). Just one monitor instance is required per broker.
+The monitor has to be publicly visible (it must have a public IP address or sit
+behind a public proxy server) and it also needs a connection to the broker. If
+the web application uses HTTPS, the monitor must be placed behind a proxy that
+provides encryption over WebSockets. If this is not done, the users' browsers
+will block the unencrypted connection and will not show the progress.
+
+### Message flow
+
+![Message flow inside monitor](https://raw.githubusercontent.com/ReCodEx/wiki/master/images/Monitor_arch.png)
+
+The monitor runs in two threads. _Thread 1_ is the main thread; it initializes
+all components (the logger, for example), starts the other thread and runs the
+ZeroMQ part of the application. This thread receives and parses incoming
+messages from the broker and forwards them to the sending logic of _thread 2_.
+
+_Thread 2_ is responsible for managing all WebSocket connections
+asynchronously. The whole thread is one big _asyncio_ event loop through which
+all actions are processed. None of the custom data types in Python are
+thread-safe, so all events from other threads (in practice only `send_message`
+method invocations) must be dispatched into the event loop (via the
+`asyncio.loop.call_soon_threadsafe` function). Please note that most Python
+interpreters use a [Global Interpreter
+Lock](https://wiki.python.org/moin/GlobalInterpreterLock), so there is actually
+no parallelism from a performance point of view, but proper synchronization is
+still required.
+
+### Handling of incoming messages
+
+An incoming ZeroMQ progress message is received and parsed into JSON (the same
+format used for our WebSocket communication). The JSON string is then passed to
+_thread 2_ for asynchronous sending. Each message carries an identifier of the
+channel it should be sent to.
+
+There can be multiple receivers for one channel id. Each receiver has a
+separate _asyncio.Queue_ instance to which new messages are added. In addition,
+there is one list of all messages per channel. If a client connects a bit later
+than the point when the monitor starts to receive messages, it still receives
+all messages from the beginning. Messages are kept for 5 minutes after the last
+progress command (normally FINISHED) is received and are then permanently
+deleted. This caching mechanism was implemented because early testing showed
+that the first couple of messages were missed quite often.
+
+Messages from a client's queue are sent through the corresponding WebSocket
+connection via the main event loop as soon as possible. This approach, with a
+separate queue per connection, is easy to implement and guarantees reliability
+and ordered message delivery.

 ## Cleaner
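The per-channel bookkeeping the Monitor section above describes (one list of all messages per channel, plus one `asyncio.Queue` per connected client, with the full history replayed to late subscribers) can be sketched as follows. The registry class and the example channel id are illustrative assumptions, not the actual monitor code, and the five-minute expiry is omitted for brevity.

```python
import asyncio

class ChannelRegistry:
    """One message history per channel; one asyncio.Queue per subscriber."""

    def __init__(self):
        self._history = {}      # channel id -> list of all messages so far
        self._subscribers = {}  # channel id -> list of asyncio.Queue

    def subscribe(self, channel):
        queue = asyncio.Queue()
        # Replay everything received so far, so a late client loses nothing.
        for message in self._history.get(channel, []):
            queue.put_nowait(message)
        self._subscribers.setdefault(channel, []).append(queue)
        return queue

    def publish(self, channel, message):
        # Remember the message and fan it out to every subscriber's queue.
        self._history.setdefault(channel, []).append(message)
        for queue in self._subscribers.get(channel, []):
            queue.put_nowait(message)

registry = ChannelRegistry()
registry.publish("job-42", '{"command": "TASK"}')     # before any client
late_queue = registry.subscribe("job-42")             # client connects late
registry.publish("job-42", '{"command": "FINISHED"}')
```

Because each connection drains its own queue inside the single event loop, per-connection delivery order matches publication order with no extra locking, which is the reliability property the text claims.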