From cd071f19743d4b7f1e017039bc512c4a037c6be0 Mon Sep 17 00:00:00 2001
From: Martin Polanka
Date: Tue, 10 Jan 2017 19:43:31 +0100
Subject: [PATCH] backend error reporting

---
 Rewritten-docs.md | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/Rewritten-docs.md b/Rewritten-docs.md
index 593f12d..44c347f 100644
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -1499,9 +1499,37 @@ within instance and how it is implemented and how it could be implemented
 
 #### Backend management
 
-Considering the fact that we have backend as a separate component which has no clue about administrators and uses only logging as some kind of failure reporting. It can be handy to provide this functionality to backend from frontend which manages users. The simplest solution would be again to have separate component with some sort of public interface. It can be for example REST or some other communication which backend can handle. Functionality of this kind of component is then quite easy. When request for report arrives from backend then type is inferred and if it is error which deserves attention of administrator then email is sent to him/her. There can also be errors which are not that important, was somehow solved by backend itself or are only informative, these do not have to be reported by email but only stored in persistent database for further consideration. On top of that separate component can be internal and not exposed to outside network. Disadvantage is that database layer which is used in some particular API instance cannot be used here because multiple instances of API can use one backend.
-
-@todo: rest api is used for report of backend state and errors, describe why
+The backend is a separate component which has no notion of administrators and
+uses only logging for failure reporting. It can therefore be handy to provide
+error reporting to the backend from the frontend, which manages users. The
+simplest solution would again be a separate component with some sort of public
+interface, for example REST or some other communication which the backend can
+handle. The functionality of such a component is then quite simple. When a
+report request arrives from the backend, its type is inferred, and if it is an
+error which deserves the attention of an administrator, an email is sent to
+them. There can also be errors which are not that important, were somehow
+resolved by the backend itself, or are only informative; these do not have to
+be reported by email but only stored in a persistent database for further
+consideration. On top of that, the separate component can be internal and not
+exposed to the outside network. The disadvantage is that the database layer
+used by a particular API instance cannot be used here, because multiple API
+instances can use one backend.
+
+Another solution, which was implemented in the end, is to integrate backend
+failure reporting into the API. The problem with the previous one is that if a
+job execution fails, the backend has to report the error to the particular API
+server from which the evaluation request came. This information is essential
+and has to be stored there, not in some general component and general error
+database. Obviously, if multiple API servers are connected to one backend, one
+of them has to be configured in the backend as the main one, which receives
+reports about general backend errors not connected to jobs. This solution was
+chosen because, as stated, job error reporting has to be implemented in the
+API anyway, and having a separate component only for general errors is not
+feasible. In the end, error reporting should be available under a different
+route which is secured by basic HTTP authentication, because basic
+authentication is easy enough to implement in low-level backend components.
+That also means this feature is visible and can be exploited, but from our
+point of view it seems an appropriate compromise in favor of simplicity.
 
 @todo: where is stored which workers can be used by supervisors and which runtimes are available, describe possibilities and why is not implemented automatic solution