From 7f6ccb4799834982d01ed79fdc30518a79886752 Mon Sep 17 00:00:00 2001 From: Simon Rozsival Date: Fri, 20 Jan 2017 11:03:42 +0100 Subject: [PATCH] API analysis -> implementation --- Rewritten-docs.md | 276 ++++++++++++++++++++-------------------------- 1 file changed, 118 insertions(+), 158 deletions(-) diff --git a/Rewritten-docs.md b/Rewritten-docs.md index 872bac4..d55bf1d 100644 --- a/Rewritten-docs.md +++ b/Rewritten-docs.md @@ -1526,26 +1526,6 @@ framework is used. The Presenters are used to group the logic of the individual API endpoints. The routing mechanism is modified to distinguish the actions by both the URL and the HTTP method of the request. -#### Request handling - -A typical scenario for handling an API request is matching the HTTP request with -a corresponding handler routine which creates a response object, that is then -sent back to the client, encoded with JSON. The `Nette\Application` package can -be used to achieve this with Nette, although it is meant to be used mainly in -MVP applications. - -Matching HTTP requests with handlers can be done using standard Nette URL -routing -- we will create a Nette route for each API endpoint. Using the routing -mechanism from Nette logically leads to implementing handler routines as Nette -Presenter actions. Each presenter should serve logically related endpoints. - -The last step is encoding the response as JSON. In `Nette\Application`, HTTP -responses are returned using the `Presenter::sendResponse()` method. We decided -to write a method that calls `sendResponse` internally and takes care of the -encoding. This method has to be called in every presenter action. An alternative -approach would be using the internal payload object of the presenter, which is -more convenient, but provides us with less control. 
- #### Authentication To make certain data and actions acessible only for some specific users, there @@ -1582,138 +1562,6 @@ including generating the signature and signature verification is done through a widely used third-party library which lowers the risk of having a bug in the implementation of this critical security feature. -#### Forgotten password - -With authentication and some sort of dealing with passwords is related a problem -with forgotten credentials, especially passwords. There has to be some kind of -mechanism to retrieve a new password or change the old one. - -First, there are absolutely not secure and recommendable ways how to handle -that, for example sending the old password through email. A better, but still -not secure solution is to generate a new one and again send it through email. - -Mentioned solution was provided in CodEx, users had to write an email to -administrator, who generated a new password and sent it back to the sender. This -simple solution could be also automated, but administrator had quite a big -control over whole process. This might come in handy if there should be some -additional checkups, but on the other hand it can be quite time consuming. - -Probably the best solution which is often used and is fairly secure follows. Let -us consider only case in which all users have to fill their email addresses into -the system and these addresses are safely in the hands of the right users. - -When user finds out that he/she does not remember a password, he/she requests a -password reset and fill in his/her unique identifier; it might be email or -unique nickname. Based on matched user account the system generates unique -access token and sends it to user via email address. This token should be time -limited and usable only once, so it cannot be misused. User then takes the token -or URL address which is provided in the email and go to the system's appropriate -section, where new password can be set. 
After that user can sign in with his/her -new password. - -As previously stated, this solution is quite safe and user can handle it on its -own, so administrator does not have to worry about it. That is the main reason -why this approach was chosen to be used. - -#### Uploading files - -There are two cases when users need to upload files using the API -- submitting -solutions to an assignment and creating a new exercise. In both of these cases, -the final destination of the files is the fileserver. However, the fileserver is -not publicly accessible, so the files have to be uploaded through the API. - -The files can be either forwarded to the fileserver directly, without any -interference from the API server, or stored and forwarded later. We chose the -second approach, which is harder to implement, but more convenient -- it lets -exercise authors double-check what they upload to the fileserver and solutions -to assignments can be uploaded in a single request, which makes it easy for the -fileserver to create an archive of the solution files. - -#### Permissions - -In a system storing user data has to be implemented some kind of permission -checking. Previous chapters implies, that each user has to have a role, which -corresponds to his/her privileges. Our research showed, that three roles are -sufficient -- student, supervisor and administrator. The user role has to be -checked with every request. The good points is, that roles nicely match with -granularity of API endpoints, so the permission checking can be done at the -beginning of each request. That is implemented using PHP annotations, which -allows to specify allowed user roles for each request with very little of code, -but all the business logic is the same, together in one place. - -However, roles cannot cover all cases. For example, if user is a supervisor, it -relates only to groups, where he/she is a supervisor. But using only roles -allows him/her to act as supervisor in all groups in the system. 
Unfortunately, -this cannot be easily fixed using some annotations, because there are many -different cases when this problem occurs. To fix that, some additional checks -can be performed at the beginning of request processing. Usually it is only one -or two simple conditions. - -With this two concepts together it is possible to easily cover all cases of -permission checking with quite a small amount of code. - -#### Solution loading - -When a solution evaluation on the backend is finished, the results are saved to -the fileserver and the API is notified by the broker. Some further steps needs -to be done at that moment before the results can be presented to the users. -Some of these steps are parsing of the results, calculation of the final score, -or saving the structured data into the database. There are two main -possibilities when to process the results: - -- immediately after the API server is notified by the backend -- when a user requests the results for the first time - -These options are almost equal, none of them provides any kind of a big -advantage. Loading solutions immediately is better, because fetching results -by the client for the first time can be a bit faster as the results are already -processed. On the other hand, processing the results on demand can save some of -the resources when the solution results are not important (e.g., the student -finds a bug in his solution before the submission has been evaluated). - -We decided for the lazy loading at the time when the results are requested for -the first time. However, the concept of asynchronous jobs is then introduced. -This type of job is useful for batch submitting of jobs, for example re-running -jobs which failed on a worker hardware issue. These jobs are typically submitted -by different user than the author (an administrator for example), so the -original authors should be notified. 
In this case it is more reasonable to load -the results immediately and optionally send them a notification via an email. -This is exactly what we do. - -It seems with the benefit of hindsight that immediate loading of all jobs could -simplify the code and it has no major drawbacks. In the next version of ReCodEx -we will re-evaluate this decision. - -#### Communication with the backend - -##### Backend failure reporting - -The backend is a separate component which does not communicate with the -administrators directly. When it encounters an error it stores it in a log file. -It would be handy to inform the administrator directly at this moment so he can -fix the cause of the error as soon as possible. The backend does not have any -mechanism for notifying users using for example an email. The API server on the -other hand has email sending implemented and it can easily forward any messages -to the administrator. A secured communication protocol between the backend and -the frontend already exists (it is used for the reporting of a finished job -processing) and it is easy to add another endpoint for bug reporting. - -When a request for sending a report arrives from the backend then the type of -the report is inferred and if it is an error which deserves attention of the -administrator then an email is sent to him/her. There can also be errors which -are not that important (e.g., it was somehow solved by the backend itself or it -is only informative, then these do not have to be reported through an email but -can only be stored in the persistent database for further consideration. - -On top of that the separate backend component does not have to be exposed to the -outside network at all. - -If a job processing fails then the backend informs the API server which -initiated processing of the job. 
If an error which is not related to -job-processing occurs then the backend must communicate with a given API server -which is configured by the administrator while the other API servers which are -using the same backend are not informed. - ##### Backend state monitoring The next thing related to communication with the backend is monitoring its @@ -2791,6 +2639,112 @@ grouping of entities and how they are related: - submission + solution + reference solution + solution evaluation - comment threads + comments +#### Request handling + +A typical scenario for handling an API request is matching the HTTP request with +a corresponding handler routine which creates a response object, that is then +sent back to the client, encoded with JSON. The `Nette\Application` package can +be used to achieve this with Nette, although it is meant to be used mainly in +MVP applications. + +Matching HTTP requests with handlers can be done using standard Nette URL +routing -- we will create a Nette route for each API endpoint. Using the routing +mechanism from Nette logically leads to implementing handler routines as Nette +Presenter actions. Each presenter should serve logically related endpoints. + +The last step is encoding the response as JSON. In `Nette\Application`, HTTP +responses are returned using the `Presenter::sendResponse()` method. We decided +to write a method that calls `sendResponse` internally and takes care of the +encoding. This method has to be called in every presenter action. An alternative +approach would be using the internal payload object of the presenter, which is +more convenient, but provides us with less control. + +#### Authentication + +@todo + +#### Permissions + +In a system storing user data has to be implemented some kind of permission +checking. Each user has a role, which corresponds to his/her privileges. +Our research showed, that three roles are sufficient -- student, supervisor +and administrator. 
The user role has to be +checked with every request. The good point is that roles nicely match the +granularity of API endpoints, so the permission checking can be done at the +beginning of each request. That is implemented using PHP annotations, which +allow specifying the allowed user roles for each request with very little code, +while all the business logic stays together in one place. + +However, roles cannot cover all cases. For example, if a user is a supervisor, this +applies only to the groups where he/she is a supervisor. But using only roles +would allow him/her to act as a supervisor in all groups in the system. Unfortunately, +this cannot be easily fixed using annotations, because there are many +different cases in which this problem occurs. To fix that, some additional checks +can be performed at the beginning of request processing. Usually it is only one +or two simple conditions. + +With these two concepts together it is possible to easily cover all cases of +permission checking with quite a small amount of code. + +#### Uploading files + +There are two cases when users need to upload files using the API -- submitting +solutions to an assignment and creating a new exercise. In both of these cases, +the final destination of the files is the fileserver. However, the fileserver is +not publicly accessible, so the files have to be uploaded through the API. + +Each file is uploaded separately and is given a unique ID. The uploaded file +can then be attached to an exercise or a submitted solution of an exercise. +Storing and removing files from the server is done through the +`App\Helpers\UploadedFileStorage` class which maps the files to their records +in the database using the `App\Model\Entity\UploadedFile` entity. + +#### Forgotten password + +When a user finds out that he/she does not remember a password, he/she requests a +password reset and fills in his/her unique email address. 
A temporary access token is +generated for the user corresponding to the given email address and sent to this +address encoded in a URL leading to a client application. The user then goes +to the URL and can choose a new password. + +The temporary token is generated and emailed by the +`App\Helpers\ForgottenPasswordHelper` class which is registered as a service +and can be injected into any presenter. + +This solution is quite safe and the user can handle it on his/her own, so the administrator +does not have to worry about it. + +#### Job configuration parsing and modifying + +@todo how the YAML is parsed +@todo how it can be changed and where it is used +@todo how it can be stored to a new YAML + +#### Solution loading + +When a solution evaluation is finished by the backend, the results are saved to +the fileserver and the API is notified by the broker. The results are parsed and +stored in the database. + +For results of reference solutions' evaluations and for asynchronous solution +evaluations (e.g., resubmitted by the administrator) the result is processed +right after the notification from the backend is received and the author of the +solution is notified by email. + +When a student submits his/her solution directly through the client application +we do not parse the results right away but we postpone this until the student +(or a supervisor) wants to display the results for the first time. This may save +some resources when the solution results are not important (e.g., the +student finds a bug in his/her solution before the submission has been evaluated). + +##### Parsing of the results + +The results are stored in a YAML file. We map the contents of the file to the +classes of the `App\Helpers\EvaluationResults` namespace. This process +validates the file and gives us access to all of the information through +a class interface rather than only through associative arrays. This is very +similar to how the job configuration files are processed. 
+ ### API endpoints @todo: Tell the user about the generated API reference and how the @@ -3024,18 +2978,24 @@ as a server, its IP address and port is configurable in the API. specified by the headers cannot be met). There are (rare) cases when the broker finds that it cannot handle the job after it was confirmed. In such cases it uses the frontend REST API to mark the job as failed. - #### Asynchronous communication between broker and API Only a fraction of the errors that can happen during evaluation can be detected while there is a ZeroMQ connection between the API and broker. To notify the -frontend of the rest, we need an asynchronous communication channel that can be -used by the broker when the status of a job changes (it's finished, it failed -permanently, the only worker capable of processing it disconnected...). +frontend of the rest, the API exposes a dedicated endpoint for the broker. +The broker uses this endpoint whenever the status of a job changes (it's finished, +it failed permanently, the only worker capable of processing it disconnected...). + +When a request for sending a report arrives from the backend, the type of +the report is inferred, and if it is an error which deserves the attention of the +administrator, an email is sent to him/her. There can also be errors which +are not that important (e.g., already solved by the backend itself or +only informative); these do not have to be reported through an email but +are only stored in the persistent database for further consideration. -This functionality is supplied by the `broker-reports/` API endpoint group -- -see its documentation for more details. +For the details of this interface please refer to the attached API documentation +and the `broker-reports/` endpoint group. ### File Server - Web API communication