Introduction
============

There are many different approaches and opinions on how to teach people something new. However, most people agree that hands-on experience is one of the best ways for the human brain to retain a new skill. Learning should be entertaining and interactive, with fast and frequent feedback. Some kinds of knowledge are better suited to this practical style of learning than others, and fortunately, programming is one of them. University education is one of the areas where this observation can be put to use.

In computer programming, there are several requirements, such as the code being syntactically correct, efficient, and easy to read, maintain, and extend. Correctness and efficiency can be tested automatically, saving teachers time for their research, but checking for bad design, habits, and mistakes is very hard to automate and requires manpower. Checking programs written by students takes a lot of time and involves a great deal of mechanical, repetitive work.

The first idea of an automatic evaluation system comes from Stanford University professors in 1965. They implemented a system which evaluated Algol code submitted on punch cards. In the following years, many similar products were written. There are two basic ways of automatically evaluating code -- statically (checking the code without running it; safe, but not very precise) and dynamically (running the code on test inputs and checking the outputs against reference ones; requires sandboxing, but provides a good real-world experience).

This project focuses on the machine-controlled part of source code evaluation. First, the problems of the current software at our university were discussed and similar projects at other educational institutions were examined. With the knowledge acquired from such projects in production, we set up goals for the new evaluation system, designed the architecture, and implemented a fully operational solution. The system is now ready for production testing at our university.
Analysis
--------

### Assignment

The major goal of this project is to create a grading application that will be used for programming classes at the Faculty of Mathematics and Physics, Charles University. However, the application should be designed in a modular fashion so that it can be easily extended to support other use cases. The project has a great starting point -- there is an old grading system currently used at our university (CodEx), so its mistakes and weaknesses can be addressed. Furthermore, many teachers are willing to use and test the new system. The following requirements were collected both from our personal experience with CodEx and from teachers' requests.

**Basic grading system requirements:** These are features that are necessary for any system used for the evaluation of programming homework assignments in a university programming course.

- creating exercises including a textual description, sample inputs and correct reference outputs (for example "sum all numbers from the given file and write the result to the standard output")
- assigning an exercise to a group of users with some additional properties set (deadlines, etc.)
- a user interface for interaction with the system, mainly for showing assigned exercises, uploading solution sources and presenting evaluation results
- a safe environment to execute student solutions within prescribed time and memory limits and to check the correctness of outputs
- assigning points to users depending on the correctness of their solutions
- user management with support for roles (at least two -- _student_ and _supervisor_)
- an administrative interface for manual checking of solutions, overriding the automatically assigned amount of points and viewing overall statistics about users

CodEx satisfies all these requirements and a few more that originate from the way courses are organized at our university -- for example, users have roles (_student_, _supervisor_ and _administrator_) that determine their capabilities in the system, and students are divided into groups that correspond to lab groups. However, further requirements arose during the ten-year lifetime of the old system. There are not many ways to improve it from the perspective of a student, but a lot of feature requests came from administrators and supervisors. The collected ideas were mostly gathered from meetings with faculty staff involved with the current system.

**Requested features for the new system:**

- logging in through a university authentication system (e.g. LDAP)
- support for multiple programming environments at once to avoid an unacceptable workload for administrators (maintaining separate installations for many courses) and high hardware occupation
- localization (both UI and exercises)
- Markdown support for exercise texts
- tagging exercises and searching by tags
- comments, comments, comments (on exercises, tests, solutions, ...)
- editing a student solution and privately resubmitting it
- resubmitting a solution while saving all results (including temporary ones)
- marking one student solution as accepted (used for grading this assignment)
- web and command-line submit tools
- SIS (university information system) integration for fetching personal user data
- plagiarism detection
- advanced low-level evaluation flow configuration with a high-level abstraction layer for ordinary configuration cases
- use of modern technologies with state-of-the-art compilers

The survey shows that the system is used in many different ways, but the core functionality is the same for all of them. When the system is ready, it is likely that new ideas will come up, so the system must be designed to be easily extensible, allowing everyone to develop their dream feature. This also means that widely used programming languages and techniques should be chosen, so users can quickly understand the code and make changes. To find out the current state of the field of automatic grading systems, let us do a short survey of universities, programming contests and online tools.

### Related work

First of all, some code evaluation projects were found and examined. This is not a complete list of such evaluators, but just a few projects which are in use these days and can serve as an inspiration for our project. Each project in the list has a brief description and some key features mentioned.

#### CodEx

There already is a grading solution at MFF UK, which was implemented in 2006 by a group of students. Its name is [CodEx -- The Code Examiner](http://codex.ms.mff.cuni.cz/project/) and it has been used, with some improvements, since then. The original plan was to use the system only for basic programming courses, but there is demand for adapting it to many different subjects.

CodEx is based on dynamic analysis. It features a web-based interface, where supervisors assign exercises to their students and the students have a time window to submit their solutions.
Each solution is compiled and run in a sandbox (MO-Eval). The metrics which are checked are: correctness of the output, time and memory limits. It supports programs written in C, C++, C#, Java, Pascal, Python and Haskell.

The current system is old, but robust. There were no major security incidents during its production usage. However, from today's perspective there are several drawbacks. The main ones are:

- **web interface** -- The web interface is simple and fully functional, but rapid development in web technologies opens new horizons of how a web interface can be made.
- **web API** -- CodEx offers a very limited XML API based on outdated technologies that is not sufficient for users who would like to create custom interfaces such as a command-line tool or a mobile application.
- **sandboxing** -- The MO-Eval sandbox is based on the principle of monitoring system calls and blocking the bad ones. This can be easily done for single-threaded applications, but proves difficult with multi-threaded ones. Nowadays, parallelism is a very important area of computing, so there is a requirement to test multi-threaded applications as well.
- **instances** -- Different CodEx usage scenarios require separate instances (Programming I and II, Java, C#, etc.). This configuration is not user friendly (students have to register in each instance separately) and burdens administrators with unnecessary work. The CodEx architecture does not allow sharing hardware between instances, which results in an inefficient use of hardware for evaluation.
- **task extensibility** -- There is a need to test and evaluate complicated programs for classes such as Parallel programming or Compiler principles, which have a more complex evaluation chain than the simple compilation/execution/evaluation provided by CodEx.

#### Progtest

[Progtest](https://progtest.fit.cvut.cz/) is a private project from FIT ČVUT in Prague. As far as we know, it is used for C/C++ and Bash programming and for knowledge-based quizzes.
There are several bonus points and penalties and also a few hints about what is failing in a submitted solution. It is very strict on source code quality, using for example the `-pedantic` option of GCC, Valgrind for memory leaks, and array boundary checks via the `mudflap` library.

#### Codility

[Codility](https://codility.com/) is a web-based solution primarily targeted at company recruiters. It is a commercial product of the SaaS type supporting 16 programming languages. The [UI](http://1.bp.blogspot.com/-_isqWtuEvvY/U8_SbkUMP-I/AAAAAAAAAL0/Hup_amNYU2s/s1600/cui.png) of Codility is [open source](https://github.com/Codility/cui); the rest of the source code is not available. One interesting feature is the 'task timeline' -- a captured progress of writing code for each user.

#### CMS

[CMS](http://cms-dev.github.io/index.html) is an open-source distributed system for running and organizing programming contests. It is written in Python and contains several modules. CMS supports C/C++, Pascal, Python, PHP and Java. PostgreSQL is a single point of failure; all modules heavily depend on the database connection. Task evaluation can only be a three-step pipeline -- compilation, execution, evaluation. Execution is performed in [Isolate](https://github.com/ioi/isolate), a sandbox written by a consultant of our project, Mgr. Martin Mareš, Ph.D.

#### MOE

[MOE](http://www.ucw.cz/moe/) is a grading system written in shell scripts, C and Python. It does not provide a default GUI; all actions have to be performed from the command line. The system does not evaluate submissions in real time; results are computed in batch mode after the exercise deadline, using Isolate for sandboxing. Parts of MOE are used in other systems like CodEx or CMS, but the system as a whole is obsolete.

#### Kattis

[Kattis](http://www.kattis.com/) is another SaaS solution. It provides a clean and functional web UI, but the rest of the application is too simple.
A nice feature is the usage of a [standardized format](http://www.problemarchive.org/wiki/index.php/Problem_Format) for exercises. Kattis is primarily used by programming contest organizers, company recruiters and also some universities.

### ReCodEx goals

From the survey above it is clear that none of the existing systems is capable of all the features collected for the new system. No grading system is designed to support a complicated evaluation pipeline, so this part is an unexplored field and has to be designed with caution. Also, no project is modern and extensible in a way that would allow it to be used as a base for ReCodEx. After considering all these facts, it is clear that the new system has to be written from scratch. This implies that only a subset of all the features will be implemented in the first version, with the others following later.

The new project is **ReCodEx -- ReCodEx Code Examiner**. The name is meant to point to CodEx, the previous evaluation solution, but also to reflect a new approach to solving its issues. The **Re** part of the name stands for redesigned, rewritten, renewed or restarted.

At this point there is a clear idea of how the new system will be used and what the major enhancements for future releases are. With this in mind, the overall architecture can be sketched. From the previous research, we set up several goals which the new system should meet. They mostly reflect the drawbacks of the current version of CodEx and the reasonable wishes of university users.
The most notable features are the following:

- a modern HTML5 web frontend written in JavaScript using a suitable framework
- a REST API implemented in PHP, communicating with the database, backend and file server
- a backend implemented as a distributed system on top of a message queue framework (ZeroMQ) with a master-worker architecture
- a worker with basic support for the Windows environment (without a sandbox; no suitable general-purpose tool is available yet)
- an evaluation procedure configured in a YAML file, composed of small tasks connected into an arbitrary directed acyclic graph

#### Intended usage

The whole system is intended to help both supervisors and students. To achieve this, it is crucial to keep typical usage scenarios of the system in mind and to make these typical tasks as simple as possible. To synchronize the visions of the readers, the basic concepts are recapitulated here.

First of all, the system has a database of users. Each user has an assigned role, which corresponds to his/her privileges. A user can be logged in via a local authentication service or the university system. There are groups of users, which correspond to lectured courses. Groups can be hierarchically ordered to reflect additional metadata such as the academic year. For example, a reasonable group hierarchy looks like this:

```
Summer term 2016
├── Language C# and .NET platform
│   ├── Labs Monday 10:30
│   └── Labs Thursday 9:00
├── Programming I
│   ├── Labs Monday 14:00
...
```

In this example, student users are part of the leaf groups; the higher groups are just for keeping related groups together. The hierarchy tree can be modified and altered to fit the specific needs of each organization; even a flat structure is possible. One user can be part of multiple groups and one group can have multiple users. Each user in a group has a role which defines his/her capabilities. A privileged user can assign a new exercise in his/her group, change assignment details, view results of other users and manually change them.
A normal user can join a group, get the list of assigned exercises, view assignment details, submit his/her solution and of course view the results.

A database of exercises (algorithmic problems) is another part of the project. Each exercise consists of a text in multiple language variants, an evaluation configuration and a set of inputs and reference outputs. Exercises are created by instructed privileged users. Assigning an exercise to a group means choosing one of the exercises from the list and specifying additional data. An assignment has a deadline, a maximum amount of points and a configuration for calculating the final amount, a number of tries and supported runtimes (programming languages) including specific time and memory limits for the sandboxed tasks.

##### Exercise evaluation chain

The most important part of the application is evaluating the solutions submitted by users. For the imaginary system architecture _UI_, _API_, _Broker_ and _Worker_, this goes as follows. The first thing users have to do is to submit their solutions to the _UI_, which provides an interface to upload files and then submit them. The _UI_ sends a request to the _API_ that the user wants to evaluate the assignment with the provided files. The _API_ checks the assignment invariants (deadlines, count of submissions, ...) and stores the submitted files. The runtime environment is automatically detected based on the input files and a suitable exercise configuration variant is chosen (one exercise can have multiple variants, for example for the C and Java languages). The matching exercise configuration is then sent to the _Broker_ alongside the solution source files.

The _Broker_ has to find a suitable _Worker_ for the execution of this particular submission. This decision is made based on the capabilities of each _Worker_ and the job requirements. When a match is found, the job is held until the _Worker_ is jobless and can receive an evaluation request.

The _Worker_ gets the evaluation request with the source files and the job configuration. The configuration is parsed into small tasks, each representing a simple piece of work.
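As an illustration, a job configuration of this kind could look like the following YAML fragment. All key names here are hypothetical and only sketch the general idea of small tasks connected by dependencies into an acyclic graph; they are not the actual ReCodEx configuration format.

```
# Hypothetical job configuration sketch -- key names are illustrative only.
job-id: 42
tasks:
  - task-id: compile
    cmd: /usr/bin/gcc
    args: [solution.c, -o, solution]
  - task-id: run-test-01
    dependencies: [compile]          # an edge of the acyclic graph
    sandbox:
      limits: { time: 2, memory: 65536 }
    cmd: ./solution
    stdin: test01.in
  - task-id: judge-test-01
    dependencies: [run-test-01]
    cmd: /usr/bin/judge
    args: [test01.out, test01.expected]
```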
The evaluation itself proceeds in the order given by the tasks. It is crucial to keep the _Worker_ machine secure and stable, so an isolated sandboxed environment is used when dealing with unknown source code. When the execution is finished, the results are uploaded back.

The _API_ is notified about the finished job by the _Broker_. The results are parsed and the results of the important tasks (comparing actual and expected outputs) are saved into the database. Also, points are calculated depending on the solution correctness and the assignment configuration. The _UI_ then only displays a results summary fetched from the _API_. The presented data includes an overview of which parts succeeded and which failed (optionally with a reason like "memory limit exceeded") and the amount of awarded points.

### Solution concepts analysis

@todo: what problems were solved on abstract and high levels, how they can be solved and what was the final solution - which problems are they?

#### Structure of the project

@todo: move "General backend implementation" here

@todo: move "General frontend implementation" here

The ReCodEx project is divided into two logical parts – the *Backend* and the *Frontend* – which interact with each other and which cover the whole area of code examination. Both of these logical parts are independent of each other in the sense that they can be installed on separate machines in different locations, and that one of the parts can be replaced with a different implementation; as long as the communication protocols are preserved, the system will continue to work as expected.

*Backend* is the part which is responsible solely for the process of evaluating a solution of an exercise. Each evaluation of a solution is referred to as a *job*. For each job, the system expects a configuration document of the job, supplementary files for the exercise (e.g., test inputs, expected outputs, predefined header files), and the solution of the exercise (typically source code created by a student).
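Since the tasks of a job form a directed acyclic graph, a worker has to execute them in a topological order, so that every task runs only after all of its dependencies. A minimal sketch of such an ordering, assuming a simple dictionary of task dependencies (the task names are hypothetical):

```python
from graphlib import TopologicalSorter

# Hypothetical task dependency map: task -> set of tasks it depends on.
deps = {
    "compile": set(),
    "run-test-01": {"compile"},
    "judge-test-01": {"run-test-01"},
}

# static_order() yields the tasks so that dependencies always come first;
# it raises CycleError if the configuration is not acyclic.
order = list(TopologicalSorter(deps).static_order())
assert order == ["compile", "run-test-01", "judge-test-01"]
```

Using a standard topological sort also gives cycle detection in invalid configurations for free.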
There might be specific requirements for a job, such as a specific runtime environment, a specific version of a compiler, or that the job must be evaluated on a processor with a specific number of cores. The backend infrastructure decides whether it will accept or decline a job based on the specified requirements. In case it accepts the job, it is placed in a queue and processed as soon as possible. The backend publishes the progress of processing of the queued jobs, and the results of the evaluations can be queried after the job processing is finished. The backend produces a log of the evaluation and scores the solution based on the job configuration document.

*Frontend*, on the other hand, is responsible for the communication with the users and provides them convenient access to the Backend infrastructure. The Frontend manages user accounts and gathers them into units called groups. There is a database of exercises which can be assigned to the groups, and the users of these groups can submit their solutions for these assignments. The Frontend will initiate the evaluation of these solutions by the Backend and it will store the results afterwards. The results will be visible to authorized users and the solutions will be awarded points according to the score given by the Backend in the evaluation process. The supervisors of the groups can edit the parameters of the assignments, review the solutions and the evaluations in detail, award the solutions with bonus points (both positive and negative) and discuss the solutions with their authors. Some of the users can be entitled to create new exercises and extend the database of exercises which can be assigned to the groups later on.

The Frontend developed as part of this project was created with the needs of the Faculty of Mathematics and Physics of Charles University in Prague in mind.
The users are the students and their teachers, the groups correspond to the different courses, and the teachers are the supervisors of these groups. We believe that this model is also applicable to the needs of other universities, schools, and IT companies, which can use the same system for their purposes. It is also possible for them to develop their own frontend with their own user management system for their specific needs and use the possibilities of the Backend without any changes, as was mentioned in the previous paragraphs.

In the latter parts of the documentation, the Backend and Frontend parts will each be introduced separately and covered in more detail. The communication protocol between these two logical parts will be described as well.

#### Evaluation unit executed on backend

@todo: describe possibilities of "piece of work" which can backend execute, how they can look like, describe our job and its tasks

@todo: why is there division to internal and external tasks and why it is needed

@todo: in what order should tasks be executed, how to sort them

@todo: how to solve problem with specific worker environment, mention internal job variables

### Implementation analysis

Developing a project like ReCodEx requires a discussion of implementation details and of how to solve particular problems properly. This discussion is a never-ending story which takes place throughout the whole development process. Some of the most important implementation problems and interesting observations will be discussed in this chapter.

#### General backend implementation

There are numerous ways to divide a system into separate services, from one single component to a multitude of single-purpose components. Having only one big service is not feasible: it is not scalable enough, and it would mainly be one big blob of code which somehow works and is very complex, so this is not the way. The opposite extreme, having a lot of single-purpose components, is also impractical.
Such a system is scalable by default and all services would have quite simple code, but on the other hand the communication requirements of such a solution would be insane. So an approach somewhere in the middle has to be chosen: the services have to communicate in a manner that will not bring the network down, the code base should be reasonable, and the whole system has to be scalable enough. With this in mind, we can discuss the particular division for the ReCodEx system.

From the scalability point of view there are two necessary components: the one which will execute jobs and the component which will distribute jobs to the instances of the first one. This ensures scalability in the sense of parallel execution of numerous jobs, which is exactly what is needed. The implementations of these services are called 'broker' and 'worker'; the first one handles distribution, the latter execution. These components should be enough to fulfil everything stated above, but for the sake of simplicity and better communication gateways with the frontend, two other components were added: 'fileserver' and 'monitor'. The fileserver is a simple component whose purpose is to store the files which are exchanged between the frontend and the backend. The monitor is also a quite simple service which is able to relay the job progress state from a worker to the web application. These two additional services sit on the edge between the frontend and the backend (like gateways), but logically they are more connected with the backend, so we consider them to belong there.
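The broker-worker division can be illustrated by a toy in-process model. The sketch below uses a thread-safe queue from the Python standard library instead of the actual ZeroMQ messaging, purely to show the master-worker distribution pattern; all names and the "evaluation" itself are illustrative.

```python
import queue
import threading

jobs: queue.Queue = queue.Queue()     # the broker's queue of pending jobs
results: queue.Queue = queue.Queue()  # evaluation results reported back

def worker(worker_id: int) -> None:
    """Each worker repeatedly takes a job from the broker and 'evaluates' it."""
    while True:
        job = jobs.get()
        if job is None:               # sentinel -> shut the worker down
            break
        results.put((worker_id, job, "OK"))

# Two workers running in parallel; the broker distributes jobs among them.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()

for job in ["job-1", "job-2", "job-3"]:
    jobs.put(job)

for _ in threads:                     # one sentinel per worker
    jobs.put(None)
for t in threads:
    t.join()

evaluated = {job for _, job, _ in results.queue}
assert evaluated == {"job-1", "job-2", "job-3"}
```

Adding capacity then simply means starting more workers; the queue in the middle decouples job submission from job execution, which is the property the broker provides in the real system.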
@todo: what type of communication within backend could be used, mention some frameworks, queue managers, protocols, which was considered

#### Fileserver

@todo: fileserver and why is separated

@todo: mention hashing on fileserver and why this approach was chosen

@todo: what can be stored on fileserver

@todo: how can jobs be stored on fileserver, mainly mention that it is nonsense to store inputs and outputs within job archive

#### Broker

@todo: assigning of jobs to workers, which are possible algorithms, queues, which one was chosen

@todo: how can jobs be sent over zeromq, mainly mention that files can be transported, but it is not feasible

@todo: making action and reaction over zeromq more general and easily extensible, mention reactor and why is needed and what it solves

#### Worker

@todo: worker and its internal structure, why there are two threads and what they can do, mention also multiplatform approach during development

@todo: execution of job on worker, how it is done, what steps are necessary and general for all jobs

@todo: how can inputs and outputs (and supplementary files) be handled (they can be downloaded on start of execution, or during...)
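The hashing approach mentioned in the fileserver notes above can be sketched as content-addressable storage: a file is stored under the hash of its contents, which gives deduplication and stable, cache-friendly names for free. The directory layout and the choice of SHA-1 here are only an example, not the actual fileserver implementation.

```python
import hashlib
from pathlib import Path

STORE = Path("filestore")  # illustrative storage directory

def put(data: bytes) -> str:
    """Store data under the hex digest of its contents.

    Identical files map to the same name, so they are stored only once.
    """
    digest = hashlib.sha1(data).hexdigest()
    STORE.mkdir(exist_ok=True)
    path = STORE / digest
    if not path.exists():          # already stored -> nothing to do
        path.write_bytes(data)
    return digest

def get(digest: str) -> bytes:
    return (STORE / digest).read_bytes()

name = put(b"test input 01")
assert get(name) == b"test input 01"
assert put(b"test input 01") == name   # the second upload is deduplicated
```

A nice side effect is that stored files are immutable, so workers can cache supplementary files by digest without any invalidation protocol.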
@todo: caching of supplementary files (link to hashing above), describe cleaner and why it is a separate component

@todo: describe a bit more cleaner functionality and that it is safe and there are no unrecoverable races

@todo: sandboxing, what possibilities are out there (Linux, Windows), what are general and really needed features, mention isolate, what are isolate features

#### Monitor

@todo: how progress status can be sent, why is there separate component of system (monitor) and why is this feature only optional

@todo: monitor and what is done there, mention caching and why it is needed

#### General frontend implementation

@todo: communication between backend and frontend

@todo: why is frontend divided into server and client part, mention possibilities of separated api (can be used by multiple client programs - mobile/pc/web applications)

@todo: what apis can be used on server frontend side, why rest in particular

#### API

@todo: php frameworks, why nette

@todo: what database can be used, how it is mapped and used within code

@todo: authentication, some possibilities and describe used jwt

@todo: solution of forgotten password, why this in particular

@todo: rest api is used for report of backend state and errors, describe why and other possibilities (separate component)

@todo: what files are stored in api, why there are duplicates among api and fileserver

@todo: why are there instances and for which they can be used for, describe licences and its implementation

@todo: groups and hierarchy, describe arbitrary nesting which should be possible within instance

@todo: where is stored which workers can be used by supervisors and which runtimes are available, describe possibilities and why is not implemented automatic solution

@todo: on demand loading of students submission, in-time loading of every other submission, why

#### Web-app

@todo: what technologies can be used on client frontend side, why react was used

@todo: please think about more stuff about api and
web-app... thanks ;-)

The Backend
===========

The backend is the part which is hidden from the user and which has only one purpose: to evaluate users' solutions of their assignments.

@todo: describe the configuration inputs of the Backend

@todo: describe the outputs of the Backend

@todo: describe how the backend receives the inputs and how it communicates the results

## Components

The whole backend is not just one service/component; it is a quite complex system on its own.

@todo: describe the inner parts of the Backend (and refer to the Wiki for the technical description of the components)

### Broker

@todo: gets stuff done, single point of failure and center point of ReCodEx universe

### Fileserver

@todo: stores particular data from frontend and backend, hashing, HTTP API

### Worker

@todo: describe a bit of internal structure in general

@todo: describe how jobs are generally executed

### Monitor

@todo: not a necessary component which can be omitted, proxy-like service

## Backend internal communication

@todo: internal backend communication, what communicates with what and why

The Frontend
============

The frontend is the part which is visible to the user of ReCodEx and which holds the state of the system – the user accounts, their roles in the system, the database of exercises, the assignments of these exercises to groups of users (i.e., students), and the solutions and evaluations of them.

The frontend is split into three parts:

- the server-side REST API (“API”) which holds the business logic and keeps the state of the system consistent
- the relational database (“DB”) which persists the state of the system
- the client-side application (“client”) which simplifies access to the API for common users

The centerpiece of this architecture is the API. This component receives requests from the users and from the Backend, validates them, modifies the state of the system and persists the modified state in the DB.
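Because the API is plain HTTP, any client able to send HTTP requests can talk to it. As an illustration, the following Python snippet builds an authenticated request using only the standard library; the endpoint path and the bearer-token scheme shown here are assumptions for the sake of the example, not the documented ReCodEx API.

```python
import urllib.request

# Hypothetical endpoint and token -- the real paths are defined by the ReCodEx API.
API_URL = "https://recodex.example.org/api/v1/exercises"
TOKEN = "<jwt-token-obtained-at-login>"

request = urllib.request.Request(
    API_URL,
    headers={
        "Authorization": f"Bearer {TOKEN}",  # JWT-style bearer authentication
        "Accept": "application/json",
    },
)

# Sending the request would then be a single call:
# with urllib.request.urlopen(request) as response:
#     exercises = response.read()
```

The same request can of course be issued from cURL, Postman, or any other HTTP-capable tool.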
We have created a web application which can communicate with the API server and present the information received from the server to the user in a convenient way. The client can, however, be any application which can send HTTP requests and receive the HTTP responses. Users can use general applications like [cURL](https://github.com/curl/curl/), [Postman](https://www.getpostman.com/), or create their own specific client for the ReCodEx API.

Frontend capabilities
---------------------

@todo: describe what the frontend is capable of and how it really works, what are the limitations and how it can be extended

Terminology
-----------

This project was created for the needs of a university and this fact is reflected in the terminology used throughout the Frontend. A list of definitions of important terms follows to make their meaning unambiguous.

### User and user roles

*User* is a person who uses the application. A user is granted access to the application once he or she creates an account directly through the API or the web application. There are several types of user accounts depending on the set of permissions – a so-called “role” – they have been granted. Each user receives only the most basic set of permissions after he or she creates an account, and this role can be changed only by the administrators of the service:

- *Student* is the most basic role. A student can become a member of a group and submit solutions to his assignments.
- *Supervisor* can be entitled to manage a group of students. A supervisor can assign exercises to the students who are members of his groups and review their solutions submitted to these assignments.
- *Super-admin* is a user with unlimited rights. This user can perform any action in the system.

There are two implicit changes of roles:

- Once a *student* is added to a group as its supervisor, his role is upgraded to a *supervisor* role.
- Once a *supervisor* is removed from the last group where he is a supervisor, his role is downgraded to a *student* role.

These mechanisms do not prevent a single user from being a supervisor of one group and a student of a different group, as supervisors’ permissions are a superset of students’ permissions.

### Login

*Login* is a set of user’s credentials he must submit to verify that he can be allowed to access the system as a specific user. We distinguish two types of logins: local and external.

- *Local login* is the user’s email address and a password he chooses during registration.
- *External login* is a mapping of a user profile to an account of some authentication service (e.g., [CAS](https://ldap1.cuni.cz/)).

### Instance

*An instance* of ReCodEx is in fact just a set of groups and user accounts. An instance should correspond to a real entity such as a university, a high school, an IT company or an HR agency. This approach enables the system to be shared by multiple independent organizations without them interfering with each other. The usage of the system by the users of an instance can be conditioned on possessing a valid license. It is up to the administrators of the system to determine the conditions under which they will assign licenses to the instances.

### Group

*Group* corresponds to a school class or some other unit which gathers users who will be assigned the same set of exercises. Each group can have multiple supervisors who can manage the students and the list of assignments. Groups can form a tree hierarchy of arbitrary depth. This is inspired by the hierarchy of school classes belonging to the same subject over several school years. For example, there can be a top-level group for a programming class that contains subgroups for every school year. These groups can then be divided into actual student groups with respect to lab attendance. Supervisors can create subgroups of their groups and further manage these subgroups.
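The arbitrary-depth group hierarchy described above can be modeled as a simple tree. A minimal sketch follows; the class and method names are illustrative, not the actual data model of the Frontend.

```python
from dataclasses import dataclass, field

@dataclass
class Group:
    """A node of the group hierarchy; leaves typically hold the actual students."""
    name: str
    subgroups: list["Group"] = field(default_factory=list)

    def add_subgroup(self, name: str) -> "Group":
        child = Group(name)
        self.subgroups.append(child)
        return child

    def leaves(self) -> list["Group"]:
        """Collect the leaf groups, i.e., the actual student groups."""
        if not self.subgroups:
            return [self]
        return [leaf for g in self.subgroups for leaf in g.leaves()]

term = Group("Summer term 2016")
course = term.add_subgroup("Programming I")
course.add_subgroup("Labs Monday 14:00")
course.add_subgroup("Labs Thursday 9:00")

assert [g.name for g in term.leaves()] == ["Labs Monday 14:00", "Labs Thursday 9:00"]
```

A flat structure is simply the special case of a root group whose children are all leaves.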
### Exercise

*An exercise* consists of a textual assignment of a task and a definition of how a solution to this exercise should be processed and evaluated in a specific runtime environment (i.e., how to compile a submitted source code and how to test the correctness of the program). It is a template which can be instantiated as an *assignment* by a supervisor of a group.

### Assignment

An assignment is an instance of an *exercise* assigned to a specific *group*. An assignment can modify the text of the task assignment and it carries some additional information which is specific to the group (e.g., a deadline, the number of points gained for a correct solution, additional hints for the students). The text of the assignment can be edited and supervisors can translate the assignment into other languages.

### Solution

*A solution* is a set of files which a user submits to a given *assignment*.

### Submission

*A submission* corresponds to a *solution* being evaluated by the Backend. A single *solution* can be submitted repeatedly (e.g., when the Backend encounters an error or when the supervisor changes the assignment).

### Evaluation

*An evaluation* is the processed report received from the Backend after a *submission* is processed. An evaluation contains the points given to the user based on the quality of his solution as measured by the Backend and on the settings of the assignment. Supervisors can review the evaluation and add bonus points (both positive and negative) if the student deserves some.

### Runtime environment

*A runtime environment* defines the programming language or the tools which are needed to process and evaluate a solution. Examples of runtime environments are:

- *Linux + GCC*
- *Linux + Mono*
- *Windows + .NET 4*
- *Bison + Yacc*

### Limits

A correct *solution* of an *assignment* has to pass all specified tests (mostly checks that it yields the correct output for various inputs) and typically must also be efficient in some sense.
The Backend measures the time and memory consumption of the solution while it is running. This consumption of resources can be *limited* and the solution will receive fewer points if it exceeds the given limits in some test cases defined by the *exercise*.

User management
---------------

@todo: roles and their rights, adding/removing different users, how the role of a specific user changes

Instances and hierarchy of groups
---------------------------------

@todo: What is an instance, how to create one, what are the licenses and how do they work. Why can the groups form hierarchies and what are the benefits – what it means to be an admin of a group, hierarchy of roles in the group hierarchy.

Exercises database
------------------

@todo: How the exercises are stored, accessed, who can edit what

### Creating a new exercise

@todo Localized assignments, default settings

### Runtime environments and hardware groups

@todo read this later and see if it still makes sense

ReCodEx is designed to utilize a rather diverse set of workers -- there can be differences in many aspects, such as the actual hardware running the worker (which impacts the results of measuring) or the installed compilers, interpreters and other tools needed for evaluation. To address these two examples in particular, we assign runtime environments and hardware groups to exercises.

The purpose of runtime environments is to specify which tools (and often also the operating system) are required to evaluate a solution of the exercise -- for example, a C# programming exercise can be evaluated on a Linux worker running Mono or on a Windows worker with the .NET runtime. Such an exercise would be assigned two runtime environments, `Linux+Mono` and `Windows+.NET` (the environment names are arbitrary strings configured by the administrator).

A hardware group is a set of workers that run on similar hardware (e.g., a particular quad-core processor model and an SSD hard drive). Workers are assigned to these groups by the administrator.
If this is done correctly, performance measurements of the same submission should yield the same results on all workers of the group. Thanks to this, we can use the same resource limits for every worker in a hardware group. However, limits can differ between runtime environments -- formally speaking, limits are a function of three arguments: an assignment, a hardware group and a runtime environment.

### Reference solutions

@todo: how to add one, how to evaluate it

The task of determining appropriate resource limits for exercises is difficult to do correctly. To aid exercise authors and group supervisors, ReCodEx supports assigning reference solutions to exercises. Those are example programs that should cover the main approaches to the implementation. For example, searching for an integer in an ordered array can be done with a linear search, or better, using a binary search. Reference solutions can be evaluated on demand, using a selected hardware group. The evaluation results are stored and can be used later to determine the limits. In our example problem, we could configure the limits so that the linear search based program does not finish in time on larger inputs, but a binary search does. Note that separate reference solutions should be supplied for all supported runtime environments.

### Exercise assignments

@todo: Creating instances of an exercise for a specific group of users, capabilities of settings. Editing limits according to the reference solution.

Evaluation process
------------------

@todo: How the evaluation process works on the Frontend side.

### Uploading files and file storage

@todo: One by one upload endpoint. Explain different types of the Uploaded files.

### Automatic detection of the runtime environment

@todo: Users must submit correctly named files – assuming the RTE from the extensions.

REST API implementation
-----------------------

@todo: What is the REST API, what are the basic principles – GET, POST, Headers, JSON.
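As a rough illustration of the request/response style of a REST API, a client request and the corresponding JSON response might look as sketched below. The endpoint path, headers and JSON fields here are hypothetical illustrations of the general principles (HTTP methods, headers, JSON bodies), not the actual ReCodEx API:

```
GET /v1/exercises/42 HTTP/1.1
Host: recodex.example.org
Authorization: Bearer <JWT token>
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

{"success": true, "payload": {"id": 42, "name": "hello-world"}}
```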
### Authentication and authorization scopes

@todo: How authentication works – signed JWT, headers, expiration, refreshing. Token scopes usage.

### HTTP requests handling

@todo: Router and routes with specific HTTP methods, preflight, required headers

### HTTP responses format

@todo: Describe the JSON structure convention of success and error responses

### Used technologies

@todo: PHP7 – how it is used for typehints, Nette framework – how it is used for routing, Presenters actions endpoints, exceptions and ErrorPresenter, Doctrine 2 – database abstraction, entities and repositories + conventions, Communication over ZMQ – describe the problem with the extension and how we reported it and how to treat it in the future when the bug is solved. Relational database – we use MariaDB, Doctrine enables us to switch to a different engine if needed

### Data model

@todo: Describe the code-first approach using the Doctrine entities, how the entities map onto the database schema (refer to the attached schemas of entities and relational database models), describe the logical grouping of entities and how they are related:

- user + settings + logins + ACL
- instance + licences + groups + group membership
- exercise + assignments + localized assignments + runtime environments + hardware groups
- submission + solution + reference solution + solution evaluation
- comment threads + comments

### API endpoints

@todo: Tell the user about the generated API reference and how the Swagger UI can be used to access the API directly.

Web Application
---------------

@todo: What is the purpose of the web application and how it interacts with the REST API.

### Used technologies

@todo: Briefly introduce the used technologies like React, Redux and the build process. For further details refer to the GitHub wiki

### How to use the application

@todo: Describe the user documentation and the FAQ page.
Backend-Frontend communication protocol
=======================================

@todo: describe the exact methods and respective commands for the communication

Initiation of a job evaluation
------------------------------

@todo: How does the Frontend initiate the evaluation and how the Backend can accept it or decline it

Job processing progress monitoring
----------------------------------

When evaluating a job, the worker sends progress messages at predefined points of the evaluation chain. A message can be sent at the very beginning of the job, when the submission archive is downloaded, or at the end of each simple task along with its state (completed, failed, skipped). These messages are sent to the broker through the existing ZeroMQ connection. The detailed format of the messages can be found on the [communication page](https://github.com/ReCodEx/wiki/wiki/Overall-architecture#commands-from-worker-to-broker).

The broker only resends the received progress messages to the monitor component via a ZeroMQ socket. The output message format is the same as the input format. The monitor parses the received messages into JSON, which is easy to work with in JavaScript inside the web application. All messages are cached (one queue per job) and can be obtained multiple times through the WebSocket communication channel. The cache is cleared 5 minutes after the last message is received.

Publishing of the results
-------------------------

After the job finishes, the worker packs the results directory into a single archive and uploads it to the fileserver over the HTTP protocol. The target URL is obtained from the API in the headers on job initiation. Then a "job done" notification request is sent to the API via the broker. The results of special submissions (reference or asynchronous submissions) are loaded immediately, other types are loaded on demand upon the first request for the results.

Loading the results means fetching the archive from the fileserver, parsing the main YAML file generated by the worker and saving the data to the database. Also, points are assigned by the score calculator.
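To make the monitoring flow above more concrete, a progress message delivered to the web application through the WebSocket channel might look roughly like this. The field names below are only illustrative assumptions; the authoritative format is defined on the communication page linked above:

```{.json}
{
    "command": "TASK",
    "job_id": "hello-world-job",
    "task_id": "compilation",
    "task_state": "COMPLETED"
}
```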
User documentation
==================

Web Application
---------------

@todo: Describe different scenarios of the usage of the Web App

### Terminology

@todo: Describe the terminology: Instance, User, Group, Student, Supervisor, Admin

### Web application requirements

@todo: Describe the requirements of running the web application (modern web browser, enabled CSS, JavaScript, Cookies & Local storage)

### Scenario \#1: Becoming a user of ReCodEx

#### How to create a user account?

You can create an account by clicking the “*Create account*” menu item in the left sidebar. You can choose between two registration methods – creating a local account with a specific password, or pairing your new account with an existing CAS UK account.

If you decide to create a new “*local*” account using the “*Create ReCodEx account*” form, you will have to provide your details and choose a password for your account. You will later sign in using your email address as your username and the password you selected.

If you decide to use the CAS UK, then we will verify your credentials, access your name and email stored in that system and create your account based on this information. You can change your personal information or email later on the “*Settings*” page.

When creating your account either way, you must select an instance your account will belong to by default. The instance you select will most likely be your university or another organization you are a member of.

#### How to get into ReCodEx?

To log in, go to the homepage of ReCodEx and choose the menu item “*Sign in*” in the left sidebar. Then you must enter your credentials into one of the two forms – if you selected a password during registration, then you should sign in with your email and password in the first form called “*Sign into ReCodEx*”.
If you registered using the Charles University Authentication Service (CAS), you should put your student number and your CAS password into the second form called “Sign into ReCodEx using CAS UK”.

#### How do I sign out of ReCodEx?

If you do not use ReCodEx for a whole day, you will be logged out automatically. However, we recommend signing out of the application after you finish your interaction with it. The logout button is placed in the top section of the left sidebar right under your name. You will have to expand the sidebar with the button next to the “*ReCodEx*” title (shown in the picture below).

@todo: Simon's image

#### What to do when you cannot remember your password?

If you cannot remember your password and you do not use CAS UK authentication, then you can reset your password. You will find a link saying “*You cannot remember what your password was? Reset your password.*” under the sign-in form. After you click this link, you will be asked to submit your email address and an email with a link containing a special token will be sent to the address you filled in. This ensures that the person who requested the password reset is really you. When you click the link (or copy & paste it into your web browser), you will be able to select a new password for your account. The token is valid only for a couple of minutes, so do not forget to reset the password as soon as possible, or you will have to request a new link with a valid token.

If you sign in through CAS UK, then please follow the instructions provided by the administrators of the service described on their website.

#### How to configure your account?

There are several options for editing your user account:
- changing your personal information (i.e., name)
- changing your credentials (email and password)
- updating your preferences (e.g., source code viewer/editor settings, default language)

You can access the settings page through the “*Settings*” button right under your name in the left sidebar.

### Scenario \#2: User is a student

@todo: describe what it means to be a “student” and what the student’s rights are

#### How to join a group for my class?

@todo: How to join a specific group

#### Which assignments do I have to solve?

@todo: Where the student can find the list of the assignments he is expected to solve, what is the first and second deadline.

#### Where can I see details of my classes’ group?

@todo: Where can the user see a group’s description and details, what information is available.

#### How to submit a solution of an assignment?

@todo: How does a student submit his solution through the web app

#### Where are the results of my solutions?

@todo: When the results are ready, what the results mean and what to do about them when the user is convinced that his solution is correct although the results say otherwise

#### How can I discuss my solution with my teacher/group’s supervisor directly through the web application?

@todo: Describe the comments thread behavior (public/private comments), who else can see the comments, how notifications work (*not implemented yet*!).

### Scenario \#3: User is a supervisor of a group

@todo: describe what it means to be a “supervisor” of a group and what the supervisor’s rights are

#### How do I become a supervisor of a group?

@todo: How does a user become a supervisor of a group?

#### How to add or remove a student to/from my group?

@todo: How to add a specific student to a given group

#### How do I add another supervisor to my group?

@todo: who can add another supervisor, what would be the rights of the second supervisor

#### How do I create a subgroup of my group?

@todo: What it means to create a subgroup and how to do it.
#### How do I assign an exercise to my students?

@todo: Describe how to access the database of the exercises and what are the possibilities of assignment setup – availability, deadlines, points, score configuration, limits

#### How do I configure the limits of an assignment and how to choose appropriate limits?

@todo: Describe the form and explain the concept of reference solutions. How to evaluate the reference solutions for the exercise right now (to get the up-to-date information).

#### How can I assign some exercises only to some students of the group?

@todo: Describe how to achieve this using subgroups

#### How can I see my students’ solutions?

@todo: Describe where all the students’ solutions for a given assignment can be found, where to look for all solutions of a given student, how to see the results of a specific student’s solution evaluation.

#### Can I assign points to my students’ solutions manually instead of depending on automatic scoring?

@todo: If and how to change the score of a solution – assignment settings, setting points, bonus points, accepting a solution (*not implemented yet!*). Describe how the student and supervisor will still be able to see the percentage received from the automatic scoring, but the awarded points will be overridden.

#### How can I discuss a student’s solution with him/her directly through the web application?

@todo: Describe the comments thread behavior (public/private comments), who else can see the comments -- same as from the student perspective

### Writing job configuration

To run and evaluate an exercise, the backend needs to know the steps of how to do that. These steps differ for each environment (operating system, programming language, etc.), so each environment needs its own separate configuration. The backend works with a powerful, but quite low-level description of simple connected tasks written in YAML syntax.
More about the syntax and a general overview of tasks can be found on a [separate page](https://github.com/ReCodEx/wiki/wiki/Assignments). One of the planned features was a user-friendly configuration editor, but due to the tight deadline and the team composition it did not make it into the first release. However, writing the configuration in the basic format will always be available and allows users to use the full expressive power of the system.

This section walks through the creation of a job configuration for a _hello world_ exercise. The goal is to compile the file _source.c_ and check whether it prints `Hello World!` to the standard output. This is the only test case; let us call it **A**.

The problem can be split into several tasks:

- compile _source.c_ into _helloworld_ with `/usr/bin/gcc`
- run _helloworld_ and save its standard output into _out.txt_
- fetch the predefined output (suppose it is already uploaded to the fileserver) with hash `a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b` into _reference.txt_
- compare _out.txt_ and _reference.txt_ with `/usr/bin/diff`

The absolute paths of the tools can be obtained from the system administrator. However, `/usr/bin/gcc` is the location where the GCC binary is available on almost every system, so the locations of some tools can be (professionally) guessed.

First, write the header of the job into the configuration file.

```{.yml}
submission:
    job-id: hello-world-job
    hw-groups:
        - group1
```

Basically it means that the job _hello-world-job_ needs to be run on workers that belong to the `group1` hardware group. Reference files are downloaded from the default location configured in the API (such as `http://localhost:9999/exercises`) unless stated explicitly otherwise. The job execution log will not be saved to the result archive.

Next, the tasks have to be constructed under the _tasks_ section. In this demo job, every task depends only on the previous one. The first task has the input file _source.c_ (submitted by the user) already available in the working directory, so it just calls GCC.
Compilation is run in the sandbox as any other external program and should have relaxed time and memory limits. In this scenario, worker defaults are used. If the compilation fails, the whole job is immediately terminated (because the _fatal-failure_ bit is set). Because the _bound-directories_ option in the sandbox limits section is mostly shared between all tasks, it can be set in the worker configuration instead of the job configuration (suppose this for the following tasks). For the configuration of the workers please contact your administrator.

```{.yml}
- task-id: "compilation"
  type: "initiation"
  fatal-failure: true
  cmd:
      bin: "/usr/bin/gcc"
      args:
          - "source.c"
          - "-o"
          - "helloworld"
  sandbox:
      name: "isolate"
      limits:
          - hw-group-id: group1
            chdir: ${EVAL_DIR}
            bound-directories:
                - src: ${SOURCE_DIR}
                  dst: ${EVAL_DIR}
                  mode: RW
```

The compiled program is executed with a time and memory limit set and its standard output is redirected to a file. This task depends on the _compilation_ task, because the program cannot be executed without being compiled first. It is important to mark this task with the _execution_ type, so that exceeded limits will be reported in the frontend.

Time and memory limits set directly for a task have a higher priority than the worker defaults. One important constraint is that these limits cannot exceed the limits set by the workers. The worker defaults are present as a safety measure so that a malformed job configuration cannot block the worker forever. The worker default limits should be reasonably high, like a gigabyte of memory and several hours of execution time. For the exact numbers please contact your administrator.

It is important to know that if the output of a program (both standard and error) is redirected to a file, the sandbox disk quotas apply to that file, as well as to the files created directly by the program. In case the outputs are ignored, they are redirected to `/dev/null`, which means there is no limit on the output length (as long as the printing fits in the time limit).
```{.yml}
- task-id: "execution_1"
  test-id: "A"
  type: "execution"
  dependencies:
      - compilation
  cmd:
      bin: "helloworld"
  sandbox:
      name: "isolate"
      stdout: ${EVAL_DIR}/out.txt
      limits:
          - hw-group-id: group1
            chdir: ${EVAL_DIR}
            time: 0.5
            memory: 8192
```

Next, the sample solution output is fetched from the fileserver. The base URL of the fileserver is known from the header of the job configuration, so only the name of the required file (its `sha1sum` in our case) is necessary.

```{.yml}
- task-id: "fetch_solution_1"
  test-id: "A"
  dependencies:
      - execution_1
  cmd:
      bin: "fetch"
      args:
          - "a0b65939670bc2c010f4d5d6a0b3e4e4590fb92b"
          - "${SOURCE_DIR}/reference.txt"
```

Comparison of the results is quite straightforward. It is important to set the task type to _evaluation_, so that the return code is set to 0 if the program is correct and to 1 otherwise. We do not set our own limits, so the default limits are used.

```{.yml}
- task-id: "judge_1"
  test-id: "A"
  type: "evaluation"
  dependencies:
      - fetch_solution_1
  cmd:
      bin: "/usr/bin/diff"
      args:
          - "out.txt"
          - "reference.txt"
  sandbox:
      name: "isolate"
      limits:
          - hw-group-id: group1
            chdir: ${EVAL_DIR}
```