Fixes and introduction copied here

8 years ago · 8a81756da4
parent 3f7f7209bf
commit 8a81756da4
1 changed files with 169 additions and 12 deletions
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -1,13 +1,162 @@
 Introduction
 ============
-@todo: Describe who we are and what is the nature of the project.
+Generally, there are a lot of different ways and opinions on how to teach people
 something new. However, most people agree that a hands-on experience is one of
 the best ways to make the human brain remember a new skill. Learning must be
 entertaining and interactive, with fast and frequent feedback. Some kinds of
 knowledge are more suitable for this practical type of learning than others, and
 fortunately, programming is one of them.
 The university education system is one of the areas where this knowledge can be
 applied. In computer programming, there are several requirements such as the
 code being syntactically correct, efficient and easy to read, maintain and
 extend. Correctness and efficiency can be tested automatically to help teachers
 save time for their research, but checking for bad design, habits and mistakes
 is really hard to automate and requires manpower.
 Checking programs written by students takes a lot of time and requires a lot of
 mechanical, repetitive work. The first idea of an automatic evaluation system
 came from Stanford University professors in 1965. They implemented a system
 which evaluated code in Algol submitted on punch cards. In the following years,
 many similar products were written.
 There are two basic ways of automatically evaluating code -- statically (check
 the code without running it; safe, but not very precise) or dynamically (run the
 code on test inputs and check the outputs against reference ones; needs
 sandboxing, but provides good real-world experience).
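
The dynamic approach can be illustrated with a minimal sketch. It is not the ReCodEx implementation: the function name, the verdict strings and the sample "solution" are assumptions made for illustration, and the sandboxing that real evaluation needs is deliberately left out.

```python
import subprocess
import sys


def evaluate_dynamically(command, test_input, reference_output, time_limit=2.0):
    """Run `command` on one test input and judge its standard output."""
    try:
        result = subprocess.run(
            command,
            input=test_input,
            capture_output=True,
            text=True,
            timeout=time_limit,
        )
    except subprocess.TimeoutExpired:
        return "TIME_LIMIT_EXCEEDED"
    if result.returncode != 0:
        return "RUNTIME_ERROR"
    # Compare token by token so that trailing whitespace does not matter.
    if result.stdout.split() == reference_output.split():
        return "OK"
    return "WRONG_OUTPUT"


# Hypothetical "student solution": reads two integers and prints their sum.
solution = [sys.executable, "-c", "a, b = map(int, input().split()); print(a + b)"]
verdict = evaluate_dynamically(solution, "2 3\n", "5\n")
```

Because the submitted program runs unrestricted here, such a loop is only safe for trusted code; this is exactly the gap a sandbox is meant to close.
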
 This project focuses on the machine-controlled part of source code evaluation.
 First, the problems of the present software at our university were discussed and
 similar projects at other educational institutions were examined. With the
 knowledge acquired from such projects in production, we set up goals for the new
 evaluation system, designed the architecture and implemented a fully operational
 solution. The system is now ready for production testing at our university.
 Analysis
 --------
-@todo: Describe how the idea of ReCodEx originated and how we came up
+### Current solution at MFF UK
-with the stuff we implemented.
+
 The ideas presented above are not completely new. A group of students already
 implemented an evaluation solution for student homework in 2006.
 Its name is [CodEx - The Code Examiner](http://codex.ms.mff.cuni.cz/project/) 
 and it has been used with some improvements since then. The original plan was to 
 use the system only for basic programming courses, but there is demand for 
 adapting it for many different subjects.
 CodEx is based on dynamic analysis. It features a web-based interface, where 
 supervisors assign exercises to their students and the students have a time 
 window to submit the solution. Each solution is compiled and run in a sandbox
 (MO-Eval). The checked metrics are correctness of the output and compliance
 with time and memory limits. It supports programs written in C, C++, C#, Java,
 Pascal, Python
 and Haskell.
 The current system is old, but robust. There were no major security incidents during its production usage. However, from today's perspective, there are several drawbacks. The main ones are:
 - **web interface** -- The web interface is simple and fully functional. However,
  rapid development in web technologies opens new possibilities for how a web
  interface can be built.
 - **web api** -- CodEx offers a very limited XML API based on outdated 
  technologies that is not sufficient for users who would like to create custom 
  interfaces such as a command line tool or mobile application.
 - **sandboxing** -- The MO-Eval sandbox is based on the principle of monitoring
  system calls and blocking the bad ones. This can be easily done for
  single-threaded applications, but proves difficult with multi-threaded ones.
  Nowadays, parallelism is a very important area of computing, so there is a
  requirement to test multi-threaded applications too.
 - **instances** -- Different CodEx usage scenarios require separate
  instances (Programming I and II, Java, C#, etc.). This configuration is not 
  user friendly (students have to register in each instance separately) and 
  burdens administrators with unnecessary work. CodEx architecture does not 
  allow sharing hardware between instances, which results in an inefficient use 
  of hardware for evaluation.
 - **task extensibility** -- There is a need to test and evaluate complicated 
  programs for classes such as Parallel programming or Compiler principles, 
  which have a more complex evaluation chain than the simple
  compilation/execution/evaluation provided by CodEx.
 After considering all these facts, it is clear that CodEx cannot be used
 anymore. The project is too old to simply be maintained and extended for modern
 technologies. Thus, it needs to be completely rewritten or another solution must
 be found.
 ### Related projects
 First of all, several code evaluation projects were found and examined. This is not a complete list of such evaluators, just a few projects which are used these days and can serve as an inspiration for our project.
 #### Progtest
 [Progtest](https://progtest.fit.cvut.cz/) is a private project from FIT ČVUT in
 Prague. As far as we know, it is used for C/C++, Bash programming and
 knowledge-based quizzes. There are several bonus points and penalties and also a
 few hints about what is failing in the submitted solution. It is very strict on
 source code quality, using for example the `-pedantic` option of GCC, Valgrind
 for memory leaks, or array boundary checks via the `mudflap` library.
 #### Codility
 [Codility](https://codility.com/) is a web-based solution primarily targeted at company recruiters. It is a commercial SaaS product supporting 16 programming languages. The [UI](http://1.bp.blogspot.com/-_isqWtuEvvY/U8_SbkUMP-I/AAAAAAAAAL0/Hup_amNYU2s/s1600/cui.png) of Codility is [open source](https://github.com/Codility/cui); the rest of the source code is not available. One interesting feature is the 'task timeline' -- the captured progress of writing code for each user.
 #### CMS
 [CMS](http://cms-dev.github.io/index.html) is an open source distributed system
 for running and organizing programming contests. It is written in Python and
 contains several modules. CMS supports C/C++, Pascal, Python, PHP and Java.
 PostgreSQL is a single point of failure, as all modules heavily depend on the
 database connection. Task evaluation can only be a three-step pipeline --
 compilation, execution, evaluation. Execution is performed in
 [Isolate](https://github.com/ioi/isolate), a sandbox written by the consultant of
 our project, Mgr. Martin Mareš, Ph.D.
 #### MOE
 [MOE](http://www.ucw.cz/moe/) is a grading system written in Shell scripts, C 
 and Python. It does not provide a default GUI; all actions have to be performed
 from the command line. The system does not evaluate submissions in real time;
 results are computed in batch mode after the exercise deadline, using Isolate
 for sandboxing. Parts of MOE are used in other systems like CodEx or CMS, but 
 the system is generally obsolete.
 #### Kattis
 [Kattis](http://www.kattis.com/) is another SaaS solution. It provides a clean 
 and functional web UI, but the rest of the application is too simple. A nice 
 feature is the usage of a [standardized 
 format](http://www.problemarchive.org/wiki/index.php/Problem_Format) for 
 exercises. Kattis is primarily used by programming contest organizers, company
 recruiters and also some universities.
 ### ReCodEx goals
 Based on the research above, we set up several goals which the new system should
 meet. They mostly reflect the drawbacks of the current version of CodEx. No
 existing tool fits our needs; for example, no examined project provides a complex
 execution/evaluation pipeline to support the needs of courses like Compiler
 principles. Modifying CodEx is also not an option -- the required scope of a new
 solution is too big. To sum up, a new evaluation system has to be written, with
 only small parts of code reused from CodEx (for example judges).
 The new project is **ReCodEx -- ReCodEx Code Examiner**. The name should point
 to CodEx, the previous evaluation solution, but also reflect the new approach to
 solving its issues. **Re** as part of the name means redesigned, rewritten,
 renewed or
 restarted.
 The official assignment of the project is available on the [web of the software
 project committee](http://www.ksi.mff.cuni.cz/sw-projekty/zadani/recodex.pdf)
 (only in Czech). The most notable features are the following:
 - a modern HTML5 web frontend written in JavaScript using a suitable framework
 - a REST API implemented in PHP, communicating with the database, backend and
  file server
 - a backend implemented as a distributed system on top of a message queue
  framework (ZeroMQ) with a master-worker architecture
 - a worker with basic support for the Windows environment (without a sandbox, as
  no suitable general-purpose tool is available yet)
 - an evaluation procedure configured in a YAML file, composed of small tasks
  connected into an arbitrary directed acyclic graph
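
Connecting tasks into a directed acyclic graph means that a valid execution order can be derived from the dependencies alone. The following is a minimal sketch of that idea using Python's standard `graphlib` module (Python 3.9 or newer); the task names and the `execution_order` helper are hypothetical and do not describe how the ReCodEx worker actually schedules tasks.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies):
    """Return task ids in an order that respects every dependency.

    `dependencies` maps a task id to the ids of the tasks it depends on.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as error:
        # The second element of `args` holds the detected cycle.
        raise ValueError(f"tasks do not form an acyclic graph: {error.args[1]}") from error


# Hypothetical job: two programs are compiled independently, both are run,
# and a judge compares their outputs at the end.
job = {
    "compile_solution": [],
    "compile_reference": [],
    "run_solution": ["compile_solution"],
    "run_reference": ["compile_reference"],
    "judge": ["run_solution", "run_reference"],
}
order = execution_order(job)
```

Any order returned this way runs each compilation before the matching execution and the judge last; a configuration with circular dependencies is reported as an error instead of being executed.
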
 Structure of the project
 ------------------------
@@ -669,14 +818,15 @@ First, write header of the job to the configuration file.
 ```{.yml}
 submission:
    job-id: hello-world-job
    file-collector: http://localhost:9999/exercises
    hw-groups:
        - group1
 ```
-Basically it means, that the job _hello-world-job_ is for C language and needs
+Basically, it means that the job _hello-world-job_ needs to be run on workers
-to be run on workers with capabilities of _group1_ group. Reference files are
+with the capabilities of the _group1_ group. Reference files are downloaded from the
-downloaded from http://localhost:9999/exercises.
+default location configured in the API (probably `http://localhost:9999/exercises`)
 unless explicitly stated otherwise. The job execution log will not be saved to
 the result archive.
 Next, the tasks have to be constructed under the _tasks_ section. In this demo
 job, every task depends only on the previous one. The first task has input file
@@ -716,11 +866,18 @@ the program cannot be executed without being compiled first. It is important to
 mark this task with the _execution_ type, so exceeded limits will be reported in
 the frontend.
-@todo describe overriding of default (per worker) limits and that we cannot 
+Time and memory limits set directly for a task have higher priority than worker
-relax them
+defaults. One important constraint is that these limits cannot exceed the limits
-@todo state clearly that the limits on stdout, stderr and file IO are shared 
+set by workers. Worker defaults serve as a safeguard against a wrong job
-(but if outputs are ignored (not redirected to a file), the program can use them 
+configuration blocking the whole worker forever. Worker
-at will)
+default limits should be set reasonably high, such as a gigabyte of memory and a
 couple of hours of execution time. For exact numbers please contact your administrator.
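
The precedence rule can be sketched as follows. This is an illustration under stated assumptions, not the worker's code: the limit names, the units and the default values are hypothetical, and the text above does not say how a worker reacts to a task that asks for more than the worker allows, so the sketch simply rejects such a configuration.

```python
# Hypothetical worker defaults in the spirit of "a gigabyte of memory and a
# couple of hours"; real values are chosen by the administrator.
WORKER_DEFAULTS = {"time": 2 * 60 * 60, "memory": 1024 * 1024}  # seconds, KiB


def effective_limits(task_limits, worker_defaults=WORKER_DEFAULTS):
    """Merge per-task limits with worker defaults.

    A limit set on the task overrides the worker default, but it may never be
    higher than that default. Rejecting excessive values is an assumption.
    """
    limits = dict(worker_defaults)
    for name, value in task_limits.items():
        if name not in worker_defaults:
            raise KeyError(f"unknown limit: {name}")
        if value > worker_defaults[name]:
            raise ValueError(
                f"{name} limit {value} exceeds worker limit {worker_defaults[name]}"
            )
        limits[name] = value
    return limits


# A task that only tightens the time limit keeps the default memory limit.
limits = effective_limits({"time": 5})
```

With these defaults, a task asking for `{"memory": 4 * 1024 * 1024}` would be refused, which matches the constraint that a task cannot relax the limits of the worker it runs on.
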
 It is worth recalling here that if the output of a program (both standard and
 error) is redirected to a file, the sandbox disk quotas hold for these files as
 well as for files created directly by the program. But if the outputs are
 ignored, they are redirected to the `/dev/null` file, where an arbitrary amount
 of data can be written.
 ```{.yml}
 - task-id: "execution_1"