Overall Architecture

[Figure: Overall Architecture]

ReCodEx is designed to be very modular. The WebApp and the File Server together form one instance of the application. They contain almost all of the application logic, including user management and authentication, storing and versioning of tasks, and counting and assigning points to users. One instance of the application can be connected to one or more Workers, and one Worker can be connected to several instances of the WebApp. A Worker is connected to the WebApp through a message queue.

Worker

[Figure: Worker Architecture]

The Worker's main role is to securely compile, run, and evaluate a given submission against the model solutions provided by the author of each task. It is logically divided into three parts:

  • Message Frontend communicates with the WebApp over a ZeroMQ message queue. It receives new submissions, drives the evaluation through the Work API, and reports progress back.
  • Worker Core performs all evaluation steps and is responsible for their security. The Isolate sandbox is used for this.
  • File Server Frontend provides access to files on the File Server via the File API. The File Server stores the test inputs and corresponding outputs for each task, along with other required files. Files can be uploaded as well.

Default worker configuration

The Worker should have a default configuration, which is applied to the Worker itself and may also be used in jobs (implicitly when a value is missing, or explicitly through special variables). The defaults should be hardcoded and can be overridden by an explicitly specified configuration file. The configuration format is YAML, as in the job configuration.

---  # only one document with all configuration needed
worker-id: 1
broker-uri: tcp://localhost:9657
headers:
    env:
        - c
        - python
    threads: 2
    hwgroup: "group1"
working-directory: /tmp/working_dir  # if not set then default is: __TEMP__/isoeval
file-managers:
    - hostname: "http://localhost:80"  # port is optional
      username: "654321"  # can be ignored in specific modules
      password: "123456"  # can be ignored in specific modules
      cache:  # only in case that there is cache module
          cache-dir: "/tmp/isoeval/cache"
    - hostname: http://localhost:4242
      username: 123456
      password: 654321
      cache:
          cache-dir: /tmp/isoeval/cache
logger:
    file: "/tmp/isoeval/log/worker"  # w/o suffix - actual names will be worker.log, worker.1.log, ...
    level: "debug"  # level of logging - one of "debug", "info", "notice", "warn", "err", "critical", "alert", "emerg", "off"
    max-size: 1048576  # 1 MB; max size of file before log rotation
    rotations: 3  # number of rotations kept
limits:
    time: 5  # seconds
    wall-time: 6  # seconds
    extra-time: 2  # seconds
    stack-size: 50000  # KB
    memory: 50000  # KB
    parallel: false  # time and memory limits are merged
    disk-blocks: 50
    disk-inodes: 5
    bound-directories:
        /tmp/isoeval/1: /eval
...
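
The override behaviour described above (hardcoded defaults, replaced by values from a configuration file) could be sketched as a recursive merge of two mappings. This is only an illustration under stated assumptions: the function name merge_config, the DEFAULT_CONFIG subset, and the file_config values are hypothetical, and parsing the YAML file itself is left out.

```python
import copy

# Hypothetical hardcoded defaults; the keys mirror a subset of the example above.
DEFAULT_CONFIG = {
    "working-directory": "/tmp/working_dir",
    "logger": {"level": "debug", "rotations": 3},
    "limits": {"time": 5, "memory": 50000},
}


def merge_config(defaults, overrides):
    """Return a new dict where values from overrides replace the defaults.

    Nested mappings are merged key by key; any other value is replaced whole.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Values as they might look after parsing a configuration file (assumed).
file_config = {"logger": {"level": "info"}, "limits": {"time": 10}}
effective = merge_config(DEFAULT_CONFIG, file_config)
```

Keys that the file does not mention keep their default values, and the defaults themselves are never modified.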

Internal Worker architecture

The picture below shows the overall internal architecture of the Worker, including its classes with their private variables and public functions. A vector version of this picture is available here.

[Figure: Internal Worker architecture]

File Server

[Figure: File Server Infrastructure]

The File Server stores data that should be kept outside of the WebApp's database, both because storing files in a database is inefficient and because the workers need to access the files in the simplest possible way. It should meet the following requirements:

  • store files without duplicates
  • keep consistent state with main database
  • serve files to workers on demand
  • allow versioning of tasks, with the ability to revert to an earlier version

To meet these requirements, Storage and Database must be set up as described below.

Storage

Storage means disk space with a commonly used filesystem. We'll use ext4, but other filesystems should work too. The storage file structure is:

.
├── submits
│   └── user_id
│       └── advanced_dot_net_1
│           └── submit_id
│               ├── eval.yml
│               └── source.cs
├── submit_archives
│   └── submit_id.tar.gz
├── tasks
│   ├── a
│   │   ├── a014ed2abb56371bfaf2b4298a85d5dfb56509ed
│   │   └── a5edbd8b12e670ed1e3110d6c0524000cd4c3c7a
│   └── b
│       └── b1696358b8540923eb79b68f95c0f94c13a83fa7
└── temp
    └── 1795184136b8bdddabe50453cc2cc2d46f0f7c5e

  • submits keeps all files submitted by users to ReCodEx. The user_id and advanced_dot_net_1 subdirectories group submits by user and by the course they were submitted for. This structure is easy to maintain as users are added and deleted.
  • submit_archives contains the student submissions in compressed archives so that workers can download them easily.
  • tasks contains supplementary files (such as test inputs or helper programs) for all existing tasks in ReCodEx. To avoid too many files in one directory, files are split into subfolders by the first character of their name.
  • temp is dedicated to temporarily storing program outputs at teachers' request. This directory is erased daily by a cron job.

Database

For user-friendly access to tasks and their modification, the following information should be stored in the database:

  • the list of tasks with their newest version number
  • for every task and version, the list of files used (their hashed names)
  • for every hashed name, one human-readable filename
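
To make the three pieces of information concrete, here is a minimal in-memory sketch. It is not the actual database schema: the class TaskDatabase, its method names, the task name task_1, and the readable filenames are all hypothetical, and the revert operation is assumed to work by publishing a new version that reuses an older file list.

```python
class TaskDatabase:
    """In-memory stand-in for the information listed above (illustrative only)."""

    def __init__(self):
        self.newest_version = {}  # task name -> newest version number
        self.version_files = {}   # (task name, version) -> list of hashed names
        self.readable_name = {}   # hashed name -> human-readable filename

    def add_version(self, task, files):
        """Record a new version; files maps hashed names to readable names."""
        version = self.newest_version.get(task, 0) + 1
        self.newest_version[task] = version
        self.version_files[(task, version)] = sorted(files)
        self.readable_name.update(files)
        return version

    def revert(self, task, version):
        """Assumed revert strategy: republish an older file list as a new version."""
        hashes = self.version_files[(task, version)]
        return self.add_version(task, {h: self.readable_name[h] for h in hashes})

    def files_of(self, task, version=None):
        """Return hashed name -> readable name for a version (newest by default)."""
        if version is None:
            version = self.newest_version[task]
        return {h: self.readable_name[h] for h in self.version_files[(task, version)]}


INPUT_HASH = "a014ed2abb56371bfaf2b4298a85d5dfb56509ed"
HELPER_HASH = "b1696358b8540923eb79b68f95c0f94c13a83fa7"

db = TaskDatabase()
v1 = db.add_version("task_1", {INPUT_HASH: "input.txt"})
v2 = db.add_version("task_1", {INPUT_HASH: "input.txt", HELPER_HASH: "helper.cs"})
v3 = db.revert("task_1", v1)  # version 3 has the same files as version 1
```

Because a revert only records another list of hashes, no file content has to be copied or deleted in Storage.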

Conclusion

Files are stored internally under their SHA-1 hashes, so it's easy to implement versioning and to get rid of files with duplicate content (multiple files can have the same content, which is stored only once). The Worker also refers to files by their hashes, which makes local caching straightforward without worrying about the actual version number of a given file. The Database, on the other hand, stores the human-readable names, so that the files are presented in a friendly way to users (teachers) in the WebApp.
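
Storing files by content hash can be sketched in a few lines. This is an illustration rather than the File Server's implementation: the function store_file is hypothetical, and it assumes the tasks layout shown in the Storage section (a subfolder named after the first character of the hash).

```python
import hashlib
import tempfile
from pathlib import Path


def store_file(storage_root, content):
    """Store content under tasks/<first hash char>/<sha1 hash> and return the hash.

    Content that is already present is not written again.
    """
    digest = hashlib.sha1(content).hexdigest()
    target = Path(storage_root) / "tasks" / digest[0] / digest
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
    return digest


with tempfile.TemporaryDirectory() as root:
    first = store_file(root, b"1 2 3\n")
    second = store_file(root, b"1 2 3\n")  # same content, same hash, no new file
    stored = [p for p in Path(root).rglob("*") if p.is_file()]
    stored_count = len(stored)
    stored_parent = stored[0].parent.name
```

Two uploads with identical content yield the same hash and a single file on disk, which is what makes deduplication and worker-side caching simple.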

Frontend - broker communication

The communication between the frontend and the workers is mediated by a broker that passes jobs to workers capable of processing them.

Assignment evaluation request

The frontend must send a multipart message that contains the following frames:

  • The eval command
  • The job id (in ASCII representation -- we avoid endianness issues and also support alphabetic ids)
  • A frame for each header (e.g. hwgroup=group_1)
  • A URL of the archive that contains the submitted files and the isoeval configuration
  • A URL where the worker should store the result of the evaluation

If the broker is capable of routing the request to a worker, it responds with accept. Otherwise (for example when the requirements specified by the headers cannot be met), it responds with reject.
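
The frame layout and the accept/reject decision could look roughly like the sketch below. It builds the frames as plain byte strings and does not use a ZeroMQ library. The function names, the example URLs, and the matching rule (a requested value must equal the worker's value, or be contained in it when the worker offers a list) are assumptions made for illustration, not the broker's actual behaviour.

```python
def build_eval_request(job_id, headers, archive_url, result_url):
    """Build the frames of an evaluation request in the order listed above."""
    frames = [b"eval", str(job_id).encode("ascii")]
    frames += [f"{key}={value}".encode("ascii") for key, value in headers.items()]
    frames += [archive_url.encode("ascii"), result_url.encode("ascii")]
    return frames


def worker_matches(worker_headers, request_headers):
    """Assumed rule: every requested header must be satisfied by the worker."""
    for key, wanted in request_headers.items():
        if key not in worker_headers:
            return False
        offered = worker_headers[key]
        if isinstance(offered, list):
            if wanted not in offered:
                return False
        elif str(offered) != str(wanted):
            return False
    return True


def route(workers, request_headers):
    """Return ("accept", worker_id) for the first suitable worker, else ("reject", None)."""
    for worker_id, worker_headers in workers.items():
        if worker_matches(worker_headers, request_headers):
            return "accept", worker_id
    return "reject", None


# Hypothetical worker capabilities, modelled on the headers in the worker configuration.
workers = {"worker1": {"env": ["c", "python"], "threads": 2, "hwgroup": "group1"}}

request_headers = {"hwgroup": "group1", "env": "c"}
frames = build_eval_request(
    "job42",
    request_headers,
    "http://localhost:80/submit_archives/job42.tar.gz",  # hypothetical URL
    "http://localhost:80/results/job42",  # hypothetical URL
)
accepted = route(workers, request_headers)
rejected = route(workers, {"hwgroup": "group2"})
```

With a ZeroMQ binding, the list of frames would typically be passed to a multipart send call as is.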

Note that we will need to store the job ID and the assignment configuration somewhere close to the submitted files so it's possible to check how a submission was evaluated. The job ID will likely be a part of the submission's path. The configuration could be linked there under some well-known name.

Notifying the frontend about evaluation progress

The script that requested the evaluation will have exited by the time a worker processes the request. This issue remains to be resolved.

Broker - worker communication

When a worker is started, it registers itself with the broker by sending the init command followed by headers that describe its capabilities (such as the number of threads it can run simultaneously, its hardware group, and the languages it can work with).

Whenever the broker receives an assignment suitable for the worker, it just forwards the evaluation request message it originally received from the frontend. The worker has to:

  • Download the archive containing the submission and an isoeval configuration file
  • Download any supplementary files based on the configuration file, such as test inputs or helper programs (This can be done on demand, using a fetch command in the assignment configuration)
  • Download the source code of the student's submission
  • Evaluate the submission according to the assignment's configuration
  • Upload the results of the evaluation to the file server
  • Notify the broker that the evaluation is finished

Thanks to this message structure, it's possible to cache the configuration file and only download the student's submission when the same assignment is evaluated repeatedly for different students (a common case for homework and classroom assignments).

After finishing the evaluation, the worker notifies the broker by sending:

  • The done command
  • The job id

This allows the broker to distribute messages reliably: if a worker fails to process a request (it does not respond within a time limit), the request can be sent to another worker.
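
The timeout-based redistribution could be tracked with a small bookkeeping structure such as the one sketched here. The class JobTracker, its method names, and the 30-second limit are hypothetical, and the current time is passed in explicitly so that the example stays deterministic. A real broker would read a clock and resend the original request frames.

```python
class JobTracker:
    """Remember which worker holds which job and when its time limit runs out."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.pending = {}  # job id -> (worker id, deadline)

    def assign(self, job_id, worker_id, now):
        """Record that a request was forwarded to a worker at time now."""
        self.pending[job_id] = (worker_id, now + self.timeout)

    def done(self, job_id):
        """Handle the done command: the job no longer needs tracking."""
        self.pending.pop(job_id, None)

    def expired(self, now):
        """Return the ids of jobs whose worker did not respond in time."""
        return sorted(
            job_id
            for job_id, (_, deadline) in self.pending.items()
            if deadline <= now
        )


tracker = JobTracker(timeout=30)
tracker.assign("job1", "worker1", now=0)
tracker.assign("job2", "worker2", now=10)
tracker.done("job1")  # worker1 reported done in time

late = tracker.expired(now=45)  # job2 was due at 40
for job_id in late:
    tracker.assign(job_id, "worker1", now=45)  # hand the job to another worker
still_late = tracker.expired(now=50)
```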