Overall Architecture

Description

ReCodEx is designed to be very modular and configurable. One possible configuration is sketched in the following picture: two separate frontend instances with distinct databases share a common backend. Such a setup may suit MFF UK, with one instance for the basic programming course and one for the KSP competition. Note that the connections between components in the picture are not fully accurate.

Overall Architecture

The components are:

  • Web app - the main part of the whole project from the user's point of view. It provides the user interface and is the only part that interacts with the outside world directly.
  • Web API - contains almost all logic of the app, including user management and authentication, storing and versioning files (with the help of the File server), and counting and assigning points to users. Advanced users may connect to the API directly or create custom frontends.
  • Broker - an essential part of the whole architecture. It maintains a list of available Workers, receives submissions from the Web API and routes them further, and reports evaluation progress back to the Web app.
  • Worker - securely runs each received job and evaluates its results.
  • Monitor - resends evaluation progress messages to the Web app so that they can be presented to users.

Communication

Detailed communication inside the ReCodEx project is captured in the following image and described in the sections below. Red connections go through ZeroMQ sockets, blue through WebSockets and green through HTTP(S). All ZeroMQ messages are sent as multipart messages with one string (command, option) per part and no empty frames (unless explicitly specified otherwise).

Communication Img

Broker - Worker communication

The broker acts as a server when communicating with workers. The listening IP address and port are configurable; the transport protocol is TCP. The worker socket is of the DEALER type, the broker socket of the ROUTER type. Because of that, the very first part of every (multipart) message from the broker to a worker must be the target worker's socket identity (which the broker saves when it receives the worker's init command).

Commands from broker to worker:

  • eval - evaluate a job. Requires 3 message frames:
    • job_id - identifier of the job (in ASCII representation -- we avoid endianness issues and also support alphabetic ids)
    • job_url - URL of the archive with job configuration and submitted source code
    • result_url - URL where the results should be stored after evaluation
  • intro - introduce yourself to the broker (with the init command). This is required when the broker loses track of the worker that sent a command, for example when one of the communicating sides shut down and restarted without the other side noticing.
  • pong - reply to ping command, no arguments

Commands from worker to broker:

  • init - introduce self to the broker. Useful on startup or after reestablishing lost connection. Requires at least 2 arguments:

    • hwgroup - hardware group of this worker
    • header - additional header describing worker capabilities. Format must be header_name=value, every header shall be in a separate message frame. There is no limit on number of headers.

    There is also an optional third argument - additional information. If present, it should be separated from the headers with an empty frame. The format is the same as headers. Supported keys for additional information are:

    • description - a human readable description of the worker for administrators (it will show up in broker logs)
    • current_job - an identifier of the job the worker is currently processing. This is useful when the worker is re-establishing a connection to the broker and the broker needs to know that the worker won't accept a new job.
  • done - notifying of finished job. Contains following message frames:

    • job_id - identifier of finished job
    • result - response result, possible values are:
      • OK - evaluation finished successfully
      • FAILED - job failed and cannot be reassigned to another worker (e.g. due to error in configuration)
      • INTERNAL_ERROR - job failed due to internal worker error, but another worker might be able to process it (e.g. downloading a file failed)
    • message - a human readable error message
  • progress - notice about current evaluation progress. Contains following message frames:

    • job_id - identifier of current job
    • state - what is happening now.
      • DOWNLOADED - submission successfully fetched from fileserver
      • FAILED - something bad happened and job was not executed at all
      • UPLOADED - results are uploaded to fileserver
      • STARTED - evaluation of tasks started
      • ENDED - evaluation of tasks is finished
      • ABORTED - evaluation of job encountered internal error, job will be rescheduled to another worker
      • FINISHED - whole execution is finished and worker ready for another job execution
      • TASK - task state changed - see below
    • task_id - only present for "TASK" state - identifier of task in current job
    • task_state - only present for "TASK" state - result of task evaluation. One of:
      • COMPLETED - task was successfully executed without any error, subsequent task will be executed
      • FAILED - task ended up with some error, subsequent task will be skipped
      • SKIPPED - some of the previous dependencies failed to execute, so this task won't be executed at all
  • ping - tell broker I'm alive, no arguments
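
To illustrate the frame layout described above, the following Python sketch builds the frames of init and done messages as lists of byte strings. The helper names and the example header (env=c) are hypothetical and not part of ReCodEx; with a ZeroMQ binding such as pyzmq, a list like this could be passed to the socket's send_multipart method.

```python
def build_init_frames(hwgroup, headers, info=None):
    """Frames of an `init` message: command, hwgroup, one frame per header,
    then optionally an empty frame followed by additional information."""
    frames = ["init", hwgroup]
    frames += [f"{name}={value}" for name, value in headers.items()]
    if info:
        frames.append("")  # an empty frame separates headers from additional info
        frames += [f"{key}={value}" for key, value in info.items()]
    return [frame.encode("utf-8") for frame in frames]


def build_done_frames(job_id, result, message=""):
    """Frames of a `done` message: command, job_id, result, message."""
    if result not in ("OK", "FAILED", "INTERNAL_ERROR"):
        raise ValueError(f"unknown result: {result}")
    return [b"done", job_id.encode("utf-8"), result.encode("utf-8"), message.encode("utf-8")]


# Hypothetical worker in hardware group "group1" announcing one capability header.
init_frames = build_init_frames("group1", {"env": "c"}, {"description": "worker 1"})
```

Each list element corresponds to one message frame, so the empty byte string stands for the empty separator frame.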

Heartbeating

It is important for the broker and workers to know if the other side is still working (and connected). This is achieved with a simple heartbeating protocol.

The protocol requires the workers to send a ping command regularly (the interval is configurable on both sides - future releases might let the worker send its ping interval with the init command). Upon receiving a ping command, the broker responds with pong.

Both sides keep track of missing heartbeating messages since the last one was received. When this number reaches a threshold (called maximum liveness), the other side is considered dead.

When the broker decides a worker died, it tries to reschedule its jobs to other workers.

If a worker thinks the broker is dead, it tries to reconnect with a bounded, exponentially increasing delay.

This protocol proved robust in real-world testing. The backend can therefore survive short-term connection problems. Also, the increasing delay between attempts means that the network is not flooded with messages when there are problems. We have experienced no issues since we started using this protocol.
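
The bookkeeping behind this protocol can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names are hypothetical, and the doubling factor of the reconnect delay is an assumption (the text only says the delay is bounded and grows exponentially).

```python
class LivenessTracker:
    """Counts missed heartbeats; the peer is considered dead at `max_liveness`."""

    def __init__(self, max_liveness):
        self.max_liveness = max_liveness
        self.missed = 0

    def heartbeat_received(self):
        # Any heartbeat from the other side resets the counter.
        self.missed = 0

    def heartbeat_missed(self):
        # Called when an expected heartbeat did not arrive in time.
        self.missed += 1
        return self.is_dead()

    def is_dead(self):
        return self.missed >= self.max_liveness


def reconnect_delays(initial, maximum):
    """Yield reconnect delays that double each time and never exceed `maximum`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, maximum)
```

A worker would advance the generator once per failed reconnection attempt and start again from the initial delay after a successful one.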

Worker - File Server communication

The worker communicates with the file server only from its execution thread (see picture above). The supported protocol is HTTP, optionally with SSL encryption (recommended; a free trusted DV certificate is available from the Let's Encrypt authority). If both the server and the libcurl version in use support it, HTTP/2 is also available. The file server should be set up to require basic HTTP authentication; the worker can send the corresponding credentials with each request.

Worker side

The worker can do two things: download a file and upload a file. Internally, the worker uses the libcurl C library with a very similar setup for both. In both cases it can verify the HTTPS certificate (on Linux against the system certificate list, on Windows against a list downloaded from the CURL website during installation), use basic HTTP authentication, offer HTTP/2 with fallback to HTTP/1.1, and fail on error (returned HTTP status code >= 400). The worker keeps a list of credentials for all available file servers in its config file.

  • download file - standard HTTP GET request to given URL expecting file content as response
  • upload file - standard HTTP PUT request to given URL with file data as body - same as command line tool curl with option --upload-file

File server side

The file server has its own internal directory structure where all the files are stored. It provides a simple REST API to get them or create new ones. The file server does not provide authentication or a secured connection by itself; it is supposed to run as a WSGI script inside a web server (like Apache) with proper configuration. Relevant commands for communication with workers:

  • GET /submission_archives/<id>.<ext> - gets an archive with submitted source code and corresponding configuration of this job evaluation
  • GET /tasks/<hash> - gets a file, common usage is for input files or reference result files
  • PUT /results/<id>.<ext> - upload archive with evaluation results under specified name (should be same id as name of submission archive). On successful upload returns JSON { "result": "OK" } as body of returned page.

If not specified otherwise, the zip format is used for archives. The symbol / in the API description is the root of the file server's domain. If the domain is, for example, fs.recodex.org with SSL support, getting an input file for one task could be a GET request to https://fs.recodex.org/tasks/8b31e12787bdae1b5766ebb8534b0adc10a1c34c.
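
The worker itself uses libcurl, but the shape of the two requests can be illustrated with Python's standard library. The sketch below only prepares the requests and does not send them. The function name, the credentials and the result archive name job42.zip are hypothetical.

```python
import base64
import urllib.request


def build_request(url, username, password, upload_data=None):
    """Prepare a GET (download) or PUT (upload) request with basic HTTP authentication."""
    method = "PUT" if upload_data is not None else "GET"
    request = urllib.request.Request(url, data=upload_data, method=method)
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Download of a task file (hash taken from the example above, credentials made up).
download = build_request(
    "https://fs.recodex.org/tasks/8b31e12787bdae1b5766ebb8534b0adc10a1c34c",
    "worker", "secret")

# Upload of a result archive as the request body.
upload = build_request(
    "https://fs.recodex.org/results/job42.zip", "worker", "secret", b"archive bytes")
```

Passing such a request to urllib.request.urlopen would send it; urlopen raises urllib.error.HTTPError for error status codes, which mirrors the fail-on-error behaviour described for the worker.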

Broker - Monitor communication

Broker communicates with monitor also through ZeroMQ over TCP protocol. Type of socket is same on both sides, ROUTER. Monitor is set to act as server in this communication, its IP address and port are configurable in monitor's config file. ZeroMQ socket ID (set on monitor's side) is "recodex-monitor" and must be sent as first frame of every multipart message - see ZeroMQ ROUTER socket documentation for more info.

Note that the monitor is designed so that it can receive data both from the broker and workers. The current architecture prefers the broker to do all the communication so that the workers don't have to know too many network services.

The monitor is treated as a somewhat optional part of the whole solution, so no special effort was made to ensure reliable communication.

Commands from monitor to broker:

Because there is no need for the monitor to communicate with the broker, there are no commands so far. Any message from monitor to broker is logged and discarded.

Commands from broker to monitor:

  • progress - notification about progress with job evaluation. See Progress callback section for more info.

Broker - Web API communication

The broker communicates with the main REST API through a ZeroMQ connection over TCP. The socket type on the broker side is ROUTER, on the frontend side it is REQ. The broker acts as a server; its IP address and port are configurable in the API.

Commands from API to broker:

  • eval - evaluate a job. Requires at least 4 frames:
    • job_id - identifier of this job (in ASCII representation -- we avoid endianness issues and also support alphabetic ids)
    • header - additional header describing worker capabilities. Format must be header_name=value, every header shall be in a separate message frame. There is no maximum limit on number of headers. There may be also no headers at all.
    • empty frame (with empty string)
    • job_url - URI location of archive with job configuration and submitted source code
    • result_url - remote URI where results will be pushed to
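
The role of the empty frame becomes clearer when the message is parsed on the receiving side. The following Python sketch splits the frames of an eval message into its parts. The function name is hypothetical, frames are shown as plain strings, and storing the headers in a dictionary assumes that header names are unique.

```python
def parse_eval_frames(frames):
    """Split an `eval` message into job id, headers, job URL and result URL.

    Expected layout: eval, job_id, header..., empty frame, job_url, result_url.
    """
    if len(frames) < 5 or frames[0] != "eval":
        raise ValueError("not a well-formed eval message")
    job_id = frames[1]
    try:
        separator = frames.index("", 2)  # first empty frame after the job id
    except ValueError:
        raise ValueError("missing empty frame after headers") from None
    headers = dict(frame.split("=", 1) for frame in frames[2:separator])
    rest = frames[separator + 1:]
    if len(rest) != 2:
        raise ValueError("expected job_url and result_url after the empty frame")
    job_url, result_url = rest
    return job_id, headers, job_url, result_url
```

Because the number of headers is variable and may be zero, the empty frame is the only reliable marker of where the headers end and the URLs begin.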

Commands from broker to API (all are responses to eval command):

  • ack - the first message sent back to the frontend right after the eval command arrives. It basically means "Hi, I am all right and am capable of receiving job requests". After sending it, the broker tries to find an acceptable worker for the request.

  • accept - broker is capable of routing request to a worker

  • reject - broker can't handle this job (for example when the requirements specified by the headers cannot be met). There are (rare) cases when the broker finds that it cannot handle the job after it's been confirmed. In such cases it uses the frontend REST API to mark the job as failed.

File Server - Web API communication

File server has a REST API for interaction with other parts of ReCodEx. Description of communication with workers is in File server side section. On top of that, there are other commands for interaction with the API:

  • GET /results/<id>.<ext> - download archive with evaluated results of job id
  • POST /submissions/<id> - upload new submission with identifier id. Expects that the body of the POST request uses file paths as keys and the content of the files as values. On successful upload returns JSON { "archive_path": <archive_url>, "result_path": <result_url> } in response body. From archive_path the submission can be downloaded (by worker) and corresponding evaluation results should be uploaded to result_path.
  • POST /tasks - upload new files, which will be available under names equal to the sha1sum of their content. Multiple files can be uploaded at once. On successful upload returns JSON { "result": "OK", "files": <file_list> } in response body, where file_list is a dictionary with the original file name as key and the new URL with the hashed name as value.

There are no plans yet to support deleting files from this API. This may change in time.

Web API calls these fileserver endpoints with standard HTTP requests. There are no special commands involved. There is no communication in opposite direction.

Monitor - Web app communication

The monitor interacts with the web application through a WebSocket connection. The monitor acts as a server and browsers connect to it. The IP address and port are configurable. When a client connects to the monitor, it sends a message with the string representation of a channel id (the channel whose messages it is interested in, usually the id of the job being evaluated). There can be multiple listeners per channel, and even (shortly) delayed connections will receive all messages from the very beginning.

When monitor receives progress message from broker there are two options:

  • there is no WebSocket connection for listed channel (job id) - message is dropped
  • there is an active WebSocket connection for the listed channel - the message is converted into JSON format (see below) and sent as a string to that established channel. Messages for active connections are queued, so no messages are discarded even under heavy workload.

Message JSON format is dictionary (associative array) with keys:

  • command - type of progress, one of:
    • DOWNLOADED - submission successfully fetched from fileserver
    • FAILED - something bad happened and job was not executed at all
    • UPLOADED - results are uploaded to fileserver
    • STARTED - evaluation of tasks started
    • ENDED - evaluation of tasks is finished
    • ABORTED - evaluation of job encountered internal error, job will be rescheduled to another worker
    • FINISHED - whole execution is finished and worker ready for another job execution
    • TASK - task state changed - see below
  • task_id - id of currently evaluated task. Present only if command is "TASK".
  • task_state - state of task with id task_id. Present only if command is "TASK". Value is one of "COMPLETED", "FAILED" and "SKIPPED".
    • COMPLETED - task was successfully executed without any error, subsequent task will be executed
    • FAILED - task ended up with some error, subsequent task will be skipped
    • SKIPPED - some of the previous dependencies failed to execute, so this task won't be executed at all
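
A message in this format could be produced as in the following Python sketch. The function name is hypothetical; the keys and allowed values are the ones listed above.

```python
import json

JOB_STATES = {"DOWNLOADED", "FAILED", "UPLOADED", "STARTED",
              "ENDED", "ABORTED", "FINISHED", "TASK"}
TASK_STATES = {"COMPLETED", "FAILED", "SKIPPED"}


def progress_to_json(command, task_id=None, task_state=None):
    """Serialize one progress notification for WebSocket clients."""
    if command not in JOB_STATES:
        raise ValueError(f"unknown command: {command}")
    message = {"command": command}
    if command == "TASK":
        # task_id and task_state are present only for the TASK command
        if task_state not in TASK_STATES:
            raise ValueError(f"unknown task state: {task_state}")
        message["task_id"] = task_id
        message["task_state"] = task_state
    return json.dumps(message)
```

For a non-TASK command the resulting dictionary contains only the command key.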

Web app - Web API communication

The provided web application runs as a JavaScript client inside the user's browser. It communicates with the REST API on the server through standard HTTP requests. Documentation of the main REST API is in a separate document because of its extent. Results are returned as a JSON payload, which is parsed in the web application and presented to the users.

Assignments

Assignments are programming tasks that can be tested and evaluated by a worker after a user submits a solution. An assignment is described by a YAML file that contains information on how to build, run and test it. One submitted assignment is called a (worker) job.

Basics

A job is a set/list of tasks (it is generally a set, but the order of tasks has some meaning). These tasks may have an arbitrary number of dependencies, which need to be observed. When a worker processes a job, it creates a task graph, where tasks are vertices and dependencies are edges (A -> B means that task A is on the dependency list of task B, so A must be run earlier), and creates its linear ordering. The graph must be acyclic (otherwise no linear ordering exists) and the worker attempts to execute the maximal possible number of tasks. Tasks without dependencies can be executed directly; other tasks are executed when all their dependencies have been successfully completed.

Tasks are executed sequentially, following the linear ordering of the task graph. Parallel tasks (tasks that are not directly dependent, so their relative order may be arbitrary) are ordered first by their priority (higher number means higher priority) and second by their order in the configuration file. Priority is important for specifying the evaluation flow. See the sample picture for better understanding.

Picture of task serialization

Each task has a unique ID (an alphanumeric string like CompileA, RunAA, or JudgeAB in the picture). These IDs are used to identify tasks (for dependency references, in the log, ...). The numbers in the bottom right corners are the priorities of the tasks; a higher number means a greater priority. This means that once task RunAA is done, the next task must be JudgeAA and not RunAB (that would also be a valid linear ordering, but RunAB has lower priority).
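
One way to obtain such an ordering is a topological sort in which the set of ready tasks is kept in a priority queue. The Python sketch below is an illustration only and is not taken from the worker's implementation. The function name is hypothetical, the priorities in the example are made up (the picture's values are not reproduced here), and each dependency is assumed to be listed at most once.

```python
import heapq


def linear_order(tasks):
    """Order tasks so that dependencies come first; among ready tasks prefer
    higher priority, then earlier position in the configuration."""
    position = {task["task-id"]: index for index, task in enumerate(tasks)}
    priority = {task["task-id"]: task["priority"] for task in tasks}
    remaining = {task["task-id"]: len(task.get("dependencies", [])) for task in tasks}
    dependents = {task_id: [] for task_id in position}
    for task in tasks:
        for dependency in task.get("dependencies", []):
            dependents[dependency].append(task["task-id"])

    # heapq is a min-heap, so the priority is negated to pop the highest first.
    ready = [(-priority[t], position[t], t) for t in position if remaining[t] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, _, task_id = heapq.heappop(ready)
        order.append(task_id)
        for dependent in dependents[task_id]:
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                heapq.heappush(ready, (-priority[dependent], position[dependent], dependent))
    if len(order) != len(tasks):
        raise ValueError("task graph contains a cycle")
    return order


tasks = [
    {"task-id": "CompileA", "priority": 3},
    {"task-id": "RunAA", "priority": 2, "dependencies": ["CompileA"]},
    {"task-id": "RunAB", "priority": 1, "dependencies": ["CompileA"]},
    {"task-id": "JudgeAA", "priority": 2, "dependencies": ["RunAA"]},
    {"task-id": "JudgeAB", "priority": 1, "dependencies": ["RunAB"]},
]
order = linear_order(tasks)
# order == ['CompileA', 'RunAA', 'JudgeAA', 'RunAB', 'JudgeAB']
```

With these priorities JudgeAA is scheduled right after RunAA, because it becomes ready with a higher priority than the still waiting RunAB.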

Task

Task is an atomic piece of work executed by recodex-worker. There are two basic types of tasks:

  • Execute external process (optionally inside Isolate). External processes are meant for compilation, testing, or execution of external judges. On Linux, use of the isolate sandbox is mandatory by default; the option to run without it exists because of Windows, where no sandbox is currently available.
  • Perform internal operation. Internal operations comprise commands which are typically related to file/directory maintenance and other evaluation management. A few important examples:
    • Create/delete/move/rename file/directory
    • (un)zip/tar/gzip/bzip file(s)
    • fetch a file from the file repository (either from worker cache or download it by HTTP GET or through SFTP).

Even though the internal operations may be handled by external executables (mv, tar, pkzip, wget, ...), it might be better to keep them inside the worker as it would simplify these operations and their portability among platforms. Furthermore, it is quite easy to implement them using common libraries (e.g., zlib, curl).

Internal tasks

  • Archivate task can be used to pack and compress a directory. The calling command is archivate. It requires two arguments:

    • path and name of the directory to be archived
    • path and name of the target archive. Only .zip format is supported.
  • Extract task is the opposite of the archivate task. It can extract different types of archives. Supported formats are those supported by the libarchive library (see libarchive wiki), mainly zip, tar, tar.gz, tar.bz2 and 7zip. Please note that the system administrator may not have installed all the packages needed, so some formats may not work; consult your system administrator for more information. Archives may contain only regular files and directories (i.e. no symlinks, block or character devices, sockets, or pipes). The calling command is extract and it requires two arguments:

    • path and name of the archive to extract
    • directory, where the archive will be extracted
  • Fetch task will give you a file. It can be downloaded from a remote file server or just copied from the local cache if available. The calling command is fetch, with two arguments:

    • name of the requested file without path (file sources are set up in the worker configuration file)
    • path and name on the destination. Providing a different destination name can be used for easy rename.
  • Copy task can copy files and directories. Detailed info can be found on the reference page of boost::filesystem::copy. The calling command is cp and it requires two arguments:

    • path and name of source target
    • path and name of destination target
  • Make directory task can create an arbitrary number of directories. The calling command is mkdir and it requires at least one argument. boost::filesystem::create_directories is called for each provided argument.

  • Rename task will rename files and directories. Detailed behavior can be found on the reference page of boost::filesystem::rename. The calling command is rename and it requires two arguments:

    • path and name of source target
    • path and name of destination target
  • Remove task is for deleting files and directories. The calling command is rm and it requires at least one argument. boost::filesystem::remove_all is called for each provided argument.
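
For readers more familiar with scripting languages, the following Python sketch approximates three of these commands with standard library calls. It is an analogy only: the worker uses boost::filesystem, whose semantics may differ in details, and the function name is hypothetical.

```python
import os
import shutil


def run_internal_task(command, args):
    """Approximate the mkdir, rm and rename internal tasks."""
    if command == "mkdir":
        for path in args:
            # Creates missing parent directories as well.
            os.makedirs(path, exist_ok=True)
    elif command == "rm":
        for path in args:
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)   # remove a directory with its content
            elif os.path.lexists(path):
                os.remove(path)       # remove a file or a symlink
            # a missing path is silently ignored in this sketch
    elif command == "rename":
        source, destination = args
        os.rename(source, destination)
    else:
        raise ValueError(f"unsupported internal task: {command}")
```

The command name and argument list correspond to the bin and args items of a task's cmd section in the job configuration.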

External tasks

External tasks are arbitrary executables, typically run inside isolate (with given parameters); the worker waits until they finish. The exit code determines whether the task succeeded (0) or failed (anything else). A task may be marked as essential; in such a case, its failure immediately terminates the whole job.

  • stdin - can be configured to read from existing file or from /dev/null.
  • stdout and stderr - can be individually redirected to a file or discarded. If these output options are specified, the output files can be uploaded with the results by copying them into the result directory.
  • limits - a task has time and memory limits; if these limits are exceeded, the task also fails.

The task results (exit code, time, memory consumption, etc.) are saved into a result YAML file and sent back to the frontend application, to the address that was specified on input.
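
The success rule can be illustrated with a small Python sketch that runs a command without any sandbox. It is not how the worker runs tasks (that is done through isolate); the function name and parameters are hypothetical, only a wall-clock time limit is enforced, and memory limits are not shown at all.

```python
import contextlib
import subprocess
import sys


def run_external_task(binary, args, stdin_path=None, stdout_path=None, time_limit=None):
    """Run a command and report whether it succeeded (exit code 0 within the limit)."""
    with contextlib.ExitStack() as stack:
        # Missing redirections fall back to the null device.
        stdin = stack.enter_context(open(stdin_path, "rb")) if stdin_path else subprocess.DEVNULL
        stdout = stack.enter_context(open(stdout_path, "wb")) if stdout_path else subprocess.DEVNULL
        try:
            completed = subprocess.run([binary, *args], stdin=stdin, stdout=stdout,
                                       stderr=subprocess.DEVNULL, timeout=time_limit)
        except subprocess.TimeoutExpired:
            return False  # exceeding the limit also fails the task
        return completed.returncode == 0


# A trivial command that exits with code 0 counts as a successful task.
succeeded = run_external_task(sys.executable, ["-c", "raise SystemExit(0)"])
```

A scheduler built on this would stop the whole job when a task marked as essential returns False.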

Judges

Judges are treated as normal external commands, so there is no special task type for them. Binaries are installed alongside the worker executable in standard directories (on both Linux and Windows systems).

Judges should be used for comparison of the output files from execution tasks with sample outputs fetched from the fileserver. The result of this comparison should at least state whether the files are the same or not. An extension of this is a percentage result based on the similarity of the given files. All judge results have to be printed to standard output.

All packed judges are adopted from old Codex with only very small modifications. ReCodEx judges base directory is in ${JUDGES_DIR} variable, which can be used in job config file.

Judges interface

For future extensibility it is critical that judges share a common interface for calling and return values.

  • Parameters: there are two mandatory positional parameters, which have to be the files to compare
  • Results:
    • comparison OK
      • exitcode: 0
      • stdout: there is one line with double value which should be percentage of similarity of two given files
    • error during execution
      • exitcode: 1
      • stderr: there should be description of error
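
A minimal judge following this interface might look like the Python sketch below. It is not one of the bundled judges: the function names are hypothetical, and reporting a mismatch as similarity 0.0 (with exit code 0) is an assumption based on the interface above.

```python
import sys


def similarity(path_a, path_b):
    """Return 1.0 when both files contain the same whitespace-separated tokens, else 0.0."""
    with open(path_a) as file_a, open(path_b) as file_b:
        return 1.0 if file_a.read().split() == file_b.read().split() else 0.0


def main(argv):
    """Return the exit code: 0 after a comparison, 1 on an error."""
    if len(argv) != 3:
        print("usage: judge <file1> <file2>", file=sys.stderr)
        return 1
    try:
        print(similarity(argv[1], argv[2]))  # one line with a double value on stdout
    except (OSError, ValueError) as error:
        print(f"cannot compare files: {error}", file=sys.stderr)
        return 1
    return 0

# When used as a script, the entry point would be: sys.exit(main(sys.argv))
```

Keeping the comparison in a separate function makes it easy to swap in a finer similarity measure without touching the calling convention.
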

ReCodEx judges

Below is a list of judges which are bundled with the ReCodEx project and comply with the above requirements.

  • recodex-judge-normal is the base judge used by most exercises. This judge compares two text files. It compares only text tokens, regardless of the amount of whitespace between them.

    Usage: recodex-judge-normal [-r | -n | -rn] <file1> <file2>
    
    • file1 and file2 are paths to files that will be compared
    • switch options -r and -n can be specified as a 1st optional argument.
      • -n judge will treat newlines as ordinary whitespace (it will ignore line breaking)
      • -r judge will treat tokens as real numbers and compare them accordingly (with some amount of error)
  • recodex-judge-filter can be used to preprocess output files before real judging. This judge filters C-like comments from a text file. A comment starts with a double slash sequence (//) and finishes with a newline. If the comment takes a whole line, then the whole line is filtered.

    Usage: recodex-judge-filter [inputFile [outputFile]]
    
    • if outputFile is omitted, standard output is used instead.
    • if both files are omitted, the application uses standard input and output.
  • recodex-judge-shuffle is for judging shuffled files. This judge compares two text files and returns 0 if they match (and 1 otherwise). The two files are compared with no regard for whitespace (whitespace acts just as a token delimiter).

    Usage: recodex-judge-shuffle [-[n][i][r]] <file1> <file2>
    
    • -n ignore newlines (newline is considered only a whitespace)
    • -i ignore the order of items on a row (tokens on each row may be permuted)
    • -r ignore the order of rows (rows may be permuted); this option has no effect when -n is used

Job configuration

The configuration of a job, which is passed to the worker, is generated on demand by the web API. Each job has its own unique configuration.

Configuration items

Here is the list with description of allowed options. Mandatory items are bold, optional italic.

  • submission - information about this particular submission
    • job-id - textual ID which should be unique in the whole ReCodEx
    • language - no specific function, just for debugging and clarity
    • file-collector - address from which fetch tasks will download data
    • log - default is false, can be omitted, determines whether job execution will be logged into one shared log
  • tasks - list (not map) of individual tasks
    • task-id - unique identifier of task in scope of one submission
    • priority - higher number, higher priority
    • fatal-failure - if true, then execution of the whole job will be stopped after this task fails
    • dependencies - list of dependencies which have to be fulfilled before this task, can be omitted if there are no dependencies
    • cmd - description of command which will be executed
      • bin - the binary itself (full path of external command or name of internal task)
      • args - list of arguments which will be sent into execution unit
    • test-id - ID of the test this task is part of - must be specified for tasks which the particular test's result depends on
    • type - type of the task, can be omitted, default value is inner - possible values are: inner, initiation, execution, evaluation
    • sandbox - wrapper for external tasks which will run in sandbox, if defined task is automatically external
      • name - name of used sandbox
      • stdin - file to which standard input will be redirected, can be omitted
      • stdout - file to which standard output will be redirected, can be omitted
      • stderr - file to which error output will be redirected, can be omitted
      • limits - list of limits which can be passed to sandbox
        • hw-group-id - determines specific limits for specific machines
        • time - time of execution in seconds
        • wall-time - wall time in seconds
        • extra-time - extra time which will be added to execution
        • stack-size - size of stack of executed program in kilobytes
        • memory - overall memory limit for application in kilobytes
        • parallel - integral number of processes which can run simultaneously, time and memory limits are merged from all potential processes/threads
        • disk-size - size of all io operations from/to files in kilobytes
        • disk-files - number of files which can be opened
        • environ-variable - wrapper for map of environmental variables, union with default worker configuration
        • chdir - this will be working directory of executed application
        • bound-directories - list of structures representing directories which will be visible inside the sandbox, union with default worker configuration. Contains 3 suboptions: src - source pointing to the actual system directory, dst - destination inside the sandbox, which can have its own filesystem binding, and mode - determines the connection mode of the specified directory, one of: RW, NOEXEC, FS, MAYBE, DEV

Configuration example

This configuration example is written in YAML and serves only for demonstration purposes. Some items can be omitted and defaults from worker configuration will be used.

---
submission:  # happy hippoes fence
    job-id: hippoes
    language: c
    file-collector: http://localhost:9999/tasks
    log: true
tasks:
    - task-id: "compilation"
      priority: 2
      fatal-failure: true
      cmd:
          bin: "/usr/bin/gcc"
          args:
              - "solution.c"
              - "-o"
              - "a.out"
      sandbox:
          name: "isolate"
          limits:
              - hw-group-id: group1
                parallel: 0
                chdir: ${EVAL_DIR}
                bound-directories:
                    - src: ${SOURCE_DIR}
                      dst: ${EVAL_DIR}
                      mode: RW
    - task-id: "fetch_test_1"
      priority: 4
      fatal-failure: false
      dependencies:
          - compilation
      cmd:
          bin: "fetch"
          args:
              - "1.in"
              - "${SOURCE_DIR}/kuly.in"
    - task-id: "evaluation_test_1"
      priority: 5
      fatal-failure: false
      dependencies:
          - fetch_test_1
      cmd:
          bin: "a.out"
      sandbox:
          name: "isolate"
          limits:
              - hw-group-id: group1
                time: 0.5
                memory: 8192
                chdir: ${EVAL_DIR}
                bound-directories:
                    - src: ${SOURCE_DIR}
                      dst: ${EVAL_DIR}
                      mode: RW
    - task-id: "fetch_test_solution_1"
      priority: 6
      fatal-failure: false
      dependencies:
          - evaluation_test_1
      cmd:
          bin: "fetch"
          args:
              - "1.out"
              - "${SOURCE_DIR}/1.out"
    - task-id: "judging_test_1"
      priority: 7
      fatal-failure: false
      dependencies:
          - fetch_test_solution_1
      cmd:
          bin: "${JUDGES_DIR}/recodex-judge-normal"
          args:
              - "1.out"
              - "plot.out"
      sandbox:
          name: "isolate"
          limits:
              - hw-group-id: group1
                parallel: 0
                chdir: ${EVAL_DIR}
                bound-directories:
                    - src: ${SOURCE_DIR}
                      dst: ${EVAL_DIR}
                      mode: RW
    - task-id: "rm_junk_test_1"
      priority: 8
      fatal-failure: false
      dependencies:
          - judging_test_1
      cmd:
          bin: "rm"
          args:
              - "${SOURCE_DIR}/kuly.in"
              - "${SOURCE_DIR}/plot.out"
              - "${SOURCE_DIR}/1.out"
...
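
Two properties stated in the item list, unique task IDs and dependencies that refer to existing tasks, are easy to check once the configuration is loaded into Python data structures (for example with a YAML parser such as PyYAML). The sketch below assumes the tasks item is already available as a list of dictionaries; the function name is hypothetical and the check is not part of ReCodEx.

```python
def check_tasks(tasks):
    """Return a list of problems: duplicate task IDs and unknown dependencies."""
    problems = []
    seen = set()
    for task in tasks:
        task_id = task["task-id"]
        if task_id in seen:
            problems.append(f"duplicate task-id: {task_id}")
        seen.add(task_id)
    for task in tasks:
        # `dependencies` can be omitted when a task has none
        for dependency in task.get("dependencies", []):
            if dependency not in seen:
                problems.append(f"{task['task-id']}: unknown dependency {dependency}")
    return problems


# First two tasks of the example above, reduced to the keys that matter here.
example = [
    {"task-id": "compilation", "priority": 2},
    {"task-id": "fetch_test_1", "priority": 4, "dependencies": ["compilation"]},
]
problems = check_tasks(example)  # an empty list means no problem was found
```

Such a check catches misspelled dependency names before a job is sent for evaluation.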

Job variables

Because the frontend does not know which worker gets the job, it is necessary to be a little general in the configuration file. This means that some worker-specific things have to be transparent. A good example is that some (evaluation) directories may be placed differently on different workers. To solve this, variables were established. There are of course some restrictions on where variables can be used: basically, variables can be used wherever filesystem paths can be used.

Usage of variables in the configuration is simple and shell-like. The name of the variable is put inside braces preceded by a dollar sign, so real usage looks like this: ${VAR}. There should be no quotes or apostrophes around the variable name, just plain text in braces. Parsing is simple: whenever a dollar sign with braces appears, the job execution unit automatically assumes it is a variable, so this kind of substring cannot be used for anything else.

List of usable variables in job configuration:

  • WORKER_ID - integral identification of worker, unique on server
  • JOB_ID - identification of this job
  • SOURCE_DIR - directory where source codes of job are stored
  • EVAL_DIR - evaluation directory which should point inside the sandbox. Note that an existing directory must be bound inside the sandbox under the EVAL_DIR name using the bound-directories directive inside the limits section.
  • RESULT_DIR - results from job can be copied here, but only with internal task
  • TEMP_DIR - general temp directory which is not dependent on operating system
  • JUDGES_DIR - directory in which judges are stored (outside sandbox)
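
The substitution rule described above can be sketched in a few lines of Python. This is only an illustration, not the worker's actual implementation: the function name `expand_variables`, the set of characters allowed in a variable name and the example paths are all assumptions.

```python
import re

# Matches ${NAME}. Which characters a name may contain is an assumption.
_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_variables(text, variables):
    """Replace every ${NAME} in text with variables[NAME].

    An unknown name raises KeyError, so a typo in the job
    configuration is reported instead of being silently kept.
    """
    return _VAR_PATTERN.sub(lambda match: variables[match.group(1)], text)


# Hypothetical values a worker might assign for one job.
variables = {
    "WORKER_ID": "1",
    "JOB_ID": "5",
    "SOURCE_DIR": "/var/recodex/submission/1/5",
    "EVAL_DIR": "/evaluate",
}

print(expand_variables("${SOURCE_DIR}/kuly.in", variables))
print(expand_variables("${EVAL_DIR}", variables))
```

Raising an error for an unknown variable is a design choice of this sketch; the source does not say how the worker reacts in that case.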

Directories and Files

For each job execution a unique directory structure is created. A job is not restricted to the specified directories (tasks can do whatever is allowed on the system), but it is advised to use them inside a job. The DEFAULT variable below represents the worker's working directory specified in its configuration. No variable of this name is defined for use in the job YAML configuration; it is used only for this example.

List of temporary files for job execution:

  • ${DEFAULT}/downloads/${WORKER_ID}/${JOB_ID} -- where the downloaded archive is saved
  • ${DEFAULT}/submission/${WORKER_ID}/${JOB_ID} -- decompressed submission is stored here
  • ${DEFAULT}/eval/${WORKER_ID}/${JOB_ID} -- this directory is accessible in job configuration using variables and all execution should happen here
  • ${DEFAULT}/temp/${WORKER_ID}/${JOB_ID} -- directory where all sort of temporary files can be stored
  • ${DEFAULT}/results/${WORKER_ID}/${JOB_ID} -- this directory is also accessible from the job configuration and stores all files which will be uploaded to the fileserver; usually it contains only the YAML result file and optionally a log, and every other file has to be copied here explicitly by the job
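
As a rough illustration of the layout above, the following Python sketch builds the five per-job paths from a working directory, a worker ID and a job ID. The function name `job_directories` and the example working directory are assumptions, and the sketch only computes the paths without creating anything on disk.

```python
from pathlib import PurePosixPath


def job_directories(default_dir, worker_id, job_id):
    """Return the per-job directories listed above, keyed by purpose."""
    base = PurePosixPath(default_dir)
    purposes = ["downloads", "submission", "eval", "temp", "results"]
    return {
        purpose: base / purpose / str(worker_id) / str(job_id)
        for purpose in purposes
    }


# "/var/recodex-worker" is a hypothetical worker working directory.
dirs = job_directories("/var/recodex-worker", worker_id=1, job_id=5)
for purpose, path in dirs.items():
    print(f"{purpose:<10} {path}")
```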

Results

Results of tasks are sent back in YAML format, compressed into an archive. This archive can contain further files, such as job logging information and files which were explicitly copied into the results directory. The results file contains the job identification and the results of individual tasks.

Results items

List of items from results file. Mandatory items are bold, optional ones italic.

  • job-id - identification of the job to which these results belong
  • error_message - present only if the whole execution failed and no tasks were executed
  • results - list of tasks results
    • task-id - unique identification of task in scope of this job
    • status - three states: OK, FAILED, SKIPPED
    • error_message - defined only in internal tasks on failure
    • sandbox_results - if defined, then this task was external and was run in the sandbox
      • exitcode - integer which executed program gave on exit
      • time - time in seconds in which program exited
      • wall-time - wall time in seconds
      • memory - how much memory program used in kilobytes
      • max-rss - maximum resident set size used in kilobytes
      • status - two letter status code: OK, RE, SG, TO, XX
      • exitsig - description of exit signal
      • killed - boolean determining if program exited correctly or was killed
      • message - status message on failure

Example result file

---
job-id: 5
results:
    - task-id: compile1
      status: OK
      sandbox_results:
          exitcode: 0
          time: 5
          wall-time: 5
          memory: 50000
          max-rss: 50000
          status: RE
          exitsig: 1
          killed: true
          message: "Time limit exceeded"
    - task-id: eval1
      status: FAILED
      error_message: "Task failed, something very bad happened!"
      .
      .
      .
...
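
To show how a consumer might read such a file, here is a minimal Python sketch that summarizes an already parsed results document. It assumes the YAML has been loaded into plain dictionaries and lists by some YAML library; the function name `summarize_results` and the shape of the returned summary are illustrative only.

```python
from collections import Counter


def summarize_results(result):
    """Summarize an already parsed results document (a plain dict)."""
    if "error_message" in result:
        # The whole execution failed and no task was run.
        return {"job-id": result["job-id"], "error": result["error_message"]}
    tasks = result.get("results", [])
    counts = Counter(task["status"] for task in tasks)
    failed = [task["task-id"] for task in tasks if task["status"] == "FAILED"]
    return {"job-id": result["job-id"], "counts": dict(counts), "failed": failed}


# Hand-written stand-in for a parsed results file with two tasks.
example = {
    "job-id": 5,
    "results": [
        {"task-id": "compile1", "status": "OK",
         "sandbox_results": {"exitcode": 0, "time": 5, "memory": 50000}},
        {"task-id": "eval1", "status": "FAILED",
         "error_message": "Task failed"},
    ],
}

print(summarize_results(example))
```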

Scoring

Every assignment consists of tasks, but only some of them are part of the evaluation. Those tasks are grouped into tests. Each task may be assigned a test-id parameter, as described above. Every test must consist of at least two tasks: execution and evaluation by a judge. The former provides information about the execution, such as elapsed time and memory consumed; the latter results in a score, a float between 0 and 1. A test may contain more than one execution task, but it must contain exactly one evaluation task.

The total score of the assignment submission is then calculated according to a supplied score config (described below). The total score is also a float between 0 and 1. This number is then multiplied by the maximum number of points for the assignment, which is set by the teacher assigning the exercise, not by the exercise author.

Simple score calculation

The first implemented calculator is a simple score calculator with test weights. This calculator just looks at the score of each test and combines them according to the test weights specified in the assignment configuration. The resulting score is the sum of the products of score and weight of each test, divided by the sum of all weights. The algorithm in Python would look something like this:

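weighted_sum = 0.0  # sum of score * weight over all tests
weight_sum = 0.0    # sum of all weights
for t in tests:     # each test has 'score' and 'weight' attributes
    weighted_sum += t.score * t.weight
    weight_sum += t.weight
score = weighted_sum / weight_sum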

Sample score config in YAML format:

testWeights:
  a: 300   # test with id 'a' has a weight of 300
  b: 200
  c: 100
  d: 100
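
Putting the score config and the point scaling together, a minimal Python sketch might look as follows. It assumes the `testWeights` mapping has already been loaded into a dictionary; the per-test scores, the value of `max_points` and the function name `weighted_score` are made up for illustration.

```python
def weighted_score(test_scores, test_weights):
    """Weighted average of per-test scores, each in the range 0..1."""
    total_weight = sum(test_weights.values())
    if total_weight <= 0:
        raise ValueError("sum of test weights must be positive")
    weighted = sum(
        test_scores[test_id] * weight
        for test_id, weight in test_weights.items()
    )
    return weighted / total_weight


# Weights taken from the sample score config above.
test_weights = {"a": 300, "b": 200, "c": 100, "d": 100}
# Hypothetical judge results for one submission.
test_scores = {"a": 1.0, "b": 0.5, "c": 0.0, "d": 1.0}

score = weighted_score(test_scores, test_weights)
max_points = 10  # assumed maximum set by the teacher for this assignment
points = score * max_points
print(f"score = {score:.3f}, points = {points:.2f}")
```

Rejecting a non-positive weight sum is a choice of this sketch; the source does not describe how such a config is handled.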

Logs

During the execution, tasks can use one shared log. There is no use for multiple logs (one per task, for example), because only a small amount of information is logged. Logging is disabled by default and can be enabled in the job configuration.

After execution, the log is packed with the results into an archive and sent back to the fileserver, where it can be found for further processing.

Case study

We present some of the courses that might use ReCodEx to evaluate homework assignments and outline the setup of the evaluation with respect to the concept of stages.

Simple programming exercises

Examples are introductory programming courses such as Programming I or Java programming.

In the simplest case we only need one stage that builds the program and passes the test inputs to its standard input. Outputs are compared with the default judge.

Compiler principles

This course uses multiple tools in a pipeline-like fashion - for example flex and bison.

We create a stage for each step of this pipeline: we run flex and test the output, then we run bison on top of the previous stage's results and do the same. This is a more advanced configuration, and ReCodEx is specifically designed to support such evaluation pipelines.

XML technologies

In this course, students choose a topic they model using XML -- for example a library or a bulletin board. During the semester, they expand this project by adding XSLT transformations, XQuery scripts, XPath queries, etc. These are tested against fixed requirements (e.g. using some particular language constructs).

This course already has a rather sophisticated application for testing homework assignments, so we only include it for demonstration purposes.

Because every assignment focuses on a different technology, we would need a new type of stage for each one. These stages would only run some checker programs against the submitted sources (and possibly try to check their syntax etc.). ReCodEx is not primarily intended for static analysis, but it is certainly possible.

Non-procedural programming

This course is different from other programming courses, because it teaches input/output manipulation only at the end of the semester. In their assignments, students are mostly required to write a function/predicate that behaves according to a specification (e.g. appends an item at the end of a list).

Due to this, we need to take the function submitted by a student and combine it with a snippet of code that reads the standard input and calls the submitted function. This could be nicely achieved by setting the build command.

Operating systems

The operating systems course requires students to work on a simple OS kernel that is then run in a MIPS simulator called msim. There are various tests that check whether the student's implementation of core OS mechanisms is correct. These tests are compiled into the kernel.

Each of these tests could be represented by a stage that compiles the kernel with the test and then runs it against different configurations of msim.

Submission flow

This article describes in detail the execution flow of a submission, from the point of submission into the web application to the point of evaluation of the execution results. Only the hot path is considered in the following description.

Web Application

First, the user has to submit a solution to the web application. Generally, the web application has to store the submitted files and hand over all needed information about the submission to the broker. A more detailed description follows:

  • user submits his solution to web application
  • TODO ...

Broker

The broker gets information about a new submission from the web application. At this point the broker has to find a suitable worker for execution of this particular submission. When a worker is found and is idle, the broker sends the detailed submission to the worker for evaluation. A more detailed description follows:

  • the broker gets a multipart "eval" message from the web application with the job identification, source archive URL, result URL and appropriate worker headers
  • the headers are parsed and a worker which matches all of them is chosen to execute the incoming submission
  • the whole execution request is saved into the waiting queue of the worker structure
  • if the chosen worker is not working right now, the request is forwarded directly from the waiting queue to the worker through a multipart message
  • if the worker is busy, the request stays in the queue and nothing else is done right now
    • execution of this particular request is suspended until the worker completes all previous requests
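
The selection and queueing steps above can be sketched in Python as follows. This is a simplified model, not the broker's real code: the class `Worker`, the functions `find_worker` and `enqueue`, the header names `hwgroup` and `env`, and the `send` callback standing in for the ZeroMQ socket are all assumptions.

```python
from collections import deque


class Worker:
    """Minimal stand-in for the broker's per-worker structure."""

    def __init__(self, identity, headers):
        self.identity = identity
        self.headers = headers   # capabilities announced by the worker
        self.queue = deque()     # requests waiting for this worker
        self.current = None      # request being evaluated, None if idle


def find_worker(workers, request_headers):
    """Return the first worker matching all requested headers, or None."""
    for worker in workers:
        if all(worker.headers.get(k) == v for k, v in request_headers.items()):
            return worker
    return None


def enqueue(worker, request, send):
    """Queue the request and forward it at once if the worker is idle."""
    worker.queue.append(request)
    if worker.current is None:
        worker.current = worker.queue.popleft()
        send(worker.identity, worker.current)


sent = []  # records what would be sent over the socket
workers = [
    Worker("w1", {"hwgroup": "group1", "env": "c"}),
    Worker("w2", {"hwgroup": "group2", "env": "c"}),
]
target = find_worker(workers, {"hwgroup": "group2", "env": "c"})
enqueue(target, {"job_id": "5"}, lambda ident, req: sent.append((ident, req)))
enqueue(target, {"job_id": "6"}, lambda ident, req: sent.append((ident, req)))
print(sent, len(target.queue))
```

In this model the second request stays in the queue because the worker is still busy with the first one.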

Worker

The worker gets a request from the broker to evaluate a particular submission. The next step is to evaluate the submission and upload the results to the fileserver. After this, the worker only notifies the broker that the submission was evaluated. A more detailed description follows:

  • "listening" thread gets a multipart message from the broker with the command "eval"
  • "listening" thread hands over the whole message through an inproc socket to the "execution" thread
  • "execution" thread now has to prepare everything and get ready for execution
  • temporary folder names are initialized (but the folders are not created yet); this includes the folder with source files, the folder with the downloaded submission, the temporary directory for all possible types of files and the folder which will contain results from execution
  • if any of the above folders already exists, it is deleted
  • after successful initialization, the submission archive is downloaded to the created folder
  • the submission archive is decompressed into the submission files folder
  • all files from the decompressed archive are copied into the evaluation directory, which can be used for execution in sandboxes
  • all remaining folders which have not been created yet are created now
  • it's time to build the job from the configuration
  • the job configuration file is looked up in the evaluation directory and, if it exists, loaded using the yaml-cpp library
  • the loaded configuration is now parsed into the job_metadata structure, which is handed over to the job execution class itself
  • the job execution class will now initialize and construct particular tasks from job_metadata into a task tree
  • items which can use variables (e.g. binary path, cmd arguments, bound directories) have their variables expanded at this point
  • all tasks from the configuration are created and divided into external or internal tasks
    • external tasks have to load limits from the configuration for this worker's hwgroup, which was loaded from the worker configuration
      • if limits were not defined, default worker limits are loaded
    • internal tasks are just created, without the further processing needed for external tasks
  • after successful creation of all tasks, they are connected into a graph structure using dependencies
  • the next and last step of building the job structure is to execute topological sorting and get a queue of tasks which will be executed in order
  • topological sorting takes into account the priority of tasks and their sequential order in the job configuration
  • running all tasks in order follows
  • after that, results have to be obtained from all executed tasks and written into the corresponding YAML file
  • the result YAML file, together with the content of the result folder, is sent back to the fileserver in compressed form
  • finally, cleanup after the whole evaluation deinitializes all needed variables and deletes all used temporary folders
  • all of the above happened in the "execution" thread, which now has to tell the "listening" thread that execution is done
  • this is done through a multipart message "done" with the packed job identification, addressed to the "listening" thread
  • the action of the "listening" thread is now straightforward: the "done" message is resent to the broker
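
The topological sorting step in the list above can be illustrated with a small Python sketch. It is not the worker's C++ implementation, and it makes two assumptions worth stating: a higher priority number means the task runs earlier (flip the sign of the heap key if the worker uses the opposite convention), and ties are broken by the order of tasks in the job configuration. The function name `order_tasks` and the sample task names are made up.

```python
import heapq


def order_tasks(tasks):
    """Return task ids in an order that respects all dependencies.

    tasks: list of dicts with "task-id", "priority" and "dependencies",
    given in the order in which they appear in the job configuration.
    """
    position = {t["task-id"]: i for i, t in enumerate(tasks)}
    priority = {t["task-id"]: t.get("priority", 0) for t in tasks}
    pending = {t["task-id"]: len(t.get("dependencies", [])) for t in tasks}
    dependents = {t["task-id"]: [] for t in tasks}
    for t in tasks:
        for dep in t.get("dependencies", []):
            dependents[dep].append(t["task-id"])

    # Min-heap keyed by (-priority, position): higher priority first
    # (an assumption), configuration order as the tie-breaker.
    ready = [(-priority[tid], position[tid], tid)
             for tid, count in pending.items() if count == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, _, tid = heapq.heappop(ready)
        order.append(tid)
        for child in dependents[tid]:
            pending[child] -= 1
            if pending[child] == 0:
                heapq.heappush(ready, (-priority[child], position[child], child))

    if len(order) != len(tasks):
        raise ValueError("dependency cycle in job configuration")
    return order


tasks = [
    {"task-id": "compile", "priority": 1, "dependencies": []},
    {"task-id": "fetch_input", "priority": 5, "dependencies": []},
    {"task-id": "run", "priority": 3, "dependencies": ["compile", "fetch_input"]},
    {"task-id": "judge", "priority": 3, "dependencies": ["run"]},
]
print(order_tasks(tasks))
```

A task becomes eligible only when all of its dependencies have been scheduled, so priority can reorder independent tasks but never overrides a dependency.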

Broker

The broker gets the "done" message from the worker and basically only marks the submission as done in its internal structures. No messages are sent to the web application, because of lazy evaluation on the frontend side. A more detailed description follows:

  • the broker gets a "done" message from the worker after successful execution of the job
  • the appropriate worker structure is found based on its identification
  • some invariants (current job identification, right number of arguments) are checked
  • the current execution request is deleted from the worker structure and the worker is now considered free
  • if the worker's execution queue is not empty, then the next waiting request is taken and set as the current one
    • after that, the only missing thing is to send that request to the worker and loop back to worker execution
  • if the worker's queue is empty, then the worker remains free and waits for another execution request
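
The handling of a "done" message can be modelled in Python roughly like this. The names `WorkerState` and `handle_done`, the dictionary shape of a request and the `send` callback are assumptions made for this sketch; the real broker works with ZeroMQ multipart messages.

```python
from collections import deque


class WorkerState:
    """Minimal stand-in for the broker's per-worker structure."""

    def __init__(self, identity):
        self.identity = identity
        self.current = None      # request being evaluated, None if free
        self.queue = deque()     # requests waiting for this worker


def handle_done(workers, identity, job_id, send):
    """Process a "done" message and dispatch the next request, if any."""
    worker = workers[identity]  # find the worker by its identification
    # Invariant check: the finished job must be the worker's current one.
    if worker.current is None or worker.current["job_id"] != job_id:
        raise ValueError(f"unexpected 'done' for job {job_id!r}")
    worker.current = None       # the worker is considered free again
    if worker.queue:
        worker.current = worker.queue.popleft()
        send(worker.identity, worker.current)


workers = {"w1": WorkerState("w1")}
state = workers["w1"]
state.current = {"job_id": "5"}
state.queue.append({"job_id": "6"})

sent = []  # records what would be sent over the socket
handle_done(workers, "w1", "5", lambda ident, req: sent.append((ident, req)))
handle_done(workers, "w1", "6", lambda ident, req: sent.append((ident, req)))
print(sent, state.current)
```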

Web Application

The only remaining part is the evaluation of results. This is done on demand, when the user wants to see them. Results are obtained from the fileserver and evaluated. A more detailed description follows:

  • evaluation of execution results is provided on user demand
  • TODO ...

Installation

Ansible installer

Hints for manual installation

This page contains steps to set up a computer on which some parts of ReCodEx may run. Most steps are listed in two variants: for Red Hat based distributions (like RHEL, CentOS or Fedora) and for Debian based distributions. Before starting, make sure you have completed the basic OS installation and setup, including users and logins, SSH, Git, firewall, etc.

Repositories

Add testing repositories to Debian OS

  • Create file /etc/apt/apt.conf with content
APT::Default-Release "stable";
  • Add testing repos to /etc/apt/sources.list
deb http://ftp.cz.debian.org/debian/ testing main contrib non-free
deb-src http://ftp.cz.debian.org/debian/ testing contrib non-free
  • Install packages with -t testing option. For example
$ sudo apt-get -t testing install gcc

Add EPEL repository to RHEL

  • See this for instructions.

Installation of dependencies

Install the newest available version of each package, so Debian packages mostly come from the testing repositories and RHEL packages are the newest ones from the EPEL repositories.

Basic development tools

  • g++
  • cmake
  • make
  • git

Install ZeroMQ version 4.0 or newer

  • Debian package is libzmq3-dev.
  • RedHat packages are zeromq and zeromq-devel.

Install YAML-CPP library

  • Debian packages are libyaml-cpp0.5v5 and libyaml-cpp-dev.
  • RedHat packages are yaml-cpp and yaml-cpp-devel.

Install libcurl library

  • Debian package is libcurl4-gnutls-dev.
  • RedHat package is libcurl-devel.

Security