File Server

File Server stores data that should be kept outside of WebApp's database, both because storing files in a database is inefficient and because the workers need to access the files in the simplest possible way. It should meet the following requirements:

  • store files without duplicates
  • keep its state consistent with the main database
  • serve files to workers on demand
  • allow versioning of tasks, including reverting to an earlier version

To meet these requirements, Storage and Database must be set up as described below.

Storage

Storage means disk space with a commonly used filesystem. We'll use ext4, but other filesystems should work too. The storage file structure is:

.
├── submits
│   └── user_id
│       └── advanced_dot_net_1
│           └── submit_id
│               ├── eval.yml
│               └── source.cs
├── submit_archives
│   └── submit_id.tar.gz
├── tasks
│   ├── a
│   │   ├── a014ed2abb56371bfaf2b4298a85d5dfb56509ed
│   │   └── a5edbd8b12e670ed1e3110d6c0524000cd4c3c7a
│   └── b
│       └── b1696358b8540923eb79b68f95c0f94c13a83fa7
└── temp
    └── 1795184136b8bdddabe50453cc2cc2d46f0f7c5e
  • submits keeps all files submitted by users to ReCodEx. The subdirectories user_id and advanced_dot_net_1 group submits by user and by the course they were submitted for. This structure is easy to maintain as users are added and deleted.
  • submit_archives contains the student submissions in compressed archives so that they can be easily downloaded by workers.
  • tasks contains supplementary files (such as test inputs or helper programs) for all existing tasks in ReCodEx. To avoid too many files in one directory, files are split into subfolders by the first character of their name.
  • temp is dedicated to temporarily storing outputs of programs on teachers' demand. This directory will be erased by a cron job on a daily basis.
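
The tasks layout described above can be sketched in a few lines of Python. This is only an illustration, assuming file names are hex SHA-1 digests of the file content (as the example tree suggests); the function names `sha1_of_file` and `store_task_file` and the `storage_root` parameter are hypothetical and not part of ReCodEx.

```python
import hashlib
import shutil
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_task_file(storage_root: Path, source: Path) -> Path:
    """Copy `source` to tasks/<first hash character>/<hash> (hypothetical helper).

    If a file with the same content is already stored, nothing is copied,
    so identical content occupies the storage only once.
    """
    file_hash = sha1_of_file(source)
    target = storage_root / "tasks" / file_hash[0] / file_hash
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
    return target
```

Because the target path depends only on the content, uploading the same test input for two different tasks yields the same stored file.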

Database

To make accessing and modifying tasks user-friendly, the following information should be stored in the database:

  • a list of tasks with their newest version number
  • for every task and version, a list of the files it uses (their hashed names)
  • for every hashed name, one human-readable filename
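
One possible shape for these three pieces of information is sketched below, using Python's built-in `sqlite3` module purely for illustration. The table names, column names, and helper functions are assumptions, not the actual WebApp schema, and reverting is modelled here as publishing an older file list as a new version, which is only one way to do it.

```python
import sqlite3

# Hypothetical schema: one table per item in the list above.
SCHEMA = """
CREATE TABLE tasks (
    task_id        TEXT PRIMARY KEY,
    newest_version INTEGER NOT NULL
);
CREATE TABLE task_files (
    task_id   TEXT    NOT NULL,
    version   INTEGER NOT NULL,
    file_hash TEXT    NOT NULL,
    PRIMARY KEY (task_id, version, file_hash)
);
CREATE TABLE file_names (
    file_hash TEXT PRIMARY KEY,
    file_name TEXT NOT NULL
);
"""


def add_version(conn, task_id, files):
    """Record a new task version; `files` maps hashed name -> readable name."""
    row = conn.execute(
        "SELECT newest_version FROM tasks WHERE task_id = ?", (task_id,)
    ).fetchone()
    version = 1 if row is None else row[0] + 1
    with conn:  # commit all rows of the new version together
        if row is None:
            conn.execute("INSERT INTO tasks VALUES (?, ?)", (task_id, version))
        else:
            conn.execute(
                "UPDATE tasks SET newest_version = ? WHERE task_id = ?",
                (version, task_id),
            )
        for file_hash, file_name in files.items():
            # Keep the first readable name recorded for a given hash.
            conn.execute(
                "INSERT OR IGNORE INTO file_names VALUES (?, ?)",
                (file_hash, file_name),
            )
            conn.execute(
                "INSERT INTO task_files VALUES (?, ?, ?)",
                (task_id, version, file_hash),
            )
    return version


def files_of(conn, task_id, version):
    """Return {hashed name: readable name} for one version of a task."""
    rows = conn.execute(
        "SELECT f.file_hash, n.file_name FROM task_files f "
        "JOIN file_names n ON n.file_hash = f.file_hash "
        "WHERE f.task_id = ? AND f.version = ?",
        (task_id, version),
    )
    return dict(rows.fetchall())


def revert_to(conn, task_id, version):
    """Publish the file list of an older version as the newest version."""
    return add_version(conn, task_id, files_of(conn, task_id, version))
```

Since a version is only a list of hashes, reverting touches no files in Storage; it just adds rows to the database.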

Conclusion

Files are internally stored under their SHA-1 hashes, so it's easy to implement versioning and to get rid of files with duplicate content (multiple files can have the same content, which is stored only once). Workers also refer to files by their hashes, which makes local caching simple: a file with a given hash never changes, so a worker does not have to track the version number of a cached file. The Database, on the other hand, stores the human-readable names, so that the files are presented in a friendly way to users (teachers) in WebApp.
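
A worker-side cache built on this idea might look like the following minimal sketch. The `fetch` callable stands in for whatever transport the worker actually uses to download a file from File Server; it, `get_cached`, and `cache_dir` are hypothetical names introduced for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def get_cached(cache_dir: Path, file_hash: str,
               fetch: Callable[[str], bytes]) -> Path:
    """Return a local path for `file_hash`, downloading only on a cache miss.

    `fetch` is a placeholder for the real download routine. The content is
    checked against the requested hash before it enters the cache.
    """
    cached = cache_dir / file_hash
    if not cached.exists():
        data = fetch(file_hash)
        if hashlib.sha1(data).hexdigest() != file_hash:
            raise ValueError(f"content does not match hash {file_hash}")
        cache_dir.mkdir(parents=True, exist_ok=True)
        cached.write_bytes(data)
    return cached
```

No invalidation logic is needed in this sketch: a new version of a task simply references different hashes, and unchanged files are found in the cache under the same name.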