fileserver analysis

master
Teyras 8 years ago
parent 1c44f40547
commit 73bea47981

@@ -1250,18 +1250,42 @@ term project for C# course so it might be written and integrated in future.
The fileserver provides access to a shared storage space that contains files
submitted by students, supplementary files such as test inputs and outputs and
results of evaluation. In other words, it acts as an intermediate node for data
passed between the frontend and the backend. This functionality can be easily
separated from the rest of the backend features, which led to designing the
fileserver as a standalone component. Such a design helps encapsulate the details
of how the files are stored (e.g. on a file system, in a database or using a
cloud storage service), while also making it possible to share the storage
between multiple ReCodEx frontends.

@todo: mention hashing on fileserver and why this approach was chosen

@todo: what can be stored on fileserver

@todo: how can jobs be stored on fileserver, mainly mention that it is nonsense to store inputs and outputs within job archive

For early releases of the system, we chose to store all files on the file system
because it is the least complicated solution (in terms of implementation
complexity) and the storage backend can be rather easily migrated to a different
technology.
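
The encapsulation described above can be sketched as a small storage interface with interchangeable backends. This is only an illustration: the `Storage`, `FileSystemStorage` and `InMemoryStorage` names and the `put`/`get` methods are hypothetical and are not taken from the ReCodEx code base.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class Storage(Protocol):
    """Hypothetical interface; the names are assumptions made for this sketch."""

    def put(self, name: str, content: bytes) -> None: ...

    def get(self, name: str) -> bytes: ...


class FileSystemStorage:
    """Keeps every file as a regular file inside one directory."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def put(self, name: str, content: bytes) -> None:
        (self.root / name).write_bytes(content)

    def get(self, name: str) -> bytes:
        return (self.root / name).read_bytes()


class InMemoryStorage:
    """Stand-in for another technology (a database, a cloud service)."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def put(self, name: str, content: bytes) -> None:
        self.files[name] = content

    def get(self, name: str) -> bytes:
        return self.files[name]


def save_and_load(storage: Storage) -> bytes:
    # The caller depends only on the interface, not on where the data lives.
    storage.put("result.txt", b"OK\n")
    return storage.get("result.txt")


with tempfile.TemporaryDirectory() as tmp:
    from_disk = save_and_load(FileSystemStorage(Path(tmp)))
from_memory = save_and_load(InMemoryStorage())
```

Because callers only see `put` and `get`, swapping the file system for a different backend later would not require changes outside the storage class.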

One of the facts we learned from CodEx is that many exercises share test input
and output files, and also that these files can be rather large (hundreds of
megabytes). A direct consequence of this is that we cannot add these files to
submission archives that are to be downloaded by workers, because the combined
size of the archives would quickly exceed gigabytes, which is impractical.
Another conclusion we drew is that a way to deal with duplicate files must be
introduced.

A simple solution to this problem is storing supplementary files under the
hashes of their content. This ensures that every file is stored only once. On
the other hand, it makes it more difficult to understand what the content of a
file is at a glance, which might prove problematic for the administrator.
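
A minimal sketch of storing files under the hash of their content follows. The text does not name the hash function, so SHA-256 is an assumption here, and `store_by_hash` is a hypothetical helper.

```python
import hashlib
import tempfile
from pathlib import Path


def store_by_hash(storage_dir: Path, content: bytes) -> str:
    """Store content under the hex digest of its hash and return the digest."""
    digest = hashlib.sha256(content).hexdigest()  # hash choice is an assumption
    target = storage_dir / digest
    if not target.exists():  # identical content always maps to the same name
        target.write_bytes(content)
    return digest


# Two exercises uploading the same test input end up sharing one stored file.
with tempfile.TemporaryDirectory() as tmp:
    storage = Path(tmp)
    first = store_by_hash(storage, b"1 2 3\n")
    second = store_by_hash(storage, b"1 2 3\n")
    other = store_by_hash(storage, b"4 5 6\n")
    stored_count = len(list(storage.iterdir()))
```

The drawback mentioned above is visible in the result: the directory contains only hexadecimal names, so the original file names have to be tracked elsewhere.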

A notable part of the fileserver's work is done by a web server (e.g. listening
for HTTP requests and caching recently accessed files in memory for faster
access). What remains to be implemented is handling requests that upload files:
student submissions should be stored in archives to facilitate simple
downloading, and supplementary exercise files need to be stored under their
hashes.
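
To illustrate how a submission archive can stay small, the sketch below packs only the submitted sources into a ZIP file and lists the supplementary files by hash instead of embedding them. The `build_submission_archive` helper and the `supplementary.json` manifest are hypothetical; the actual archive layout is not described in this section.

```python
import io
import json
import zipfile
from typing import Mapping


def build_submission_archive(
    sources: Mapping[str, bytes], supplementary_hashes: Mapping[str, str]
) -> bytes:
    """Pack submitted files into a ZIP; large test files are referenced only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in sources.items():
            archive.writestr(name, content)
        # Hypothetical manifest: maps a test file name to the hash it is stored under.
        manifest = json.dumps(dict(supplementary_hashes), sort_keys=True)
        archive.writestr("supplementary.json", manifest)
    return buffer.getvalue()


archive_bytes = build_submission_archive(
    {"main.c": b"int main(void) { return 0; }\n"},
    {"input01.txt": "<hash of input01.txt>"},
)
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    archive_names = sorted(archive.namelist())
```

A worker downloading such an archive would fetch the referenced files separately, so a large shared input is transferred and stored once rather than once per submission.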

We decided to use Python and the Flask web framework. This combination makes it
possible to express the logic in ~100 SLOC and also provides a means to run the
fileserver as a standalone service (without a web server), which is useful for
development.
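
The following is a rough sketch of what such a Flask application might look like, not the actual fileserver code. The route paths, the `STORAGE_DIR` configuration key and the use of SHA-256 are assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path

from flask import Flask, jsonify, request, send_from_directory

app = Flask(__name__)
# Hypothetical storage location; a real deployment would configure this.
app.config["STORAGE_DIR"] = tempfile.mkdtemp(prefix="fileserver-")


@app.route("/supplementary", methods=["POST"])
def upload_supplementary():
    """Store the request body under the hash of its content."""
    content = request.get_data()
    digest = hashlib.sha256(content).hexdigest()  # hash choice is an assumption
    target = Path(app.config["STORAGE_DIR"]) / digest
    if not target.exists():
        target.write_bytes(content)
    return jsonify({"hash": digest})


@app.route("/supplementary/<digest>", methods=["GET"])
def download_supplementary(digest):
    """Serve a stored file; in production a web server would handle this."""
    return send_from_directory(app.config["STORAGE_DIR"], digest)
```

For development, calling `app.run()` starts Flask's built-in server, which corresponds to running the fileserver without a separate web server.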

### Monitor
