fileserver analysis

8 years ago · 73bea47981
parent 1c44f40547
commit 73bea47981
1 changed files with 36 additions and 12 deletions
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -1250,18 +1250,42 @@ term project for C# course so it might be written and integrated in future.
 The fileserver provides access to a shared storage space that contains files
 submitted by students, supplementary files such as test inputs and outputs and
-results of evaluation. This functionality can be easily separated from the rest
+results of evaluation. In other words, it acts as an intermediate node for data
-of the backend features, which led to designing the fileserver as a
+passed between the frontend and the backend. This functionality can be easily
-standalone component. Such design helps encapsulate the details of how the files
+separated from the rest of the backend features, which led to designing the
-are stored (e.g. on a file system, in a database or using a cloud storage
+fileserver as a standalone component. Such a design helps encapsulate the details
-service), while also making it possible to share the storage between multiple
+of how the files are stored (e.g. on a file system, in a database or using a
-ReCodEx frontends.
+cloud storage service), while also making it possible to share the storage
-
+between multiple ReCodEx frontends.
-@todo: mention hashing on fileserver and why this approach was chosen
+
-
+For early releases of the system, we chose to store all files on the file system
-@todo: what can be stored on fileserver
+-- it is the least complicated solution (in terms of implementation complexity)
-
+and the storage backend can be rather easily migrated to a different technology.
-@todo: how can jobs be stored on fileserver, mainly mention that it is nonsense to store inputs and outputs within job archive
+
 One of the facts we learned from CodEx is that many exercises share test input
 and output files, and also that these files can be rather large (hundreds of
 megabytes). A direct consequence of this is that we cannot add these files to
 submission archives that are to be downloaded by workers -- the combined size of
 the archives would quickly exceed gigabytes, which is impractical. Another
 conclusion we drew is that a way to deal with duplicate files must be
 introduced.
 A simple solution to this problem is storing supplementary files under the
 hashes of their content. This ensures that every file is stored only once. On
 the other hand, it makes it more difficult to understand what the content of a
 file is at a glance, which might prove problematic for the administrator.
 A notable part of the fileserver's work is done by a web server (e.g. listening
 to HTTP requests and caching recently accessed files in memory for faster
 access). What remains to be implemented is handling requests that upload files
 -- student submissions should be stored in archives to facilitate simple
 downloading and supplementary exercise files need to be stored under their
 hashes.
 We decided to use Python and the Flask web framework. This combination makes it
 possible to express the logic in ~100 SLOC and also provides means to run the
 fileserver as a standalone service (without a web server), which is useful for
 development.
 ### Monitor
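
Note on the hunk above: the paragraph about storing supplementary files under the hashes of their content can be illustrated with a minimal sketch. This is not the fileserver's actual code. The function name `store_by_hash`, the directory layout, and the choice of SHA-256 are all assumptions made for illustration; the text does not say which hash function is used.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def store_by_hash(data: bytes, storage_dir: Path) -> str:
    """Store `data` under the hex digest of its content and return the digest.

    Hypothetical helper: the name, layout, and SHA-256 are assumptions,
    not taken from the ReCodEx sources.
    """
    digest = hashlib.sha256(data).hexdigest()
    target = storage_dir / digest
    if not target.exists():
        storage_dir.mkdir(parents=True, exist_ok=True)
        # Write to a temporary file first, then rename it into place, so a
        # reader never sees a partially written file under a valid hash.
        fd, tmp_name = tempfile.mkstemp(dir=storage_dir)
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_name, target)
    # Identical content always maps to the same name, so it is stored once.
    return digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        storage = Path(root) / "supplementary"
        first = store_by_hash(b"1 2 3\n", storage)
        second = store_by_hash(b"1 2 3\n", storage)  # duplicate upload
        print(first == second, len(list(storage.iterdir())))
```

The sketch also shows the drawback mentioned in the text: the stored file name is an opaque digest, so an administrator browsing the directory cannot tell what a file contains without a separate mapping from exercise file names to hashes.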