From f453a3978b34232a8b0152ea28f1f76906e26075 Mon Sep 17 00:00:00 2001
From: Martin Polanka
Date: Sun, 22 Jan 2017 23:55:31 +0100
Subject: [PATCH] Cleaner reorg

---
 Rewritten-docs.md | 68 +++++++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 26 deletions(-)

diff --git a/Rewritten-docs.md b/Rewritten-docs.md
index 9bdd59a..ac72802 100644
--- a/Rewritten-docs.md
+++ b/Rewritten-docs.md
@@ -1352,27 +1352,38 @@ cleaner completes particular server specific caching system.
 
 Cleaner as mentioned is simple script which is executed regularly as a cron job.
 If there is caching system like it was introduced in paragraph above there are
-little possibilities how cleaner should be implemented. On various filesystems
-there is usually support for two particular timestamps, `last access time` and
-`last modification time`. Files in cache are once downloaded and then just
-copied, this means that last modification time is set only once on creation of
-file and last access time should be set every time on copy. This imply last
-access time is what is needed here. But last modification time is widely used by
-operating systems, on the other hand last access time is not by default. More on
-this subject can be found
-[here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime).
-For proper cleaner functionality filesystem which is used by worker for caching
-has to have last access time for files enabled.
+little possibilities how cleaner should be implemented.
+
+On various filesystems there is usually support for two particular timestamps,
+`last access time` and `last modification time`. Files in cache are once
+downloaded and then just copied, this means that last modification time is set
+only once on creation of file and last access time should be set every time on
+copy. From this we can conclude that last access time is what is needed here.
+
+But unlike last modification time, last access time is not usually enabled on
+conventional filesystems (more on this subject can be found
+[here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime)).
+So if we choose to use last access time, filesystem used for cache folder has to
+have last access time for files enabled. Last access time was chosen for
+implementation in ReCodEx but this might change in further releases.
+
+However, there is another way, last modification time which is broadly supported
+can be used. But this solution is not automatic and worker would have to 'touch'
+cache files whenever they are accessed. This solution is maybe a bit better than
+the one with last access time and might be implemented in future releases.
+
+#### Caching flow
 
 Having cleaner as separated component and caching itself handled in worker is
-kind of blurry and is not clearly observable that it works without any race
-conditions. The goal here is not to have system without races but to have system
-which can recover from them.
+kind of blurry and is not clearly observable that it works without problems.
+The goal is to have system which can recover from every kind of errors.
 
-#### Caching flow
+Follows description of one possible implementation. This whole mechanism relies
+on worker ability to recover from internal fetch task failure. In case of error
+here job will be reassigned to another worker where problem hopefully does not
+arise.
 
-Follows description of one possible robust implementation. First start with
-worker implementation:
+First start with worker implementation:
 
 - worker discovers fetch task which should download supplementary file
 - worker takes name of file and tries to copy it from cache folder to its
@@ -1380,8 +1391,8 @@ worker implementation:
     - if successful then last access time should be rewritten (by filesystem
       itself) and whole operation is done
     - if not successful then file has to be downloaded
-        - file is downloaded from fileserver to working folder
-        - downloaded file is then copied to cache
+        - file is downloaded from fileserver to working folder and then
+          copied to cache
 
 Previous implementation is only within worker, cleaner can anytime intervene and
 delete files. Implementation in cleaner follows:
@@ -1397,13 +1408,18 @@ delete files. Implementation in cleaner follows:
 Previous description implies that there is gap between detection of last access
 time and deleting file within cleaner. In the gap there can be worker which will
 access file and the file is anyway deleted but this is fine, file is deleted but
-worker has it copied. Another problem can be with two workers downloading the
-same file, but this is also not a problem file is firstly downloaded to working
-folder and after that copied to cache. And even if something else unexpectedly
-fails and because of that fetch task will fail during execution even that should
-be fine. Because fetch tasks should have 'inner' task type which implies that
-fail in this task will stop all execution and job will be reassigned to another
-worker. It should be like the last salvation in case everything else goes wrong.
+worker has it copied. If worker does not copy whole file or even do not start to
+copy it and the file is deleted then copy process will fail. This will cause
+internal task failure which will be handled by reassigning job to another
+worker.
+
+Another problem can be with two workers downloading the same file, but this is
+also not a problem, file is firstly downloaded to working folder and after that
+copied to cache.
+
+And even if something else unexpectedly fails and because of that fetch task
+will fail during execution, even that should be fine as mentioned previously.
+This should be the last salvation in case everything else goes wrong.
 
 ### Monitor
 
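Reviewer sketch, not part of the patch: the caching flow that the added text describes (worker tries the cache first and downloads on a miss, cleaner removes files by last access time) might look roughly like the Python below. The names `fetch`, `clean`, `download` and `max_age_seconds` are hypothetical and do not come from ReCodEx. The sketch also assumes the cache filesystem updates last access time on reads.

```python
import os
import shutil
import time
from pathlib import Path
from typing import Callable, List, Optional


def fetch(name: str, cache_dir: Path, work_dir: Path,
          download: Callable[[str, Path], None]) -> Path:
    """Worker side (hypothetical helper): cache first, fileserver second."""
    target = work_dir / name
    cached = cache_dir / name
    try:
        # Reading the cached file lets the filesystem refresh its atime,
        # assuming atime updates are enabled on the cache mount.
        shutil.copyfile(cached, target)
        return target
    except FileNotFoundError:
        pass  # cache miss, or the cleaner deleted the file a moment ago
    # Download into the working folder first, then copy to the cache,
    # so two workers fetching the same file only share complete files.
    download(name, target)
    shutil.copyfile(target, cached)
    return target
    # Any other exception propagates; the text treats that as an internal
    # task failure that gets the job reassigned to another worker.


def clean(cache_dir: Path, max_age_seconds: float,
          now: Optional[float] = None) -> List[str]:
    """Cleaner side (hypothetical helper): drop files not accessed recently."""
    now = time.time() if now is None else now
    removed = []
    for entry in os.scandir(cache_dir):
        if not entry.is_file():
            continue
        if now - entry.stat().st_atime > max_age_seconds:
            try:
                os.remove(entry.path)
                removed.append(entry.name)
            except FileNotFoundError:
                pass  # somebody else removed it first; nothing to do
    return removed
```

A worker may still read a file between the `st_atime` check and `os.remove`. As the text notes, this is acceptable: the worker either already has its copy or fails the fetch task and the job is reassigned.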