Cleaner reorg

master
Martin Polanka 8 years ago
parent 6048f50293
commit f453a3978b

@ -1352,27 +1352,38 @@ cleaner completes particular server specific caching system.
The cleaner, as mentioned, is a simple script which is executed regularly as a cron job.
If there is a caching system like the one introduced in the paragraph above, there
are only a few ways the cleaner can be implemented.
On various filesystems there is usually support for two particular timestamps,
`last access time` and `last modification time`. Files in the cache are
downloaded once and then only copied, which means that the last modification
time is set only once, on creation of the file, while the last access time
should be set on every copy. From this we can conclude that the last access
time is what is needed here. But unlike the last modification time, the last
access time is usually not enabled on conventional filesystems (more on this
subject can be found
[here](https://en.wikipedia.org/wiki/Stat_%28system_call%29#Criticism_of_atime)).
So if we choose to use the last access time, the filesystem used for the cache
folder has to have last access time tracking enabled for its files. Last access
time was chosen for the implementation in ReCodEx, but this might change in
further releases.
However, there is another way: the last modification time, which is broadly
supported, can be used instead. This solution is not automatic, though, and the
worker would have to 'touch' the cache files whenever they are accessed. This
solution is maybe a bit better than the one with last access time and might be
implemented in future releases.
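The 'touch' alternative can be illustrated with a short sketch. This is only an
illustration in Python, not the actual worker code; the flat `cache_dir` layout
and the function name are assumptions:

```python
import os
import shutil

def copy_from_cache(cache_dir, name, destination):
    """Copy a cached supplementary file and 'touch' it afterwards."""
    cached_path = os.path.join(cache_dir, name)  # assumed flat cache layout
    shutil.copyfile(cached_path, destination)
    # explicitly update the timestamps to now, so even a filesystem
    # mounted with noatime records that the file was just used
    os.utime(cached_path, None)
```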
#### Caching flow
Having the cleaner as a separate component while the caching itself is handled
in the worker is kind of blurry, and it is not immediately obvious that it
works without problems.
The goal is to have a system which can recover from every kind of error.
A description of one possible robust implementation follows. First, start with
the worker implementation (a code sketch of this flow is given right after the
list):
- the worker discovers a fetch task which should download a supplementary file
- the worker takes the name of the file and tries to copy it from the cache folder to its
@ -1380,8 +1391,8 @@ worker implementation:
- if successful, then the last access time should be rewritten (by the
filesystem itself) and the whole operation is done
- if not successful, then the file has to be downloaded
- the file is downloaded from the fileserver to the working folder and then
copied to the cache
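A minimal sketch of this worker-side flow, assuming Python and a hypothetical
`download_from_fileserver` helper (the real worker may structure this
differently):

```python
import os
import shutil

def fetch_supplementary_file(name, cache_dir, working_dir, download_from_fileserver):
    """Obtain a supplementary file, preferring the cache over the fileserver."""
    cached_path = os.path.join(cache_dir, name)
    target_path = os.path.join(working_dir, name)
    try:
        # cache hit: the copy also refreshes the last access time
        shutil.copyfile(cached_path, target_path)
    except OSError:
        # cache miss, or the cleaner deleted the file mid-copy:
        # download into the working folder first...
        download_from_fileserver(name, target_path)
        # ...and only then copy it into the cache (the download itself
        # never happens directly in the cache folder)
        shutil.copyfile(target_path, cached_path)
    return target_path
```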
The previous implementation exists only within the worker; the cleaner can
intervene at any time and delete files. The implementation in the cleaner follows:
@ -1397,13 +1408,18 @@ delete files. Implementation in cleaner follows:
The previous description implies that there is a gap between the detection of
the last access time and the deletion of the file within the cleaner. In this
gap a worker can access a file which is deleted anyway, but this is fine: the
file is deleted, but the worker has already copied it. If the worker does not
copy the whole file, or does not even start to copy it, before the file is
deleted, then the copy process will fail. This will cause an internal task
failure, which will be handled by reassigning the job to another worker.
Another problem can be two workers downloading the same file, but this is also
not an issue, because the file is first downloaded to the working folder and
only after that copied to the cache.
And even if something else unexpectedly fails and the fetch task therefore
fails during execution, even that should be fine, as mentioned previously.
This should be the last salvation in case everything else goes wrong.
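To make the cleaner side concrete, here is a minimal sketch of such a cron
script, assuming Python, an atime-enabled filesystem, and a hypothetical cache
path and retention threshold:

```python
import os
import time

def clean_cache(cache_dir, max_age_seconds):
    """Delete cached files whose last access time exceeds the threshold."""
    now = time.time()
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        try:
            if now - os.stat(path).st_atime > max_age_seconds:
                # a worker may still copy the file between this check and
                # the removal; that race is acceptable, because the worker
                # either already has its copy, or its fetch task fails and
                # the job is reassigned to another worker
                os.remove(path)
        except OSError:
            pass  # the file vanished or is unreadable; nothing to do

# meant to be invoked periodically from cron; path and age are examples
if __name__ == "__main__":
    clean_cache("/var/cache/recodex-worker", 7 * 24 * 3600)
```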
### Monitor
