Skip to content
Snippets Groups Projects
Commit d7117f8c authored by Karsten Loesing's avatar Karsten Loesing
Browse files

Avoid reprocessing webstats files.

Web servers typically provide us with the last 14 days of request
logs. We shouldn't process the whole 14 days over and over. Instead we
should only process new logs files and any other log files containing
log lines from newly written dates.

In some cases web servers stop serving a given virtual host or stop
acting as web server at all. However, in these cases we're left with
14 days of logs per virtual host. Ideally, these logs would get
cleaned up, but until that's the case, we should at least not
reprocess these files over and over.

In order to avoid reprocessing webstats files, we need a new state
file with log dates contained in given input files. We use that state
file to determine which of the previously processed webstats files to
re-process, so that we can write complete daily logs.
parent d5ca95a2
No related branches found
No related tags found
No related merge requests found
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment