Commit 2b3f4b90 authored by Karsten Loesing

Be smarter about re-importing consensuses.

A recent analysis of Metrics' back-end performance has revealed that
importing consensuses into the database can take between a few seconds
and a few *hours*.  More precisely, importing a consensus for the
first time takes seconds and re-importing a consensus that was already
(partially) contained in the database can take hours.  The reason for
the latter is that we're checking for every status entry whether it's
contained in the database before we're inserting it, and these 7k
queries are crazy expensive.  What we should do, which is what we're
doing now, is request and store a list of fingerprints of the status
entries already contained for a given consensus and only insert a
status entry if its fingerprint is not in that list.  Now we can avoid
making these 7k queries and re-import a consensus within seconds.
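
The approach can be sketched roughly as follows.  This is only an
illustration in Python with an in-memory SQLite database, not the
actual Metrics code: the table name statusentry, its columns, and the
function import_consensus are assumptions made for this example.

```python
import sqlite3


def import_consensus(conn, valid_after, entries):
    """Insert only the status entries not yet stored for this consensus.

    `entries` is an iterable of (fingerprint, nickname) tuples.  The
    table and column names are hypothetical.  Returns the number of
    newly inserted rows.
    """
    # One query per consensus instead of one query per status entry.
    rows = conn.execute(
        "SELECT fingerprint FROM statusentry WHERE validafter = ?",
        (valid_after,),
    )
    known = {row[0] for row in rows}

    new_rows = []
    for fingerprint, nickname in entries:
        if fingerprint in known:
            continue  # already imported, skip without touching the database
        known.add(fingerprint)  # also guards against duplicates in the input
        new_rows.append((valid_after, fingerprint, nickname))

    conn.executemany(
        "INSERT INTO statusentry (validafter, fingerprint, nickname) "
        "VALUES (?, ?, ?)",
        new_rows,
    )
    conn.commit()
    return len(new_rows)
```

Membership checks against the in-memory set replace the per-entry
lookups, so a full re-import costs one SELECT plus inserts for the
missing entries only.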

There were two situations in which we re-imported one or more
consensuses, which took hours or more: whenever the host was rebooted
during the database import and we lost import history, and whenever
CollecTor fetched an outdated consensus from a directory authority
that it had already received the hour before and that Metrics had
already imported in its previous run.
parent 5611ef45