<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/iwakeh/collector, branch task-25317</title>
<subtitle>iwakeh's personal collector repository</subtitle>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/'/>
<entry>
<title>fixup! Circumvent Collection (integer) size limit.</title>
<updated>2018-02-20T16:30:17+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2018-02-20T16:30:17+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=23c2df8d6cebb5e39b7130e5dbae2bb5415f606f'/>
<id>23c2df8d6cebb5e39b7130e5dbae2bb5415f606f</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Circumvent Collection (integer) size limit.</title>
<updated>2018-02-20T16:30:16+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2018-02-20T16:30:14+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=1d55d819aaf7c6ff7c0aeffcaeba3ac2786536ce'/>
<id>1d55d819aaf7c6ff7c0aeffcaeba3ac2786536ce</id>
<content type='text'>
Clean log lines as soon as they are read and exploit the high redundancy of
sanitized logs right away, i.e., continue with maps of type
Map&lt;LocalDate, Map&lt;String, Long&gt;&gt;.

Rename method(s) to reflect what they do.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Clean log lines as soon as they are read and exploit the high redundancy of
sanitized logs right away, i.e., continue with maps of type
Map&lt;LocalDate, Map&lt;String, Long&gt;&gt;.

Rename method(s) to reflect what they do.
</pre>
</div>
</content>
</entry>
<entry>
<title>Reduce memory footprint and wall time.</title>
<updated>2018-02-20T16:30:13+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2018-02-20T16:30:09+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=8557bf6255e6e3745088033e8e7bad7801421686'/>
<id>8557bf6255e6e3745088033e8e7bad7801421686</id>
<content type='text'>
Adapt to the latest changes of metrics-lib (task-25329) and make use of the
high redundancy of logs (e.g., a 3G file might contain only 350 different
lines). This avoids OOM and array-out-of-bounds exceptions for large files
(&gt;2G) and cuts wall time roughly in half. (The earlier 66min are down to
34min for meronense&amp;weschniakowii files plus two larger files.)

There is a BATCH constant, which could be tuned for processing speed. Its
value is logged for each webstats module run. Currently, it is set to 100k.
This value was chosen more or less arbitrarily and was used for all the tests.
A test run using 500k didn't show significant differences.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adapt to the latest changes of metrics-lib (task-25329) and make use of the
high redundancy of logs (e.g., a 3G file might contain only 350 different
lines). This avoids OOM and array-out-of-bounds exceptions for large files
(&gt;2G) and cuts wall time roughly in half. (The earlier 66min are down to
34min for meronense&amp;weschniakowii files plus two larger files.)

There is a BATCH constant, which could be tuned for processing speed. Its
value is logged for each webstats module run. Currently, it is set to 100k.
This value was chosen more or less arbitrarily and was used for all the tests.
A test run using 500k didn't show significant differences.
</pre>
</div>
</content>
</entry>
<entry>
<title>Adapt CollecTor to latest metrics-lib master branch.</title>
<updated>2018-02-20T16:30:08+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2018-02-20T16:30:08+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=fbb35f75da022a23912b937b1825d8f216abad07'/>
<id>fbb35f75da022a23912b937b1825d8f216abad07</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add hasContent method to make even more use of DescriptorBuilder.</title>
<updated>2018-02-20T16:30:07+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2017-10-27T17:35:19+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=5b68aaf8aa7c5f3769544061344e75f7884e87ef'/>
<id>5b68aaf8aa7c5f3769544061344e75f7884e87ef</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Make logging statements comply with Metrics' standards.</title>
<updated>2018-02-20T16:30:07+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2017-10-27T17:35:18+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=43cd15876635d763d0f6adbf6bcc5c7df6380406'/>
<id>43cd15876635d763d0f6adbf6bcc5c7df6380406</id>
<content type='text'>
Also edit here and there for better readability and fewer lines.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Also edit here and there for better readability and fewer lines.
</pre>
</div>
</content>
</entry>
<entry>
<title>Use DescriptorBuilder more often.</title>
<updated>2018-02-20T16:30:07+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2017-10-27T17:35:17+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=4e61bb792bc4cd4db9df6eb49ab88890b34ff489'/>
<id>4e61bb792bc4cd4db9df6eb49ab88890b34ff489</id>
<content type='text'>
Add convenience constructor accepting the first string as argument.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add convenience constructor accepting the first string as argument.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add a finalized state to DescriptorBuilder.</title>
<updated>2018-02-20T16:30:03+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2017-10-27T17:35:16+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=afe07d8efd4dc94b9dfb9b5896002286ba71dc6d'/>
<id>afe07d8efd4dc94b9dfb9b5896002286ba71dc6d</id>
<content type='text'>
To avoid possible inconsistencies, DescriptorBuilder is finalized after the
first call to 'toString' and cannot be altered anymore.  Any attempt to add
more content leads to an IllegalStateException.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To avoid possible inconsistencies, DescriptorBuilder is finalized after the
first call to 'toString' and cannot be altered anymore.  Any attempt to add
more content leads to an IllegalStateException.
</pre>
</div>
</content>
</entry>
<entry>
<title>Use Java8 idiom for toString method.</title>
<updated>2018-02-20T16:29:39+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2017-10-27T17:35:15+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=2457eb5be72d508c4ec4e2d2c3b6f7a88c69ed4c'/>
<id>2457eb5be72d508c4ec4e2d2c3b6f7a88c69ed4c</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Make DescriptorBuilder also accept DescriptorBuilders.</title>
<updated>2018-02-20T16:29:39+00:00</updated>
<author>
<name>iwakeh</name>
<email>iwakeh@torproject.org</email>
</author>
<published>2017-10-27T17:35:14+00:00</published>
<link rel='alternate' type='text/html' href='https://gitweb.torproject.org/user/iwakeh/collector.git/commit/?id=fbfa16c05b3f74acd60ccdf780568e7e1b0b9e1b'/>
<id>fbfa16c05b3f74acd60ccdf780568e7e1b0b9e1b</id>
<content type='text'>
This might facilitate easier processing of descriptors.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This might facilitate easier processing of descriptors.
</pre>
</div>
</content>
</entry>
</feed>
