Commits · HEAD · Legacy / gitolite / user / gsathya / stem

Feb 18, 2013
Accounting for NULL access by ctypes · 08529723

Damian Johnson authored Feb 17, 2013

Evidently accessing argc can raise a ValueError...

https://trac.torproject.org/8266

08529723

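
A minimal sketch of the failure mode, assuming a NULL ctypes pointer as a stand-in for the argc lookup. The helper name read_int_or_none is hypothetical and not part of stem:

```python
import ctypes


def read_int_or_none(pointer):
    """Return the int a ctypes pointer refers to, or None if it is NULL."""
    try:
        return pointer.contents.value
    except ValueError:
        # ctypes raises ValueError ("NULL pointer access") on a NULL dereference
        return None


null_pointer = ctypes.POINTER(ctypes.c_int)()    # a NULL int pointer
valid_pointer = ctypes.pointer(ctypes.c_int(3))  # points at a real c_int
```

Catching the ValueError lets the caller treat a missing value as "unknown" rather than crashing.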
Feb 17, 2013

Providing a string when str() is called on descriptors · 1a099106

Damian Johnson authored Feb 17, 2013

Python 2.x gets pretty confused when an object's __str__ method provides a
unicode string. Calling...

>>> str(desc)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xab' in position 28: ordinal not in range(128)

Providing an ascii str in python 2.x and unicode str in python 3.x. Thanks to
Sathyanarayanan for the catch!

1a099106
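
A rough sketch of the idea, not stem's actual code. The Descriptor class here is a hypothetical stand-in, and replacing non-ascii characters in python 2.x is an assumption about how the conversion might be done:

```python
import sys


class Descriptor(object):
    """Hypothetical stand-in for a descriptor holding unicode content."""

    def __init__(self, content):
        self._content = content  # unicode text

    def __str__(self):
        if sys.version_info[0] >= 3:
            return self._content  # python 3: str is already unicode

        # python 2: str() must return bytes, so swap out what ascii can't express
        return self._content.encode("ascii", "replace")


desc = Descriptor(u"contact Johan Bl\xe5b\xe4ck")
```

With this, str(desc) returns the native string type on either interpreter instead of raising a UnicodeEncodeError.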

Adding copyright headers · d70a2d39

Damian Johnson authored Feb 17, 2013

Uggg I hate IP law. As pointed out by Juan on...

https://trac.torproject.org/7954

... we need copyright headers to properly comply with the requirements of being
under the LGPL. I'm not looking forward to keeping this up to date (and likely
won't), but oh well.

I've added the header to all python source files except the unit and integ
tests (patches welcome if someone wants to spend the time adding those too).
These headers...

* Declare a copyright from the year of the file's creation to now (2013).

* Include Sean if he worked on it (he is the only contributor that hasn't made
  his contributions public domain to avoid copyright headaches).

d70a2d39

Cleaning up our TODO comments · 4a2e7e74

Damian Johnson authored Feb 17, 2013

Several of our TODO comments were no longer relevant or could be expanded.

4a2e7e74

Treat descriptor archive contents as individual files · 6927e68d

Damian Johnson authored Feb 16, 2013

When the descriptor reader encountered an archive and read non-descriptor
content it stopped reading. This has caused me almost two weeks of headaches in
troubleshooting...

https://trac.torproject.org/8049

Changing the reader's behaviour to instead handle each file within the archive
separately. Thanks to Karsten for catching this!

6927e68d
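
The per-file handling described above might look roughly like the following. This is a sketch using the standard tarfile module. The read_archive function and its is_descriptor callback are hypothetical names, not the reader's real internals:

```python
import io
import tarfile


def read_archive(archive, is_descriptor):
    """Yield (name, content) for each member that passes is_descriptor,
    skipping (rather than stopping at) members that do not."""
    for member in archive:
        if not member.isfile():
            continue  # directories, links, etc.

        content = archive.extractfile(member).read()

        if is_descriptor(content):
            yield member.name, content


# Build a small in-memory tarball with one non-descriptor and one descriptor.
buffer = io.BytesIO()

with tarfile.open(fileobj=buffer, mode="w") as archive:
    for name, data in (("notes.txt", b"not a descriptor"), ("desc", b"router caerSidi")):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))

buffer.seek(0)

with tarfile.open(fileobj=buffer) as archive:
    found = list(read_archive(archive, lambda content: content.startswith(b"router ")))
```

The non-descriptor member comes first, yet the descriptor after it is still read.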

Feb 16, 2013

Adding get_archive_path() method to descriptors · f83c7efc

Damian Johnson authored Feb 16, 2013

We can't use a TarInfo's 'name' attribute for get_path() since that corresponds
to its location within the archive. That said, I've often wanted both paths, so
I'm both fixing get_path() for tarballs and adding a get_archive_path().

f83c7efc

Dropping the 'path' argument from stem.descriptor.parse_file() · 8cdcb088

Damian Johnson authored Feb 15, 2013

File objects have a 'name' attribute that we can use to guess the path. This
isn't entirely reliable, but nothing is...

http://stackoverflow.com/questions/2458676/absolute-path-of-a-file-object

The path argument was only there to support the descriptor reader. Now that
parse_file() is something for our users it's nice to get rid of arguments they
can't use.

8cdcb088
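
As a sketch of the 'name' attribute approach (guess_path is a hypothetical helper, not the function stem uses):

```python
import io
import os
import tempfile


def guess_path(file_obj):
    """Best-effort path for a file object, or None if it has no usable name."""
    name = getattr(file_obj, "name", None)

    if not isinstance(name, str):
        return None  # e.g. io.BytesIO (no name) or os.fdopen (integer name)

    return os.path.abspath(name)


with tempfile.NamedTemporaryFile() as temp_file:
    temp_path = guess_path(temp_file)
    expected_path = os.path.abspath(temp_file.name)

in_memory_path = guess_path(io.BytesIO(b"router caerSidi"))
```

In-memory files have no name, which is one of the ways the guess can come up empty.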

Feb 09, 2013

Accepting "NEVER" expiration in ADDRMAP events · 2a952ec9

Damian Johnson authored Feb 09, 2013

The expiry value in ADDRMAP events can be 'NEVER'. This is a little troublesome
since it means that the field might or might not be quoted (making this unique
among all tor events).

Caught by Desoxy on 'https://trac.torproject.org/8162'.

2a952ec9
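
One way to cope with a field that may or may not be quoted is sketched below. The parse_expiry function and its pattern are illustrative assumptions, not stem's parser, and the timestamp format is assumed to be 'YYYY-MM-DD HH:MM:SS':

```python
import datetime
import re

# Expiry is either a quoted timestamp or the bare word NEVER.
EXPIRY_PATTERN = re.compile(r'^(?:"(?P<time>[^"]+)"|(?P<never>NEVER))$')


def parse_expiry(field):
    """Return a datetime for a quoted expiry, or None for NEVER."""
    match = EXPIRY_PATTERN.match(field)

    if match is None:
        raise ValueError("unrecognized expiry: %r" % field)

    if match.group("never"):
        return None

    return datetime.datetime.strptime(match.group("time"), "%Y-%m-%d %H:%M:%S")
```

For instance, parse_expiry('NEVER') gives None while parse_expiry('"2012-11-19 00:50:13"') gives a datetime.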

Fixing SingleLineResponse interlinking · dba169f5

Damian Johnson authored Feb 09, 2013

The SingleLineResponse class wasn't in the module's __all__, causing it to not
appear in the sphinx output.

dba169f5

Improving stem.response.convert() pydocs · 74b60f75

Damian Johnson authored Feb 09, 2013

The convert() pydocs were pretty clunky. Replacing the listing with a nice
table mapping the response_type to classes, like what we do elsewhere.

74b60f75

Adding a ControlMessage.from_str() function · 5f8b7b42

Damian Johnson authored Feb 09, 2013

In discussions with Mike about using stem for txtorcon a major use has been
response parsing. Using stem for this is dead easy, but requires a hack. Adding
a function to negate the need for hackery.

5f8b7b42

Feb 08, 2013

Noting '(Tor_internal)' addresses in the pydocs · f96d5f64

Damian Johnson authored Feb 08, 2013

Noting that StreamEvents can have '(Tor_internal)' as a target address.

Spec change:
https://gitweb.torproject.org/torspec.git/commitdiff/3ad9d19e03bd816e1e0f0b9eeb839ee1eedcaedf

f96d5f64

Using numeric 'flag-thresholds' values · ea521286

Damian Johnson authored Feb 08, 2013

Now that the spec has been revised to specify numeric values we can provide
'flag => int/float' mappings (which are much nicer for our users).

Spec change:
https://gitweb.torproject.org/torspec.git/commitdiff/52d0eb4858ad3eb191df3afe324f43683467ae22

ea521286
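
A sketch of what a 'flag => int/float' mapping could look like. The parse_flag_thresholds function is hypothetical, the sample keys are only illustrative, and treating a trailing '%' as a fraction is an assumption rather than something this commit states:

```python
def parse_flag_thresholds(value):
    """Map each 'key=value' pair to an int, or a float when it has a '.' or '%'."""
    thresholds = {}

    for entry in value.split():
        key, _, raw = entry.partition("=")

        if raw.endswith("%"):
            thresholds[key] = float(raw[:-1]) / 100  # assumed: percent -> fraction
        elif "." in raw:
            thresholds[key] = float(raw)
        else:
            thresholds[key] = int(raw)

    return thresholds


thresholds = parse_flag_thresholds("stable-uptime=693369 guard-wfu=94.669% enough-mtbf=1")
```

Callers can then compare the values numerically rather than parsing strings themselves.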

Feb 06, 2013

Minor stylistic corrections · e8784466

Damian Johnson authored Feb 06, 2013

A couple of PEP8 issues slipped in concerning spacing between code and inline
comments.

e8784466

Avoiding static /tmp usage · 3687dde6

Damian Johnson authored Feb 06, 2013

Our tests had static /tmp paths in a couple of places. Issue caught by Dererk
and patch by Abhishek...

https://trac.torproject.org/7926

3687dde6
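
For reference, the usual alternative to a hard-coded /tmp path is the tempfile module. The following is a generic sketch rather than the patch itself, and the 'stem_test_' prefix and torrc content are made up:

```python
import os
import shutil
import tempfile

test_dir = tempfile.mkdtemp(prefix="stem_test_")  # unique, private directory

try:
    torrc_path = os.path.join(test_dir, "torrc")

    with open(torrc_path, "w") as torrc_file:
        torrc_file.write("ControlPort 9051\n")

    created = os.path.exists(torrc_path)
finally:
    shutil.rmtree(test_dir)  # clean up even if the test body fails
```

Each run gets its own directory, so concurrent test runs and other users on the system can't collide with it.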

Feb 05, 2013

Support for 'flag-thresholds' lines in network status votes · 4e8aaa4d

Damian Johnson authored Feb 05, 2013

Parsing the new 'flag-thresholds' in network status votes - thanks to Karsten
for pointing this out.

metrics-lib change:
https://gitweb.torproject.org/metrics-lib.git/commitdiff/c2a0dbf8bf100a19660ad512b88d93f3d7c18a1e

dir-spec addition:
https://trac.torproject.org/8165

4e8aaa4d

Feb 04, 2013

Allowing for IPv4 'a' lines in router status entries · c6a9cde0

Damian Johnson authored Feb 03, 2013

Karsten reports on ticket #8036 that IPv4 addresses are indeed allowed on a
router status entry's 'a' line. This is a little unfortunate since it means a
less friendly attribute but not a big whoop.

c6a9cde0

Try to make minor descriptor versions clearer. · b8baf77c

Karsten Loesing authored Feb 04, 2013 and Damian Johnson committed Feb 03, 2013

b8baf77c

Using port lists for addresses_v6 rather than ranges · 6c99a28e

Damian Johnson authored Feb 03, 2013

Huh, I wonder where I got the idea that 'a' lines had port ranges. Dropping
that. According to the spec the 'a' lines should be parsed in the same way as
'or-address'. However, I suspect that the spec is a little off here - checking
if it can contain IPv4 addresses...

Caught by Karsten on...

https://trac.torproject.org/8036

6c99a28e

Renaming check_whitespace.py to static_checks.py · d44018a5

Damian Johnson authored Feb 03, 2013

The check_whitespace.py module no longer... well, checks whitespace. Rather, it
has become a dumping ground for all of the static checks that we do. Renaming
it to something more appropriate.

d44018a5

Feb 03, 2013

Providing alternative methods for parsing a NetworkStatusDocument · ea0b73a5

Damian Johnson authored Feb 03, 2013

Adding support in both the DescriptorReader and parse_file() function for three
ways of parsing network status documents...

a. Provide the router status entries (ie. the current behavior).

b. Provide the document itself with the router status entries that it contains.
This has the biggest cost in terms of upfront parsing time and memory usage,
but provides the caller with everything they might want.

c. Provide the document but skip reading the router status entries. A handy
option if you just care about the document's header/footer.

Now that we have these capabilities I'm further simplifying the descriptor API a
bit. The network status docs encouraged users to use the NetworkStatusDocument
constructors to achieve option 'b' above, but now that it's in the reader and
parse_file() there's no reason for them to do that.

Users should now *always* use either the DescriptorReader or parse_file(). If
they don't then they're off the reservation.

ea0b73a5

Dropping Version.meets_requirements() in favour of comparisons · 479f5356

Damian Johnson authored Feb 03, 2013

Once upon a time you checked your requirements via simple comparisons...

  if my_version >= requirement:
    ... do stuff...

I reluctantly changed this to a meets_requirements() method when we added the
VersionRequirements class since it was no longer simple comparisons the __cmp__
method could handle. However, now that we're using rich comparison operators we
can go back to the nicer style of comparisons. Apologies for any confusion
this back-and-forth has caused.

479f5356
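
To illustrate the rich comparison style, here is a simplified sketch. This Version class is a hypothetical stand-in that only compares numeric components. It does not model stem's VersionRequirements handling:

```python
import functools


@functools.total_ordering
class Version(object):
    """Hypothetical stand-in that compares dotted versions numerically."""

    def __init__(self, version_str):
        self.parts = tuple(int(part) for part in version_str.split("."))

    def __eq__(self, other):
        return self.parts == other.parts

    def __lt__(self, other):
        return self.parts < other.parts

    def __hash__(self):
        return hash(self.parts)


my_version = Version("0.2.3.16")
requirement = Version("0.2.2.36")

if my_version >= requirement:
    meets_requirement = True
else:
    meets_requirement = False
```

functools.total_ordering fills in the remaining operators from __eq__ and __lt__, which is what makes the plain '>=' check work without a __cmp__ method.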

More succinct python 3 warning for parse_file() · 30146d77

Damian Johnson authored Feb 03, 2013

Read speeds and universal newline translation are both addressed by reading in
binary mode. There's no need to have a separate warning for each.

30146d77

Feb 02, 2013

Using binary mode for the controller socket file · 9cd4c9fe

Damian Johnson authored Feb 02, 2013

Yay! Now that I have a version of python 3 that doesn't segfault I can finish
making our integ tests work.

The socket file used for controller connections should be normalized to use
binary mode. This is its behavior in python 2.x, and in 3.x having it in text
mode can cause sadness.

Exception in thread Tor Listener:
Traceback (most recent call last):
  File "/home/atagar/Desktop/Python-3.3.0/Lib/threading.py", line 639, in _bootstrap_inner
    self.run()
  File "/home/atagar/Desktop/Python-3.3.0/Lib/threading.py", line 596, in run
    self._target(*self._args, **self._kwargs)
  File "/home/atagar/Desktop/stem/test/data/python3/stem/control.py", line 573, in _reader_loop
    control_message = self._socket.recv()
  File "/home/atagar/Desktop/stem/test/data/python3/stem/socket.py", line 115, in recv
    return recv_message(socket_file)
  File "/home/atagar/Desktop/stem/test/data/python3/stem/socket.py", line 539, in recv_message
    line = control_file.readline()
  File "/home/atagar/Desktop/Python-3.3.0/Lib/codecs.py", line 300, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 2005: invalid continuation byte

After addressing this and a few encoding issues the controller integ tests now
pass, but after we're done testing python spews out a dump following...

*** glibc detected *** python3: munmap_chunk(): invalid pointer: 0x097f1620 ***

At this point I'm pretty well persuaded that the python 3.x series leaves
something to be desired in terms of stability.

9cd4c9fe
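
A small sketch of why binary mode helps, using socket.socketpair() as a stand-in for a real control connection. The message content is made up:

```python
import socket

tor_side, our_side = socket.socketpair()  # stand-in for a control connection

# 0xc3 followed by '(' is not valid utf-8, so a text-mode file could not decode it
tor_side.sendall(b"250-version=\xc3(\r\n")

socket_file = our_side.makefile(mode="rwb")  # binary: lines come back as bytes
line = socket_file.readline()

socket_file.close()
tor_side.close()
our_side.close()
```

Since nothing is decoded while reading, arbitrary bytes from the socket can't trigger a UnicodeDecodeError in the reader thread.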

Python 3.x support · 3930f1f1

Damian Johnson authored Feb 02, 2013

Adding support for the python 3.x series. You can install the python 3 version
of stem by running...

python3 setup.py install

The 2to3 conversion can be tested through run_tests.py with the '--python3'
argument. It passes all of the unit tests and the integ tests... er, don't
technically fail. However, python 3.2 has a bug causing a segfault when it gets
to the BaseController integ tests. Filed a ticket about it...

http://bugs.python.org/issue17105

However, stem's descriptor functionality checks out and this issue has likely
been addressed in later python releases so there's little point to hold off on
merging.

Ticket for python 3 support...

https://trac.torproject.org/7843

3930f1f1

Converting cookie auth token to unicode · 2b2a645a

Damian Johnson authored Feb 01, 2013

Well, this is dumb. Making a formatted string with ascii bytes includes the b''
wrapper in python 3. This broke our authentication calls, making calls like...

AUTHENTICATE b'd55e81eb9c3a1e22a2db919ec2efd22df4aeb88ee0ab3d10e64dbb2450d06921'

Converting the token to unicode to avoid this.

======================================================================
ERROR: test_authenticate_cookie
----------------------------------------------------------------------
Traceback:
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 400, in _check_auth
    stem.connection.authenticate_cookie(control_socket, auth_arg)
  File "/home/atagar/Desktop/stem/test/data/python3/stem/connection.py", line 604, in authenticate_cookie
    raise CookieAuthRejected(str(auth_response), cookie_path, False, auth_response)
stem.connection.CookieAuthRejected: Invalid hexadecimal encoding.  Maybe you tried a plain text password?  If so, the standard requires that you put it in double quotes.

During handling of the above exception, another exception occurred:

Traceback:
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 303, in test_authenticate_cookie
    self._check_auth(auth_type, auth_value)
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 411, in _check_auth
    failure_msg = _get_auth_failure_message(auth_type)
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 99, in _get_auth_failure_message
    raise ValueError("No methods of authentication. If this is an open socket then auth shouldn't fail.")
ValueError: No methods of authentication. If this is an open socket then auth shouldn't fail.

2b2a645a
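
The b'' wrapper is easy to reproduce under python 3. In this sketch the cookie bytes are placeholders rather than a real cookie:

```python
import binascii

cookie = b"\xd5\x5e\x81\xeb"      # placeholder bytes, not a real cookie
token = binascii.hexlify(cookie)  # hexlify returns bytes under python 3

broken = "AUTHENTICATE %s" % token                 # embeds the bytes repr
fixed = "AUTHENTICATE %s" % token.decode("ascii")  # decode to text first
```

Here 'broken' contains the b'' wrapper that tor rejects as invalid hexadecimal, while 'fixed' contains only the hex digits.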

Replacing file() with open() · 384411b2

Damian Johnson authored Feb 01, 2013

I'm not sure why we were using file() at one point rather than open(), but it
makes python 3 sad...

======================================================================
ERROR: test_authenticate_cookie
----------------------------------------------------------------------
Traceback:
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 303, in test_authenticate_cookie
    self._check_auth(auth_type, auth_value)
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 400, in _check_auth
    stem.connection.authenticate_cookie(control_socket, auth_arg)
  File "/home/atagar/Desktop/stem/test/data/python3/stem/connection.py", line 583, in authenticate_cookie
    cookie_data = _read_cookie(cookie_path, False)
  File "/home/atagar/Desktop/stem/test/data/python3/stem/connection.py", line 877, in _read_cookie
    with file(cookie_path, 'rb', 0) as f:
NameError: global name 'file' is not defined

384411b2

Accounting for ascii/unicode for network status documents · a047a74a

Damian Johnson authored Feb 01, 2013

Woohoo! Last descriptor type. Unlike the other descriptor types callers are
encouraged to sometimes use our NetworkStatusDocument classes directly so
swapping the input to unicode if we get ascii.

With this all of the descriptor integ tests now pass with python 3!

a047a74a

Pydoc missing version from descriptor type listing · 284797ac

Damian Johnson authored Feb 01, 2013

The 'network-status-microdesc-consensus-3' listing was missing the '1.0'.

284797ac

Using stem.descriptor.parse_file() for extrainfo integ tests · 4815afe5

Damian Johnson authored Jan 31, 2013

Going through parse_file() so we do the proper unicode conversion.

======================================================================
ERROR: test_cached_descriptor
----------------------------------------------------------------------
Traceback:
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/descriptor/extrainfo_descriptor.py", line 150, in test_cached_descriptor
    for desc in stem.descriptor.extrainfo_descriptor._parse_file(descriptor_file):
  File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/extrainfo_descriptor.py", line 155, in _parse_file
    extrainfo_content = stem.descriptor._read_until_keywords("router-signature", descriptor_file)
  File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/__init__.py", line 350, in _read_until_keywords
    line_match = KEYWORD_LINE.match(line)
TypeError: can't use a string pattern on a bytes-like object

4815afe5

Using binary mode when reading descriptors · 3afa4346

Damian Johnson authored Jan 31, 2013

Now Damian, repeat after me: text mode is bad.

In python 2.x text mode and binary mode seem to be indistinguishable, but in
python 3 there's one tiny little difference: text mode is around 33x slower.
The integ test that read the cached-consensus took over five minutes (by
comparison to ten seconds with python 2.7), and in one case simply hung for
twenty minutes before I killed it.

I'm not aware of any disadvantage to using binary mode, so opting for that.

3afa4346

Skipping deletion of pyc in __pycache__ · d9d46cb5

Damian Johnson authored Jan 31, 2013

I disabled the deletion of orphaned pyc files when testing python 3 but on
reflection that wasn't enough. Python 2.x test runs still delete the python 3
bytecode. Changing the orphaned check to skip those files.

d9d46cb5
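
One possible shape for such a check, as a sketch only. The orphaned_pyc_files function is a hypothetical name, and the real test runner may be structured differently:

```python
import os
import shutil
import tempfile


def orphaned_pyc_files(root):
    """Yield *.pyc paths lacking a sibling *.py, ignoring __pycache__."""
    for dirpath, dirnames, filenames in os.walk(root):
        # pruning in place stops os.walk from descending into __pycache__
        dirnames[:] = [name for name in dirnames if name != "__pycache__"]

        for filename in filenames:
            if filename.endswith(".pyc"):
                pyc_path = os.path.join(dirpath, filename)

                if not os.path.exists(pyc_path[:-1]):  # 'foo.pyc' -> 'foo.py'
                    yield pyc_path


# Demo tree: b.pyc is orphaned, a.pyc is not, and __pycache__ is left alone.
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "__pycache__"))

for relative_path in ("a.py", "a.pyc", "b.pyc", os.path.join("__pycache__", "c.cpython-33.pyc")):
    open(os.path.join(demo_dir, relative_path), "w").close()

orphans = list(orphaned_pyc_files(demo_dir))
shutil.rmtree(demo_dir)
```

Only b.pyc is reported, so a python 2.x run would no longer remove the python 3 bytecode.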

Fixing server descriptor test expecting unicode · bec2e972

Damian Johnson authored Jan 31, 2013

One of the server descriptor integ tests had a failing assertion because the
expected text was ASCII bytes and the descriptor content was unicode. Fixing
the test and moving the to_unicode helper to str_tools where it belongs.

======================================================================
FAIL: test_non_ascii_descriptor
----------------------------------------------------------------------
Traceback:
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/descriptor/server_descriptor.py", line 221, in test_non_ascii_descriptor
    self.assertEquals(expected_contact, desc.contact)
AssertionError: '2048R/F171EC1F Johan BlÃ¥bÃ¤ck ã\x81\x93ã\x82\x93ã\x81«ã\x81¡ã\x81¯' != '2048R/F171EC1F Johan Blåbäck こんにちは'
- 2048R/F171EC1F Johan BlÃ¥bÃ¤ck ããã«ã¡ã¯
+ 2048R/F171EC1F Johan Blåbäck こんにちは

bec2e972

Checking that to_bytes has unicode before converting · f4ee5ab3

Damian Johnson authored Jan 30, 2013

Adding a check to the to_bytes() helper so we don't attempt to convert ASCII
bytes to ASCII bytes (which doesn't work so well).

======================================================================
ERROR: test_old_descriptor
----------------------------------------------------------------------
Traceback:
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/descriptor/server_descriptor.py", line 120, in test_old_descriptor
    desc = stem.descriptor.server_descriptor.RelayDescriptor(descriptor_contents)
  File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/server_descriptor.py", line 642, in __init__
    self._validate_content()
  File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/server_descriptor.py", line 687, in _validate_content
    key_der_as_hash = hashlib.sha1(stem.util.str_tools.to_bytes(key_as_bytes)).hexdigest()
  File "/home/atagar/Desktop/stem/test/data/python3/stem/util/str_tools.py", line 73, in to_bytes
    return _to_bytes(msg)
  File "/home/atagar/Desktop/stem/test/data/python3/stem/util/str_tools.py", line 54, in _to_bytes
    return codecs.latin_1_encode(msg)[0]
TypeError: Can't convert 'bytes' object to str implicitly

f4ee5ab3
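
The guard amounts to something like the following python 3 sketch. It is a simplified stand-in for the helper, with latin-1 chosen because the traceback above shows latin_1_encode:

```python
def to_bytes(msg):
    """Encode text to bytes; hand back anything that is already bytes."""
    if isinstance(msg, str):
        return msg.encode("latin-1")

    return msg  # already bytes, nothing to convert
```

Calling it on its own output is then harmless, which is what the failing test needed.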

Changing is_python_2* prereq checks to include python 3 · 46f1bfb9

Damian Johnson authored Jan 30, 2013

The is_python_26 and is_python_27 checks tested if we were 2.6-2.x or 2.7-2.x.
On reflection it makes more sense for these to be '2.y and above' checks rather
than '2.y and above in the 2.x series'.

46f1bfb9
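
A '2.y and above' check can be written as a tuple comparison. This sketch reuses the function names from the message, but the bodies are an assumption about how they might be implemented:

```python
import sys


def is_python_26():
    """True for python 2.6 and anything newer, including the 3.x series."""
    return sys.version_info[:2] >= (2, 6)


def is_python_27():
    """True for python 2.7 and anything newer, including the 3.x series."""
    return sys.version_info[:2] >= (2, 7)
```

Tuples compare element by element, so (3, 3) >= (2, 7) is true without any special casing for the major version.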

Providing ASCII bytes to hashlib.sha1() · 6c855996

Damian Johnson authored Jan 30, 2013

Another unicode/ASCII bytes conversion issue...

======================================================================
ERROR: test_metrics_descriptor
----------------------------------------------------------------------
Traceback:
  File "/home/atagar/Desktop/stem/test/data/python3/test/integ/descriptor/server_descriptor.py", line 89, in test_metrics_descriptor
    self.assertEquals("2C7B27BEAB04B4E2459D89CA6D5CD1CC5F95A689", desc.digest())
  File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/server_descriptor.py", line 666, in digest
    digest_hash = hashlib.sha1(for_digest)
TypeError: Unicode-objects must be encoded before hashing

6c855996
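
The error and its remedy can be reproduced in a few lines under python 3. The descriptor text here is made up:

```python
import hashlib

for_digest = "router caerSidi 71.35.133.197 9001 0 0\n"  # made-up descriptor text

try:
    hashlib.sha1(for_digest)
    rejected = False
except TypeError:
    rejected = True  # python 3 refuses to hash a str

# Encoding to bytes first gives hashlib something it can digest.
digest = hashlib.sha1(for_digest.encode("ascii")).hexdigest().upper()
```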

Skipping newline translation for descriptor integ tests · 9ff618ee

Damian Johnson authored Jan 30, 2013

Using a custom open() call for python 3's integ tests to prevent newline
translation (and the resulting test failures).

9ff618ee

Checking for 2to3 and python3 when needed · 8a3cfb57

Damian Johnson authored Jan 30, 2013

Warning the user if 2to3 or python3 aren't in our PATH when the user provides
the '--python3' testing argument.

8a3cfb57
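
A sketch of such a PATH check using shutil.which(), which is available in python 3.3 and later. The missing_commands helper and the warning text are hypothetical, and the actual test runner may use its own utility for this:

```python
import shutil


def missing_commands(*commands):
    """Return the commands that cannot be found on the PATH."""
    return [command for command in commands if shutil.which(command) is None]


missing = missing_commands("2to3", "python3")

if missing:
    print("Unable to test python 3, missing: %s" % ", ".join(missing))
```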

Skip universal newline translation in descriptor reader · 0288267b

Damian Johnson authored Jan 30, 2013

Python 3 enables universal newline translation by default in text mode,
converting '\r' and '\r\n' into '\n' as a file is read. This is a really neat
feature and will solve many a headache... but not for us. We conform to the tor
spec which specifies when CRLF appears versus other newline types.

Universal newline translation broke our ability to read the
'cr_in_contact_line' example which has multiple '\r' within a contact line
(https://trac.torproject.org/5637). Fixing the reader to disable newline
translation and adding a warning to our parse_file() pydocs.

0288267b
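
The effect is easy to see with a file that mixes newline styles. This sketch uses made-up content and shows two ways of avoiding the translation under python 3:

```python
import os
import tempfile

content = b"contact foo\rbar\r\nrouter-signature\n"

with tempfile.NamedTemporaryFile(delete=False) as temp_file:
    temp_file.write(content)
    path = temp_file.name

try:
    with open(path) as text_file:  # default text mode translates newlines
        translated = text_file.read()

    with open(path, newline="") as text_file:  # newline='' leaves them alone
        untranslated = text_file.read()

    with open(path, "rb") as binary_file:  # binary mode never translates
        raw = binary_file.read()
finally:
    os.remove(path)
```

In the default mode the lone '\r' within the contact line becomes a line break, which is what confuses a parser that expects the original bytes.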

Adding --python3 to the run_tests.py help output · d0f4a0c4

Damian Johnson authored Jan 30, 2013

d0f4a0c4