  1. Feb 18, 2013
  2. Feb 17, 2013
    • Providing a string when str() is called on descriptors · 1a099106
      Damian Johnson authored
      Python 2.x gets pretty confused when an object's __str__ method provides a
      unicode string. Calling...
      
      >>> str(desc)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      UnicodeEncodeError: 'ascii' codec can't encode character u'\xab' in position 28: ordinal not in range(128)
      
      Providing an ascii str in python 2.x and unicode str in python 3.x. Thanks to
      Sathyanarayanan for the catch!
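
A minimal sketch of the idea, not the actual stem code: the `Descriptor` class below is a hypothetical stand-in, and replacing non-ASCII characters with '?' under python 2.x is an assumption about how the ascii str might be produced.

```python
import sys


class Descriptor(object):
  """Hypothetical stand-in for a descriptor that holds unicode content."""

  def __init__(self, content):
    self._content = content  # unicode text

  def __unicode__(self):
    return self._content

  def __str__(self):
    if sys.version_info[0] >= 3:
      # python 3.x: str is unicode, so the content can be returned as-is
      return self._content

    # python 2.x: str() must get bytes back, so provide ascii
    # (assumption: characters outside of ascii are replaced with '?')
    return self._content.encode('ascii', 'replace')
```

With this, str(desc) no longer raises a UnicodeEncodeError under python 2.x when the content has characters like u'\xab'.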
    • Adding copyright headers · d70a2d39
      Damian Johnson authored
      Uggg I hate IP law. As pointed out by Juan on...
      
      https://trac.torproject.org/7954
      
      ... we need copyright headers to properly comply with the requirements of being
      under the LGPL. I'm not looking forward to keeping this up to date (and likely
      won't), but oh well.
      
      I've added the header to all python source files except the unit and integ
      tests (patches welcome if someone wants to spend the time adding those too).
      These headers...
      
      * Declare a copyright from the year of the file's creation to now (2013).
      
      * Include Sean if he worked on it (he is the only contributor that hasn't made
        his contributions public domain to avoid copyright headaches).
    • Cleaning up our TODO comments · 4a2e7e74
      Damian Johnson authored
      Several of our TODO comments were no longer relevant or could be expanded.
    • Treat descriptor archive contents as individual files · 6927e68d
      Damian Johnson authored
      When the descriptor reader encountered an archive and read non-descriptor
      content it stopped reading. This has caused me almost two weeks of headaches in
      troubleshooting...
      
      https://trac.torproject.org/8049
      
      Changing the reader's behaviour to instead handle each file within the archive
      separately. Thanks to Karsten for catching this!
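
The new behaviour can be sketched roughly as follows. This is an illustration with the standard tarfile module rather than the reader's real code; `parse_descriptor` and `read_archive` are hypothetical helpers, and the 'router ' prefix check is only a placeholder for real descriptor parsing.

```python
import tarfile


def parse_descriptor(content):
  """Hypothetical parser that only accepts content starting with 'router '."""

  if not content.startswith(b'router '):
    raise ValueError('not a descriptor')

  return content.split(b'\n', 1)[0].decode('ascii')


def read_archive(fileobj):
  """Parses each file in a tarball on its own, so one bad file can't stop the rest."""

  parsed, skipped = [], []

  with tarfile.open(fileobj = fileobj) as archive:
    for member in archive:
      if not member.isfile():
        continue  # directories, links, etc

      entry = archive.extractfile(member)

      try:
        parsed.append(parse_descriptor(entry.read()))
      except ValueError:
        # non-descriptor content only affects this one file
        skipped.append(member.name)

  return parsed, skipped
```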
  3. Feb 16, 2013
  4. Feb 09, 2013
  5. Feb 08, 2013
  6. Feb 06, 2013
  7. Feb 05, 2013
  8. Feb 04, 2013
  9. Feb 03, 2013
    • Providing alternative methods for parsing a NetworkStatusDocument · ea0b73a5
      Damian Johnson authored
      Adding support in both the DescriptorReader and parse_file() function for three
      ways of parsing network status documents...
      
      a. Provide the router status entries (ie. the current behavior).
      
      b. Provide the document itself with the router status entries that it contains.
         This has the biggest cost in terms of upfront parsing time and memory usage,
         but provides the caller with everything they might want.
      
      c. Provide the document but skip reading the router status entries. A handy
         option if you just care about the document's header/footer.
      
      Now that we have this capability I'm further simplifying the descriptor API a
      bit. The network status docs encouraged users to use the NetworkStatusDocument
      constructors to achieve option 'b' above, but now that it's in the reader and
      parse_file() there's no reason for them to do that.
      
      Users should now *always* use either the DescriptorReader or parse_file(). If
      they don't then they're off the reservation.
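
The three options can be modelled with a toy parser. Everything here is a hypothetical sketch: the `Handler` names, the `parse_status` function, and the 'r ' prefix used to spot router status entries are assumptions for illustration, not the signature of the DescriptorReader or parse_file().

```python
from enum import Enum


class Handler(Enum):
  """Hypothetical names for options a, b, and c above."""

  ENTRIES = 'entries'              # a. just the router status entries
  DOCUMENT = 'document'            # b. the document with its entries
  BARE_DOCUMENT = 'bare_document'  # c. the document without entries


def parse_status(lines, handler = Handler.ENTRIES):
  """Toy parser where lines starting with 'r ' are router status entries."""

  header = [line for line in lines if not line.startswith('r ')]
  entries = [line for line in lines if line.startswith('r ')]

  if handler is Handler.ENTRIES:
    for entry in entries:
      yield entry
  elif handler is Handler.DOCUMENT:
    yield {'header': header, 'entries': entries}
  elif handler is Handler.BARE_DOCUMENT:
    # skips the entries, handy if only the header/footer matter
    yield {'header': header, 'entries': []}
  else:
    raise ValueError('unrecognized handler: %s' % handler)
```

Option 'a' yields many small objects, while 'b' and 'c' each yield a single document.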
    • Dropping Version.meets_requirements() in favour of comparisons · 479f5356
      Damian Johnson authored
      Once upon a time you checked your requirements via simple comparisons...
      
        if my_version >= requirement:
          ... do stuff...
      
      I reluctantly changed this to a meets_requirements() method when we added the
      VersionRequirements class, since the checks were no longer simple comparisons
      that the __cmp__ method could handle. However, now that we're using rich
      comparison operators we
      can go back to the nicer style of comparisons. Apologies for any confusion
      this back-and-forth has caused.
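
As a rough illustration of why rich comparison operators make this possible: a `__ge__` method can inspect what it is being compared against. The `Version` and `Requirement` classes below are simplified, hypothetical stand-ins (purely numeric versions, a single min/max rule), not stem's classes.

```python
class Requirement(object):
  """Hypothetical requirement: at least 'minimum', and under 'below' if given."""

  def __init__(self, minimum, below = None):
    self.minimum = minimum
    self.below = below

  def is_met_by(self, version):
    if version.parts < self.minimum.parts:
      return False

    return self.below is None or version.parts < self.below.parts


class Version(object):
  """Simplified version that only handles dotted integers, like '0.2.3.1'."""

  def __init__(self, version_str):
    self.parts = tuple(int(part) for part in version_str.split('.'))

  def __eq__(self, other):
    return isinstance(other, Version) and self.parts == other.parts

  def __hash__(self):
    return hash(self.parts)

  def __ge__(self, other):
    # unlike __cmp__, a rich comparison can special-case its argument
    if isinstance(other, Requirement):
      return other.is_met_by(self)
    elif isinstance(other, Version):
      return self.parts >= other.parts

    return NotImplemented
```

Callers can then write 'if my_version >= requirement:' for both plain versions and requirements.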
    • More succinct python 3 warning for parse_file() · 30146d77
      Damian Johnson authored
      Read speeds and universal newline translation are both addressed by reading in
      binary mode. There's no need to have a separate warning for each.
  10. Feb 02, 2013
    • Using binary mode for the controller socket file · 9cd4c9fe
      Damian Johnson authored
      Yay! Now that I have a version of python 3 that doesn't segfault I can finish
      making our integ tests work.
      
      The socket file used for controller connections should be normalized to use
      binary mode. This is its behavior in python 2.x, and in 3.x having it in text
      mode can cause sadness.
      
      Exception in thread Tor Listener:
      Traceback (most recent call last):
        File "/home/atagar/Desktop/Python-3.3.0/Lib/threading.py", line 639, in _bootstrap_inner
          self.run()
        File "/home/atagar/Desktop/Python-3.3.0/Lib/threading.py", line 596, in run
          self._target(*self._args, **self._kwargs)
        File "/home/atagar/Desktop/stem/test/data/python3/stem/control.py", line 573, in _reader_loop
          control_message = self._socket.recv()
        File "/home/atagar/Desktop/stem/test/data/python3/stem/socket.py", line 115, in recv
          return recv_message(socket_file)
        File "/home/atagar/Desktop/stem/test/data/python3/stem/socket.py", line 539, in recv_message
          line = control_file.readline()
        File "/home/atagar/Desktop/Python-3.3.0/Lib/codecs.py", line 300, in decode
          (result, consumed) = self._buffer_decode(data, self.errors, final)
      UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 2005: invalid continuation byte
      
      After addressing this and a few encoding issues the controller integ tests now
      pass, but after we're done testing python spews out a dump following...
      
      *** glibc detected *** python3: munmap_chunk(): invalid pointer: 0x097f1620 ***
      
      At this point I'm pretty well persuaded that the python 3.x series leaves
      something to be desired in terms of stability.
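
A small sketch of the binary mode approach, assuming a platform where socket.socketpair() is available. The `read_control_line` helper is hypothetical and only demonstrates socket.makefile(); it isn't how stem's socket module is structured.

```python
import socket


def read_control_line(raw):
  """Sends raw bytes over a socket pair and reads a line back in binary mode."""

  sender, receiver = socket.socketpair()
  socket_file = receiver.makefile(mode = 'rb')  # binary, so no implicit decoding

  try:
    sender.sendall(raw)
    return socket_file.readline()
  finally:
    socket_file.close()
    sender.close()
    receiver.close()
```

Since the line comes back as bytes, the caller decides how to decode it. For instance line.decode('utf-8', 'replace') tolerates invalid sequences rather than raising a UnicodeDecodeError in the reader thread.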
    • Python 3.x support · 3930f1f1
      Damian Johnson authored
      Adding support for the python 3.x series. You can install the python 3 version
      of stem by running...
      
      python3 setup.py install
      
      The 2to3 conversion can be tested through run_tests.py with the '--python3'
      argument. It passes all of the unit tests and the integ tests... er, don't
      technically fail. However, python 3.2 has a bug causing a segfault when it gets
      to the BaseController integ tests. Filed a ticket about it...
      
      http://bugs.python.org/issue17105
      
      However, stem's descriptor functionality checks out and this issue has likely
      been addressed in later python releases, so there's little point in holding
      off on merging.
      
      Ticket for python 3 support...
      
      https://trac.torproject.org/7843
    • Converting cookie auth token to unicode · 2b2a645a
      Damian Johnson authored
      Well, this is dumb. Making a formatted string with ascii bytes includes the b''
      wrapper in python 3. This broke our authentication calls, making calls like...
      
      AUTHENTICATE b'd55e81eb9c3a1e22a2db919ec2efd22df4aeb88ee0ab3d10e64dbb2450d06921'
      
      Converting the token to unicode to avoid this.
      
      ======================================================================
      ERROR: test_authenticate_cookie
      ----------------------------------------------------------------------
      Traceback:
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 400, in _check_auth
          stem.connection.authenticate_cookie(control_socket, auth_arg)
        File "/home/atagar/Desktop/stem/test/data/python3/stem/connection.py", line 604, in authenticate_cookie
          raise CookieAuthRejected(str(auth_response), cookie_path, False, auth_response)
      stem.connection.CookieAuthRejected: Invalid hexadecimal encoding.  Maybe you tried a plain text password?  If so, the standard requires that you put it in double quotes.
      
      During handling of the above exception, another exception occurred:
      
      Traceback:
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 303, in test_authenticate_cookie
          self._check_auth(auth_type, auth_value)
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 411, in _check_auth
          failure_msg = _get_auth_failure_message(auth_type)
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 99, in _get_auth_failure_message
          raise ValueError("No methods of authentication. If this is an open socket then auth shouldn't fail.")
      ValueError: No methods of authentication. If this is an open socket then auth shouldn't fail.
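
The b'' wrapper issue is easy to reproduce in python 3 on its own. The cookie bytes below are made up for the example, and decoding as ascii is an assumption that is safe for hex digits.

```python
import binascii

cookie = b'\x01\x02\xab'          # made-up cookie file contents
token = binascii.b2a_hex(cookie)  # bytes under python 3

# formatting bytes into a str uses their repr, which includes the b'' wrapper
broken = 'AUTHENTICATE %s' % token

# decoding the hex token to unicode first avoids this
fixed = 'AUTHENTICATE %s' % token.decode('ascii')
```

Under python 3 the first call produces the rejected form shown above (with b'...' around the token), while the second produces a plain hex argument.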
    • Replacing file() with open() · 384411b2
      Damian Johnson authored
      I'm not sure why we were using file() at one point rather than open(), but it
      makes python 3 sad...
      
      ======================================================================
      ERROR: test_authenticate_cookie
      ----------------------------------------------------------------------
      Traceback:
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 303, in test_authenticate_cookie
          self._check_auth(auth_type, auth_value)
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/connection/authentication.py", line 400, in _check_auth
          stem.connection.authenticate_cookie(control_socket, auth_arg)
        File "/home/atagar/Desktop/stem/test/data/python3/stem/connection.py", line 583, in authenticate_cookie
          cookie_data = _read_cookie(cookie_path, False)
        File "/home/atagar/Desktop/stem/test/data/python3/stem/connection.py", line 877, in _read_cookie
          with file(cookie_path, 'rb', 0) as f:
      NameError: global name 'file' is not defined
    • Accounting for ascii/unicode for network status documents · a047a74a
      Damian Johnson authored
      Woohoo! Last descriptor type. Unlike the other descriptor types, callers are
      sometimes encouraged to use our NetworkStatusDocument classes directly, so
      swapping the input to unicode if we get ascii.
      
      With this all of the descriptor integ tests now pass with python 3!
    • Pydoc missing version from descriptor type listing · 284797ac
      Damian Johnson authored
      The 'network-status-microdesc-consensus-3' listing was missing the '1.0'.
    • Using stem.descriptor.parse_file() for extrainfo integ tests · 4815afe5
      Damian Johnson authored
      Going through parse_file() so we do the proper unicode conversion.
      
      ======================================================================
      ERROR: test_cached_descriptor
      ----------------------------------------------------------------------
      Traceback:
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/descriptor/extrainfo_descriptor.py", line 150, in test_cached_descriptor
          for desc in stem.descriptor.extrainfo_descriptor._parse_file(descriptor_file):
        File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/extrainfo_descriptor.py", line 155, in _parse_file
          extrainfo_content = stem.descriptor._read_until_keywords("router-signature", descriptor_file)
        File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/__init__.py", line 350, in _read_until_keywords
          line_match = KEYWORD_LINE.match(line)
      TypeError: can't use a string pattern on a bytes-like object
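
The TypeError above comes from matching a str pattern against bytes read in binary mode. A minimal python 3 sketch of the conversion step follows. The simplified `KEYWORD_LINE` pattern and the `keyword_of` helper are assumptions for illustration, not stem's actual regex or parser.

```python
import re

# assumption: simplified keyword pattern, not the one stem uses
KEYWORD_LINE = re.compile(r'^([a-zA-Z0-9-]+)(?:[ \t]+(.*))?$')


def keyword_of(line):
  """Provides the keyword of a descriptor line, accepting bytes or unicode."""

  if isinstance(line, bytes):
    # str patterns can't be applied to bytes, so convert to unicode first
    line = line.decode('utf-8', 'replace')

  match = KEYWORD_LINE.match(line.rstrip('\r\n'))
  return match.group(1) if match else None
```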
    • Using binary mode when reading descriptors · 3afa4346
      Damian Johnson authored
      Now Damian, repeat after me: text mode is bad.
      
      In python 2.x text mode and binary mode seem to be indistinguishable, but in
      python 3 there's one tiny little difference: text mode is around 33x slower.
      The integ test that read the cached-consensus took over five minutes (by
      comparison to ten seconds with python 2.7), and in one case simply hung for
      twenty minutes before I killed it.
      
      I'm not aware of any disadvantage to using binary mode, so opting for that.
    • Skipping deletion of pyc in __pycache__ · d9d46cb5
      Damian Johnson authored
      I disabled the deletion of orphaned pyc files when testing python 3 but on
      reflection that wasn't enough. Python 2.x test runs still delete the python 3
      bytecode. Changing the orphaned check to skip those files.
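
One way such an orphaned check could look, as a sketch over a plain list of paths. The `find_orphaned_pyc` name and the use of '/' as the path separator are assumptions, and the test runner's real check likely walks the filesystem instead.

```python
def find_orphaned_pyc(paths):
  """Provides the pyc files that lack a matching py file, ignoring __pycache__."""

  paths = set(paths)
  orphans = []

  for path in sorted(paths):
    if not path.endswith('.pyc'):
      continue

    # python 3 keeps its bytecode under __pycache__, leave those alone
    if '__pycache__' in path.split('/'):
      continue

    if path[:-1] not in paths:  # 'foo.pyc' => 'foo.py'
      orphans.append(path)

  return orphans
```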
    • Fixing server descriptor test expecting unicode · bec2e972
      Damian Johnson authored
      One of the server descriptor integ tests had a failing assertion because the
      expected text was ASCII bytes and the descriptor content was unicode. Fixing
      the test and moving the to_unicode helper to str_tools where it belongs.
      
      ======================================================================
      FAIL: test_non_ascii_descriptor
      ----------------------------------------------------------------------
      Traceback:
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/descriptor/server_descriptor.py", line 221, in test_non_ascii_descriptor
          self.assertEquals(expected_contact, desc.contact)
      AssertionError: '2048R/F171EC1F Johan BlÃ¥bäck ã\x81\x93ã\x82\x93ã\x81«ã\x81¡ã\x81¯' != '2048R/F171EC1F Johan Blåbäck こんにちは'
      - 2048R/F171EC1F Johan Blåbäck こんにちは
      + 2048R/F171EC1F Johan Blåbäck こんにちは
    • Checking that to_bytes has unicode before converting · f4ee5ab3
      Damian Johnson authored
      Adding a check to the to_bytes() helper so we don't attempt to convert ASCII
      bytes to ASCII bytes (which doesn't work so well).
      
      ======================================================================
      ERROR: test_old_descriptor
      ----------------------------------------------------------------------
      Traceback:
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/descriptor/server_descriptor.py", line 120, in test_old_descriptor
          desc = stem.descriptor.server_descriptor.RelayDescriptor(descriptor_contents)
        File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/server_descriptor.py", line 642, in __init__
          self._validate_content()
        File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/server_descriptor.py", line 687, in _validate_content
          key_der_as_hash = hashlib.sha1(stem.util.str_tools.to_bytes(key_as_bytes)).hexdigest()
        File "/home/atagar/Desktop/stem/test/data/python3/stem/util/str_tools.py", line 73, in to_bytes
          return _to_bytes(msg)
        File "/home/atagar/Desktop/stem/test/data/python3/stem/util/str_tools.py", line 54, in _to_bytes
          return codecs.latin_1_encode(msg)[0]
      TypeError: Can't convert 'bytes' object to str implicitly
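
The check can be sketched like this for python 3, where str is unicode. This mirrors the latin_1_encode() call in the traceback, but the function body is otherwise an assumption rather than a copy of str_tools.

```python
import codecs


def to_bytes(msg):
  """Converts unicode to bytes under python 3, leaving bytes untouched."""

  if isinstance(msg, str):
    # only unicode needs to be encoded
    return codecs.latin_1_encode(msg, 'replace')[0]

  return msg  # already bytes, encoding it again would raise a TypeError
```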
    • Changing is_python_2* prereq checks to include python 3 · 46f1bfb9
      Damian Johnson authored
      The is_python_26 and is_python_27 checks were testing if we were 2.6-2.x or 2.7-2.x.
      On reflection it makes more sense for these to be '2.y and above' checks rather
      than '2.y and above in the 2.x series'.
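
A '2.y and above' check boils down to a tuple comparison. The optional `version_info` argument below is a hypothetical addition to make the sketch easy to exercise, and the real prereq functions may be written differently.

```python
import sys


def is_python_26(version_info = None):
  """Checks if we're running python 2.6 or later, including the 3.x series."""

  major, minor = (version_info or sys.version_info)[:2]
  return (major, minor) >= (2, 6)


def is_python_27(version_info = None):
  """Checks if we're running python 2.7 or later, including the 3.x series."""

  major, minor = (version_info or sys.version_info)[:2]
  return (major, minor) >= (2, 7)
```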
    • Providing ASCII bytes to hashlib.sha1() · 6c855996
      Damian Johnson authored
      Another unicode/ASCII bytes conversion issue...
      
      ======================================================================
      ERROR: test_metrics_descriptor
      ----------------------------------------------------------------------
      Traceback:
        File "/home/atagar/Desktop/stem/test/data/python3/test/integ/descriptor/server_descriptor.py", line 89, in test_metrics_descriptor
          self.assertEquals("2C7B27BEAB04B4E2459D89CA6D5CD1CC5F95A689", desc.digest())
        File "/home/atagar/Desktop/stem/test/data/python3/stem/descriptor/server_descriptor.py", line 666, in digest
          digest_hash = hashlib.sha1(for_digest)
      TypeError: Unicode-objects must be encoded before hashing
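
In python 3 hashlib only accepts bytes, so unicode content has to be encoded first. A minimal sketch follows, where the `digest` helper and the choice of utf-8 are assumptions made for the example.

```python
import hashlib


def digest(content):
  """Provides the upper case hex sha1 digest of bytes or unicode content."""

  if isinstance(content, str):
    # assumption: utf-8, hashlib raises a TypeError if given unicode
    content = content.encode('utf-8')

  return hashlib.sha1(content).hexdigest().upper()
```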
    • Skipping newline translation for descriptor integ tests · 9ff618ee
      Damian Johnson authored
      Using a custom open() call for python 3's integ tests to prevent newline
      translation (and the resulting test failures).
    • Checking for 2to3 and python3 when needed · 8a3cfb57
      Damian Johnson authored
      Warning the user if 2to3 or python3 aren't in our PATH when the user provides
      the '--python3' testing argument.
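
A check like this can be sketched with shutil.which(), which is available as of python 3.3. The `missing_commands` helper is hypothetical, and the test runner may do its PATH lookup some other way.

```python
import shutil


def missing_commands(commands):
  """Provides the commands that can't be found in our PATH."""

  return [cmd for cmd in commands if shutil.which(cmd) is None]
```

For instance missing_commands(['2to3', 'python3']) would give the names to warn the user about, if any.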
    • Skip universal newline translation in descriptor reader · 0288267b
      Damian Johnson authored
      Python 3 enables universal newline translation by default in text mode,
      converting '\r' and '\r\n' into '\n' when reading. This is a really neat
      feature and will solve many a headache... but not for us. We conform to the
      tor spec, which specifies when CRLF appears versus other newline types.
      
      Universal newline translation broke our ability to read the
      'cr_in_contact_line' example which has multiple '\r' within a contact line
      (https://trac.torproject.org/5637). Fixing the reader to disable newline
      translation and adding a warning to our parse_file() pydocs.
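
The effect is easy to see by reading the same bytes three ways. This is a standalone sketch where the `read_variants` helper and the sample content are made up for illustration.

```python
import io
import os
import tempfile


def read_variants(raw):
  """Writes raw bytes to a temporary file, then reads it back three ways."""

  fd, path = tempfile.mkstemp()

  try:
    with os.fdopen(fd, 'wb') as output_file:
      output_file.write(raw)

    # default text mode: '\r' and '\r\n' are translated to '\n'
    with io.open(path, 'r', encoding = 'utf-8') as text_file:
      translated = text_file.read()

    # text mode with newline = '': line endings are left untouched
    with io.open(path, 'r', encoding = 'utf-8', newline = '') as text_file:
      untranslated = text_file.read()

    # binary mode: no translation or decoding at all
    with io.open(path, 'rb') as binary_file:
      binary = binary_file.read()
  finally:
    os.remove(path)

  return translated, untranslated, binary
```

Only the first read loses the '\r' within a contact line, which is why disabling translation (or reading in binary mode) keeps content like the 'cr_in_contact_line' example intact.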
    • Damian Johnson's avatar
      d0f4a0c4