Commit 8ec17d80 authored by Damian Johnson

Better error message if file objects aren't seekable

As reported by teor, stem.descriptor's parse_file() function cannot accept
stdin...

  https://trac.torproject.org/projects/tor/ticket/23859

The trouble is that not all file objects in Python are seekable. I'd *like*
to handle this transparently by buffering the content...

  try:
    descriptor_file.tell()
  except IOError:
    # file's not seekable, wrapping in a buffer that is

    descriptor_file = io.BytesIO(descriptor_file.read())

This works great if our stream has content...

  % cat my_descriptors | python demo.py

*But* it hangs indefinitely if the stream never reaches EOF.

  % python demo.py   <= hangs

Turns out non-blocking, platform-independent reading of streams like stdin is
pretty tricky...

  http://eyalarubas.com/python-subproc-nonblock.html
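
To illustrate why this is tricky, here is a minimal sketch (not part of this commit) that uses select() to ask whether a stream has data waiting. The has_pending_input() helper is hypothetical. It assumes a POSIX platform, since on Windows select() only accepts sockets, and it only tells us that *some* data is readable, not that an EOF will ever arrive, so a following read() can still block.

```python
import os
import select


def has_pending_input(stream, timeout=0.0):
    """Return True if `stream` has data readable within `timeout` seconds.

    Hypothetical helper, POSIX only: on Windows select() accepts sockets
    but not pipes or regular file objects.
    """
    readable, _, _ = select.select([stream], [], [], timeout)
    return bool(readable)


# An OS pipe stands in for stdin so the example is self-contained.
read_fd, write_fd = os.pipe()
reader = os.fdopen(read_fd, 'rb')

before = has_pending_input(reader)   # nothing has been written yet

os.write(write_fd, b'router example\n')
after = has_pending_input(reader)    # data is now waiting in the pipe

os.close(write_fd)
reader.close()
```
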

As such, this simply provides callers with a more descriptive exception. If
they know their stream won't block, *they* can add the above wrapper to
provide us with a seekable file object.
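
For callers, the wrapper could be sketched roughly as follows. The ensure_seekable() name is hypothetical and not part of stem. It assumes the stream is binary and is known to end, since read() blocks until EOF. A pipe stands in for stdin here so the example runs on its own.

```python
import io
import os


def ensure_seekable(file_obj):
    """Return file_obj if it is seekable, else a BytesIO copy of its content.

    Hypothetical caller-side helper. read() blocks until EOF, so only use
    this when the stream is known to end.
    """
    try:
        file_obj.tell()
        return file_obj
    except IOError:
        # Not seekable, so buffer everything in memory instead.
        return io.BytesIO(file_obj.read())


read_fd, write_fd = os.pipe()
os.write(write_fd, b'descriptor content\n')
os.close(write_fd)  # closing the write end is what supplies the EOF

with os.fdopen(read_fd, 'rb') as pipe_reader:
    seekable_file = ensure_seekable(pipe_reader)

content = seekable_file.read()

# Objects that are already seekable are handed back untouched.
already_seekable = io.BytesIO(b'other content')
unchanged = ensure_seekable(already_seekable)
```

The result can then be handed to parse_file(). With Python 3, stdin's binary stream is sys.stdin.buffer rather than sys.stdin itself.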
parent d42e9e19
@@ -93,6 +93,13 @@ __all__ = [
   'Descriptor',
 ]
 
+UNSEEKABLE_MSG = """\
+File object isn't seekable. Try wrapping it with a BytesIO instead...
+
+  content = my_file.read()
+  parsed_descriptors = stem.descriptor.parse_file(io.BytesIO(content))
+"""
+
 KEYWORD_CHAR = 'a-zA-Z0-9-'
 WHITESPACE = ' \t'
 KEYWORD_LINE = re.compile('^([%s]+)(?:[%s]+(.*))?$' % (KEYWORD_CHAR, WHITESPACE))
@@ -218,6 +225,16 @@ def parse_file(descriptor_file, descriptor_type = None, validate = False, docume
     return
 
+  # Not all files are seekable. If unseekable then advising the user.
+  #
+  # Python 3.x adds an io.seekable() method, but not an option with python 2.x
+  # so using an experimental call to tell() to determine this.
+
+  try:
+    descriptor_file.tell()
+  except IOError:
+    raise IOError(UNSEEKABLE_MSG)
+
   # The tor descriptor specifications do not provide a reliable method for
   # identifying a descriptor file's type and version so we need to guess
   # based on its filename. Metrics descriptors, however, can be identified
......
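
As a standalone illustration of the check above (a sketch, not the code in this commit), the following prefers seekable() when the object offers it and otherwise falls back to probing with tell(). The is_seekable() name is hypothetical. A pipe serves as the unseekable stream.

```python
import io
import os


def is_seekable(file_obj):
    """Best-effort seekability check (hypothetical helper, not stem's API)."""
    seekable = getattr(file_obj, 'seekable', None)

    if callable(seekable):
        return seekable()

    # Python 2's builtin file type has no seekable(), so probe with tell().
    try:
        file_obj.tell()
        return True
    except IOError:
        return False


read_fd, write_fd = os.pipe()

with os.fdopen(read_fd, 'rb') as pipe_reader:
    pipe_is_seekable = is_seekable(pipe_reader)

os.close(write_fd)

buffer_is_seekable = is_seekable(io.BytesIO(b'content'))
```
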