Commit 27a654e2 authored Mar 26, 2016 by Damian Johnson

Explicit non-ascii content validation

A few hours ago a relay started publishing a malformed extrainfo descriptor...

https://trac.torproject.org/projects/tor/ticket/18656

This is fine, bugs happen. This is why we check for malformed content. But
this one cost me a few hours since the non-ascii content then caused DocTor to
choke, providing me a useless stacktrace...

Traceback (most recent call last):
  File "./descriptor_checker.py", line 99, in <module>
    main()
  File "./descriptor_checker.py", line 57, in main
    log.warn("Unable to retrieve the %s: %s" % (descriptor_type, query.error))
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 76: ordinal not in range(128)

Now it instead provides...

source: http://86.59.21.38:80/tor/extra/all.z
time: 03/26/2016 16:30
error: 'dirreq-v3-reqs' line had non-ascii content: S?=4026597208,S?=4026597208,S?=4026597208,S?=4026597208,S?=4026597208,S?=4026597208,??=4026591624,6?=4026537520,6?=4026537520,6?=4026537520,us=8

Non-ascii strings are toxic for systems they're in. They break printing,
logging, exception handling, and anything else that touches them unless they're
handled specially.

Changing Stem to explicitly validate that content is ascii, and provide the
user with escaped strings in those exceptions.
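
The kind of check described above might look roughly like the following. This
is only a sketch, not the code from this commit: the function name
validate_ascii and its arguments are made up for illustration, and it assumes
Python 3.5+ (where 'backslashreplace' can be used when decoding).

```python
def validate_ascii(keyword, value):
    """
    Hypothetical helper: returns the line's value as text if it is pure
    ascii, otherwise raises a ValueError whose message is itself ascii.

    keyword is the descriptor line's keyword, value is its raw bytes.
    """

    try:
        return value.decode('ascii')
    except UnicodeDecodeError:
        # Replace each offending byte with a \xNN escape so the message can
        # be printed or logged without another encoding error.
        escaped = value.decode('ascii', 'backslashreplace')
        raise ValueError("'%s' line had non-ascii content: %s" % (keyword, escaped))
```

Because the offending bytes are escaped before they reach the exception
message, callers can log or print the error without special handling.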

parent 50f94029


Showing with 58 additions and 5 deletions