Commit 1eeefc46 authored by Karsten Loesing
Temp commit: add first draft of exit list spec.

parent e7da8f63
Tor exit list document format, version 2
-1. Editing
Let's use as little "markup" as possible while editing this document, which would include indentation or line breaks. Double newlines are probably okay. We can still make it pretty at the end, but let's first agree on the content.
0. Scope and preliminaries
This document defines the Tor exit list document format as written by Tor exit list scanners.
1. Document meta-format
Exit lists follow the same document meta-format as Tor descriptors.
The highest-level object is a Document, which consists of one or more Items. Every Item begins with a KeywordLine, followed by zero or more Objects. A KeywordLine begins with a Keyword, optionally followed by whitespace and more non-newline characters, and ends with a newline. A Keyword is a sequence of one or more characters in the set [A-Za-z0-9-]. An Object is a block of encoded data in pseudo-Privacy-Enhanced-Mail (PEM) style format: that is, lines of encoded data MAY be wrapped by inserting an ASCII linefeed ("LF", also called newline, or "NL" here) character (cf. RFC 4648 §3.1). When wrapping lines, implementations MUST wrap them at 64 characters. Upon decoding, implementations MUST ignore and discard all linefeed characters.
More formally:
NL = The ASCII LF character (hex value 0x0a).
Document ::= (Item | NL)+
Item ::= KeywordLine Object*
KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
Keyword = KeywordChar+
KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
ArgumentChar ::= any printing ASCII character except NL.
WS = (SP | TAB)+
Object ::= BeginLine Base64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword "-----" NL
EndLine ::= "-----END " Keyword "-----" NL
A Keyword may not be "-----BEGIN".
The BeginLine and EndLine of an Object must use the same keyword.
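The Object rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names encode_object and decode_object and the keyword "SIGNATURE" are made up for the example, and the decoder assumes it is handed exactly one Object.

```python
import base64
from typing import Tuple

BEGIN_PREFIX = "-----BEGIN "
SUFFIX = "-----"


def encode_object(keyword: str, data: bytes, width: int = 64) -> str:
    """Encode data as a PEM-style Object with base64 lines wrapped at 64 chars."""
    b64 = base64.b64encode(data).decode("ascii")
    wrapped = [b64[i:i + width] for i in range(0, len(b64), width)]
    body = "".join(line + "\n" for line in wrapped)
    return f"-----BEGIN {keyword}-----\n{body}-----END {keyword}-----\n"


def decode_object(text: str) -> Tuple[str, bytes]:
    """Decode one Object, discarding all linefeeds inside the encoded data."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty string after the final NL
    if len(lines) < 2:
        raise ValueError("an Object needs a BeginLine and an EndLine")
    begin, end = lines[0], lines[-1]
    if not (begin.startswith(BEGIN_PREFIX) and begin.endswith(SUFFIX)):
        raise ValueError("malformed BeginLine")
    keyword = begin[len(BEGIN_PREFIX):-len(SUFFIX)]
    # BeginLine and EndLine must use the same keyword.
    if end != f"-----END {keyword}-----":
        raise ValueError("EndLine keyword does not match BeginLine")
    return keyword, base64.b64decode("".join(lines[1:-1]))


if __name__ == "__main__":
    print(encode_object("SIGNATURE", bytes(range(100))), end="")
```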
When interpreting a Document, software MUST ignore any KeywordLine that starts with a keyword it doesn't recognize; future implementations MUST NOT require current clients to understand any KeywordLine not currently described.
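As an illustration of the meta-format, the following Python sketch splits a Document into Items and then drops Items whose keyword is not recognized. The names Item, parse_document and recognized_items are hypothetical, the sample input is invented, and the parser is deliberately lenient about argument characters; it is a sketch under these assumptions rather than a complete validator.

```python
import re
from typing import Iterable, List, NamedTuple

# Keyword, optionally followed by whitespace and the rest of the line.
KEYWORD_LINE = re.compile(r"([A-Za-z0-9-]+)(?:[ \t]+(.*))?")
BEGIN_PREFIX = "-----BEGIN "
SUFFIX = "-----"


class Item(NamedTuple):
    keyword: str
    arguments: str      # empty string when the line has no arguments
    objects: List[str]  # base64 text of each Object, linefeeds removed


def parse_document(text: str) -> List[Item]:
    """Split a Document into Items; bare NLs between Items are skipped."""
    items: List[Item] = []
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line == "":
            continue
        if line.startswith(BEGIN_PREFIX):
            if not items or not line.endswith(SUFFIX):
                raise ValueError("misplaced or malformed BeginLine")
            keyword = line[len(BEGIN_PREFIX):-len(SUFFIX)]
            end_line = f"-----END {keyword}-----"
            body = []
            while i < len(lines) and lines[i] != end_line:
                body.append(lines[i])
                i += 1
            if i == len(lines):
                raise ValueError("Object without matching EndLine")
            i += 1  # consume the EndLine
            items[-1].objects.append("".join(body))
            continue
        match = KEYWORD_LINE.fullmatch(line)
        if match is None:
            raise ValueError(f"not a KeywordLine: {line!r}")
        items.append(Item(match.group(1), match.group(2) or "", []))
    return items


def recognized_items(items: Iterable[Item], known: Iterable[str]) -> List[Item]:
    """Ignore every Item whose keyword is not in the known set."""
    known_set = set(known)
    return [item for item in items if item.keyword in known_set]


if __name__ == "__main__":
    sample = "ExitList 2\nSomeFutureKeyword x y\nScannerSoftware example 0.1\n"
    for item in recognized_items(parse_document(sample),
                                 {"ExitList", "ScannerSoftware"}):
        print(item.keyword, item.arguments)
```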
In our document descriptions below, we tag Items with a multiplicity in brackets. Possible tags are:
"At start, exactly once": These items MUST occur in every instance of the document type, and MUST appear exactly once, and MUST be the first item in their documents.
"Exactly once": These items MUST occur exactly one time in every instance of the document type.
"At end, exactly once": These items MUST occur in every instance of the document type, and MUST appear exactly once, and MUST be the last item in their documents.
"At most once": These items MAY occur zero or one times in any instance of the document type, but MUST NOT occur more than once.
"Any number": These items MAY occur zero, one, or more times in any instance of the document type.
"Once or more": These items MUST occur at least once in any instance of the document type, and MAY occur more.
For forward compatibility, each item MUST allow extra arguments at the end of the line unless otherwise noted. Whenever an item DOES NOT allow extra arguments, we will tag it with "no extra arguments".
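The multiplicity tags lend themselves to a simple mechanical check. The Python sketch below takes the keywords of one document (or one exit list entry) in order and a table mapping keywords to tags; the function name check_multiplicities, the lowercase tag strings, and the choice to list "ExitAddress" without "ExitAddress6" in the example table are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, List

EXACTLY_ONCE_TAGS = {
    "at start, exactly once",
    "exactly once",
    "at end, exactly once",
}


def check_multiplicities(keywords: List[str], rules: Dict[str, str]) -> List[str]:
    """Return a list of violations; an empty list means all tags are satisfied."""
    counts = Counter(keywords)
    errors = []
    for keyword, tag in rules.items():
        n = counts[keyword]
        if tag in EXACTLY_ONCE_TAGS and n != 1:
            errors.append(f"{keyword}: expected exactly once, found {n}")
        elif tag == "at most once" and n > 1:
            errors.append(f"{keyword}: expected at most once, found {n}")
        elif tag == "once or more" and n < 1:
            errors.append(f"{keyword}: expected at least once, found 0")
        # "any number" needs no count check.
        if tag == "at start, exactly once" and keywords[:1] != [keyword]:
            errors.append(f"{keyword}: must be the first item")
        if tag == "at end, exactly once" and keywords[-1:] != [keyword]:
            errors.append(f"{keyword}: must be the last item")
    return errors


if __name__ == "__main__":
    # Hypothetical rule table for one exit list entry.
    entry_rules = {
        "ExitNode": "at start, exactly once",
        "Published": "exactly once",
        "LastStatus": "exactly once",
        "ExitAddress": "once or more",
    }
    seen = ["ExitNode", "Published", "LastStatus", "ExitAddress"]
    print(check_multiplicities(seen, entry_rules))
```

Keywords that are absent from the rule table are not reported, which matches the requirement to ignore unrecognized KeywordLines.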
2. Document header
"ExitList" SP version NL
[At most once in version 1.]
[At start, exactly once in version 2 or later.]
Version of this document format.
"ScannerIdentity" SP identity NL
Identity of the exit scanner host. Identity is a fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key, encoded in hex) for a router's identity key.
"ScannerAddress" SP address NL
"ScannerAddress6" SP address NL
[Any number.]
Address of the exit scanner host, which is an IPv4 address, represented as a dotted quad ("ScannerAddress" line), or an IPv6 address, surrounded by square brackets ("ScannerAddress6" line).
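A reader of these lines might validate the address argument roughly as follows. This Python sketch uses the standard ipaddress module; the function name parse_scanner_address is hypothetical, and taking only the first whitespace-separated token is an assumption made so that extra arguments at the end of the line are tolerated.

```python
import ipaddress
from typing import Union

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_scanner_address(keyword: str, arguments: str) -> Address:
    """Parse the address argument of a ScannerAddress or ScannerAddress6 line."""
    tokens = arguments.split()
    if not tokens:
        raise ValueError("missing address argument")
    address = tokens[0]  # ignore any extra arguments after the address
    if keyword == "ScannerAddress":
        # IPv4 address as a dotted quad.
        return ipaddress.IPv4Address(address)
    if keyword == "ScannerAddress6":
        # IPv6 address surrounded by square brackets.
        if not (address.startswith("[") and address.endswith("]")):
            raise ValueError("IPv6 address must be in square brackets")
        return ipaddress.IPv6Address(address[1:-1])
    raise ValueError(f"unexpected keyword: {keyword}")


if __name__ == "__main__":
    print(parse_scanner_address("ScannerAddress", "192.0.2.1"))
    print(parse_scanner_address("ScannerAddress6", "[2001:db8::1]"))
```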
"ScannerLocation" SP country SP asn NL
[At most once.]
Location of this exit scanner host. Country is loosely based on ISO 3166-1 alpha-2 with possible extensions added by whichever data source is used for resolving the host's IP address. asn is the autonomous system number.
"ScannerContact" SP contact NL
[At most once.]
A human-readable string describing a way to contact the exit scanner's administrator, preferably including an email address and a PGP key fingerprint.
"ScannerSoftware" SP software NL
[At most once.]
A human-readable string describing the name and version of the software that created this exit list.
"Created" SP YYYY-MM-DD SP HH:MM:SS NL
[At most once.]
The time, in UTC, when this exit list was generated. The software generating this exit list may use its own schedule for generating exit lists. Typically, it would generate a new list every hour, but this is not required.
"Downloaded" SP YYYY-MM-DD SP HH:MM:SS NL
[At most once.]
The time, in UTC, when this exit list was downloaded from the exit scanner. The software downloading the exit list adds this line only if the exit list does not already include a "Created" timestamp.
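The two timestamp lines can be read with the standard datetime module, as in this Python sketch. The names parse_timestamp and list_timestamp are hypothetical, and the header is assumed to have been collected into a dictionary mapping each keyword to its argument string. Preferring "Created" over "Downloaded" follows from the rule that "Downloaded" is only added when "Created" is missing.

```python
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


def parse_timestamp(date: str, clock: str) -> datetime:
    """Parse 'YYYY-MM-DD' and 'HH:MM:SS' into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(f"{date} {clock}", "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def list_timestamp(header: Dict[str, str]) -> Optional[Tuple[str, datetime]]:
    """Return (keyword, time) from 'Created' if present, else 'Downloaded'."""
    for keyword in ("Created", "Downloaded"):
        if keyword in header:
            fields = header[keyword].split()
            if len(fields) < 2:
                raise ValueError(f"{keyword}: expected a date and a time")
            # Extra arguments after the time are ignored.
            return keyword, parse_timestamp(fields[0], fields[1])
    return None


if __name__ == "__main__":
    print(list_timestamp({"Downloaded": "2014-02-21 14:02:45"}))
```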
3. Document body
The document body of an exit list contains zero or more exit list entries. Multiplicities refer to the exit list entry.
"ExitNode" SP identity NL
[At start, exactly once.]
Identity is a fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key, encoded in hex) for a router's identity key.
"Published" SP YYYY-MM-DD SP HH:MM:SS NL
[Exactly once.]
The time, in UTC, when the last known descriptor was published by the router. The software may use this timestamp to decide not to perform another test until a newer descriptor arrives.
"LastStatus" SP YYYY-MM-DD SP HH:MM:SS NL
[Exactly once.]
The time, in UTC, when the software last received a network status update for this router. This time typically does not match statuses' publication or valid-after time. The software may use this timestamp to decide when to discard a router.
"ExitAddress" SP address SP YYYY-MM-DD SP HH:MM:SS NL
"ExitAddress6" SP address SP YYYY-MM-DD SP HH:MM:SS NL
[Once or more.]
An address used by the router as exit address and the time, in UTC, when this address was last seen in an exit scan. Address can be an IPv4 address, represented as a dotted quad ("ExitAddress" line), or an IPv6 address, surrounded by square brackets ("ExitAddress6" line).
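To make the body layout concrete, the following Python sketch groups body lines into exit list entries, starting a new entry at each "ExitNode" line, and collects the addresses of each entry. The function names, the dictionary layout, and the sample fingerprint and addresses are all invented for illustration; lines before the first "ExitNode" line are assumed to belong to the header and are skipped.

```python
from datetime import datetime, timezone
from typing import Dict, List


def parse_utc(date: str, clock: str) -> datetime:
    parsed = datetime.strptime(f"{date} {clock}", "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def parse_entries(lines: List[str]) -> List[Dict]:
    """Group body lines into entries; each entry starts with an ExitNode line."""
    entries: List[Dict] = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        keyword = fields[0]
        if keyword == "ExitNode":
            entries.append({"identity": fields[1], "addresses": []})
        elif not entries:
            continue  # header line before the first entry
        elif keyword in ("Published", "LastStatus"):
            entries[-1][keyword] = parse_utc(fields[1], fields[2])
        elif keyword in ("ExitAddress", "ExitAddress6"):
            address = fields[1]
            if keyword == "ExitAddress6":
                address = address.strip("[]")  # remove the square brackets
            entries[-1]["addresses"].append(
                (address, parse_utc(fields[2], fields[3])))
        # Any other keyword is unrecognized and ignored.
    return entries


SAMPLE = """\
ExitList 2
ExitNode 0123456789ABCDEF0123456789ABCDEF01234567
Published 2014-02-21 10:00:00
LastStatus 2014-02-21 11:02:45
ExitAddress 192.0.2.1 2014-02-21 11:10:00
ExitAddress6 [2001:db8::1] 2014-02-21 11:10:05
"""

if __name__ == "__main__":
    for entry in parse_entries(SAMPLE.split("\n")):
        print(entry["identity"], [a for a, _ in entry["addresses"]])
```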
4. Document footer
The document footer is empty.