Merge branch 'maint-0.2.2' into release-0.2.2

Roger Dingledine 2011-02-25 13:48:06 -05:00
commit 999c3c97d2
128 changed files with 377 additions and 24815 deletions

changes/1863_bwhist (new file)

@@ -0,0 +1,16 @@
o Minor features:
- Servers now save observed maximum bandwidth throughput rates
to their state file (along with total usage, which was already
saved) so that they can determine their correct estimated
bandwidth on restart. Resolves bug 1863, where Tor servers
would reset their estimated bandwidth to 0 after restarting.
o Minor bugfixes:
- Fix a bug in bandwidth history state parsing that could have been
triggered if a future version of Tor ever changed the timing
granularity at which bandwidth history is measured. Bugfix on
Tor 0.1.1.11-alpha.
- Correctly clear out dir_read/dir_write history when there is an
error parsing any bw history value from the state file. Bugfix on
Tor 0.2.2.15-alpha.
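To illustrate the granularity concern above, a rescaling step along these lines would let stored readings survive a change in measurement interval. This is a hypothetical sketch (Tor's actual state-file code is C, and the function name is ours):

```python
# Hypothetical sketch: rescale stored bandwidth-history readings when
# the interval they were recorded at differs from the current one.
def rescale_history(values, stored_interval, current_interval):
    """values are per-interval byte totals; scale them proportionally
    so per-interval figures stay plausible if granularity changes."""
    if stored_interval == current_interval:
        return list(values)
    factor = current_interval / stored_interval
    return [int(v * factor) for v in values]
```

The point of the bugfix is simply that the parser must tolerate a stored interval that differs from the compiled-in one, rather than misreading the history.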

changes/bug2366 (new file)

@@ -0,0 +1,8 @@
o Minor bugfixes
- When a relay decided that its DNS was too broken for it to serve
as an exit server, it advertised itself as a non-exit but
continued to act as an exit. This could create accidental
partitioning opportunities for users. Now, if a relay is
going to advertise "reject *:*" as its exit policy, it really
acts with exit policy "reject *:*". Fixes bug 2366.
Bugfix on Tor 0.1.2.5-alpha. Bugfix by user "postman" on trac.

changes/bug2403 (new file)

@@ -0,0 +1,6 @@
o Minor bugfixes:
- In the special case where you configure a public exit relay as your
bridge, Tor would be willing to use that exit relay as the last
hop in your circuit as well. Now we fail that circuit instead.
Bugfix on 0.2.0.12-alpha. Fixes bug 2403. Reported by "piebeer".

changes/bug2504 (new file)

@@ -0,0 +1,5 @@
o Minor bugfixes:
- Fix a bug with our locking implementation on windows that couldn't
correctly detect when a file was already locked. Fixes bug 2504,
bugfix on 0.2.1.6-alpha.

changes/bug2572 (new file)

@@ -0,0 +1,5 @@
o Minor bugfixes:
- Don't crash a bridge authority on SIGHUP if it can't force itself
into its routerlist. Fixes bug 2572.

changes/log_domains (new file)

@@ -0,0 +1,12 @@
o Minor features:
- Make it simpler to specify "All log domains except for A and B".
Previously you needed to say "[*,~A,~B]". Now you can just say
"[~A,~B]".
- Add a LogMessageDomains option to include the domains of log messages
along with the messages. Without this, there's no way to use
log domains without reading the source or doing a lot of guessing.
o Documentation:
- Add documentation for configuring logging at different severities in
different log domains. We've had this feature since 0.2.1.1-alpha, but
for some reason it never made it into the manpage. Fixes bug 2215.

changes/torspec.git (new file)

@@ -0,0 +1,5 @@
o Packaging changes:
- Stop shipping the Tor specs files and development proposal documents
in the tarball. They are now in a separate git repository at
git://git.torproject.org/torspec.git


@@ -946,7 +946,7 @@ fi
CPPFLAGS="$CPPFLAGS $TOR_CPPFLAGS_libevent $TOR_CPPFLAGS_openssl $TOR_CPPFLAGS_zlib"
-AC_CONFIG_FILES([Makefile tor.spec Doxyfile contrib/tor.sh contrib/torctl contrib/torify contrib/tor.logrotate contrib/Makefile src/config/torrc.sample src/Makefile doc/Makefile doc/spec/Makefile src/config/Makefile src/common/Makefile src/or/Makefile src/test/Makefile src/win32/Makefile src/tools/Makefile contrib/suse/Makefile contrib/suse/tor.sh])
+AC_CONFIG_FILES([Makefile tor.spec Doxyfile contrib/tor.sh contrib/torctl contrib/torify contrib/tor.logrotate contrib/Makefile src/config/torrc.sample src/Makefile doc/Makefile src/config/Makefile src/common/Makefile src/or/Makefile src/test/Makefile src/win32/Makefile src/tools/Makefile contrib/suse/Makefile contrib/suse/tor.sh])
AC_OUTPUT
if test -x /usr/bin/perl && test -x ./contrib/updateVersions.pl ; then


@@ -1,4 +1,3 @@
# We use a two-step process to generate documentation from asciidoc files.
#
# First, we use asciidoc/a2x to process the asciidoc files into .1.in and
@@ -32,16 +31,12 @@ endif
EXTRA_DIST = HACKING asciidoc-helper.sh \
$(html_in) $(man_in) $(txt_in) \
tor-rpm-creation.txt \
-tor-win32-mingw-creation.txt
+tor-win32-mingw-creation.txt spec/README
docdir = @docdir@
asciidoc_product = $(nodist_man_MANS) $(doc_DATA)
SUBDIRS = spec
DIST_SUBDIRS = spec
# Generate the html documentation from asciidoc, but don't do
# machine-specific replacements yet
$(html_in) :


@@ -1,5 +0,0 @@
EXTRA_DIST = tor-spec.txt rend-spec.txt control-spec.txt \
dir-spec.txt socks-extensions.txt path-spec.txt \
version-spec.txt address-spec.txt bridges-spec.txt

doc/spec/README (new file)

@@ -0,0 +1,11 @@
The Tor specifications and proposals have moved to a new repository.
To browse the specifications, go to
https://gitweb.torproject.org/torspec.git/tree
To check out the specification repository, run
git clone git://git.torproject.org/torspec.git
For other information on the repository, see
https://gitweb.torproject.org/torspec.git


@@ -1,58 +0,0 @@
Special Hostnames in Tor
Nick Mathewson
1. Overview
Most of the time, Tor treats user-specified hostnames as opaque: When
the user connects to www.torproject.org, Tor picks an exit node and uses
that node to connect to "www.torproject.org". Some hostnames, however,
can be used to override Tor's default behavior and circuit-building
rules.
These hostnames can be passed to Tor as the address part of a SOCKS4a or
SOCKS5 request. If the application is connected to Tor using an IP-only
method (such as SOCKS4, TransPort, or NATDPort), these hostnames can be
substituted for certain IP addresses using the MapAddress configuration
option or the MAPADDRESS control command.
2. .exit
SYNTAX: [hostname].[name-or-digest].exit
[name-or-digest].exit
Hostname is a valid hostname; [name-or-digest] is either the nickname of a
Tor node or the hex-encoded digest of that node's public key.
When Tor sees an address in this format, it uses the specified hostname as
the exit node. If no "hostname" component is given, Tor defaults to the
published IPv4 address of the exit node.
It is valid to try to resolve hostnames, and in fact upon success Tor
will cache an internal mapaddress of the form
"www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent
lookups.
The .exit notation is disabled by default as of Tor 0.2.2.1-alpha, due
to potential application-level attacks.
EXAMPLES:
www.example.com.exampletornode.exit
Connect to www.example.com from the node called "exampletornode".
exampletornode.exit
Connect to the published IP address of "exampletornode" using
"exampletornode" as the exit.
3. .onion
SYNTAX: [digest].onion
The digest is the first eighty bits of a SHA1 hash of the identity key for
a hidden service, encoded in base32.
When Tor sees an address in this format, it tries to look up and connect to
the specified hidden service. See rend-spec.txt for full details.
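The derivation described above (first eighty bits of a SHA1 hash, base32-encoded) can be sketched in Python. The function name and argument are illustrative, not a Tor API:

```python
import base64
import hashlib

def onion_address(identity_key_der: bytes) -> str:
    """Sketch: first 80 bits (10 bytes) of the SHA-1 digest of the
    hidden service's identity key, base32-encoded, as a .onion name."""
    digest = hashlib.sha1(identity_key_der).digest()[:10]
    return base64.b32encode(digest).decode("ascii").lower() + ".onion"
```

Ten bytes encode to exactly sixteen base32 characters, which is why these addresses are sixteen characters before the ".onion" suffix.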


@@ -1,249 +0,0 @@
Tor bridges specification
0. Preface
This document describes the design decisions around support for bridge
users, bridge relays, and bridge authorities. It acts as an overview
of the bridge design and deployment for developers, and it also tries
to point out limitations in the current design and implementation.
For more details on what all of these mean, look at blocking.tex in
/doc/design-paper/
1. Bridge relays
Bridge relays are just like normal Tor relays except they don't publish
their server descriptors to the main directory authorities.
1.1. PublishServerDescriptor
To configure your relay to be a bridge relay, just add
BridgeRelay 1
PublishServerDescriptor bridge
to your torrc. This will cause your relay to publish its descriptor
to the bridge authorities rather than to the default authorities.
Alternatively, you can say
BridgeRelay 1
PublishServerDescriptor 0
which will cause your relay to not publish anywhere. This could be
useful for private bridges.
1.2. Recommendations.
Bridge relays should use an exit policy of "reject *:*". This is
because they only need to relay traffic between the bridge users
and the rest of the Tor network, so there's no need to let people
exit directly from them.
We invented the RelayBandwidth* options for this situation: Tor clients
who want to allow relaying too. See proposal 111 for details. Relay
operators should feel free to rate-limit their relayed traffic.
1.3. Implementation note.
Vidalia 0.0.15 has turned its "Relay" settings page into a tri-state
"Don't relay" / "Relay for the Tor network" / "Help censored users".
If you click the third choice, it forces your exit policy to reject *:*.
If all the bridges end up on port 9001, that's not so good. On the
other hand, putting the bridges on a low-numbered port in the Unix
world requires jumping through extra hoops. The current compromise is
that Vidalia makes the ORPort default to 443 on Windows, and 9001 on
other platforms.
At the bottom of the relay config settings window, Vidalia displays
the bridge identifier to the operator (see Section 3.1) so he can pass
it on to bridge users.
2. Bridge authorities.
Bridge authorities are like normal v3 directory authorities, except
they don't create their own network-status documents or votes. So if
you ask a bridge authority for a network-status document or consensus,
they behave like a directory mirror: they give you one from one of
the main authorities. But if you ask the bridge authority for the
descriptor corresponding to a particular identity fingerprint, it will
happily give you the latest descriptor for that fingerprint.
To become a bridge authority, add these lines to your torrc:
AuthoritativeDirectory 1
BridgeAuthoritativeDir 1
Right now there's one bridge authority, running on the Tonga relay.
2.1. Exporting bridge-purpose descriptors
We've added a new purpose for server descriptors: the "bridge"
purpose. With the new router-descriptors file format that includes
annotations, it's easy to look through it and find the bridge-purpose
descriptors.
Currently we export the bridge descriptors from Tonga to the
BridgeDB server, so it can give them out according to the policies
in blocking.pdf.
2.2. Reachability/uptime testing
Right now the bridge authorities do active reachability testing of
bridges, so we know which ones to recommend for users.
But in the design document, we suggested that bridges should publish
anonymously (i.e. via Tor) to the bridge authority, so somebody watching
the bridge authority can't just enumerate all the bridges. But if we're
doing active measurement, the game is up. Perhaps we should back off on
this goal, or perhaps we should do our active measurement anonymously?
Answering this issue is scheduled for 0.2.1.x.
2.3. Future work: migrating to multiple bridge authorities
Having only one bridge authority is both a trust bottleneck (if you
break into one place you learn about every single bridge we've got)
and a robustness bottleneck (when it's down, bridge users become sad).
Right now if we put up a second bridge authority, all the bridges would
publish to it, and (assuming the code works) bridge users would query
a random bridge authority. This resolves the robustness bottleneck,
but makes the trust bottleneck even worse.
In 0.2.2.x and later we should think about better ways to have multiple
bridge authorities.
3. Bridge users.
Bridge users are like ordinary Tor users except they use encrypted
directory connections by default, and they use bridge relays as both
entry guards (their first hop) and directory guards (the source of
all their directory information).
To become a bridge user, add the following line to your torrc:
UseBridges 1
and then add at least one "Bridge" line to your torrc based on the
format below.
3.1. Format of the bridge identifier.
The canonical format for a bridge identifier contains an IP address,
an ORPort, and an identity fingerprint:
bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
However, the identity fingerprint can be left out, in which case the
bridge user will connect to that relay and use it as a bridge regardless
of what identity key it presents:
bridge 128.31.0.34:9009
This might be useful for cases where only short bridge identifiers
can be communicated to bridge users.
In a future version we may also support bridge identifiers that are
only a key fingerprint:
bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
and the bridge user can fetch the latest descriptor from the bridge
authority (see Section 3.4).
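A minimal sketch of parsing the bridge-line formats above (a hypothetical helper, not Tor's actual parser):

```python
def parse_bridge_line(line: str):
    """Parse a 'bridge' line: 'bridge IP:ORPort [fingerprint groups]'.
    Returns (address, orport, fingerprint-or-None); a missing
    fingerprint means 'use whatever identity key the relay presents'."""
    parts = line.split()
    if not parts or parts[0] != "bridge":
        raise ValueError("not a bridge line")
    addr, _, port = parts[1].partition(":")
    fingerprint = "".join(parts[2:]) or None  # hex groups, spaces removed
    return addr, int(port), fingerprint
```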
3.2. Bridges as entry guards
For now, bridge users add their bridge relays to their list of "entry
guards" (see path-spec.txt for background on entry guards). They are
managed by the entry guard algorithms exactly as if they were a normal
entry guard -- their keys and timing get cached in the "state" file,
etc. This means that when the Tor user starts up with "UseBridges"
disabled, he will skip past the bridge entries since they won't be
listed as up and usable in his networkstatus consensus. But to be clear,
the "entry_guards" list doesn't currently distinguish guards by purpose.
Internally, each bridge user keeps a smartlist of "bridge_info_t"
that reflects the "bridge" lines from his torrc along with a download
schedule (see Section 3.5 below). When he starts Tor, he attempts
to fetch a descriptor for each configured bridge (see Section 3.4
below). When he succeeds at getting a descriptor for one of the bridges
in his list, he adds it directly to the entry guard list using the
normal add_an_entry_guard() interface. Once a bridge descriptor has
been added, should_delay_dir_fetches() will stop delaying further
directory fetches, and the user begins to bootstrap his directory
information from that bridge (see Section 3.3).
Currently bridge users cache their bridge descriptors to the
"cached-descriptors" file (annotated with purpose "bridge"), but
they don't make any attempt to reuse descriptors they find in this
file. The theory is that either the bridge is available now, in which
case you can get a fresh descriptor, or it's not, in which case an
old descriptor won't do you much good.
We could disable writing out the bridge lines to the state file, if
we think this is a problem.
As an exception, if we get an application request when we have one
or more bridge descriptors but we believe none of them are running,
we mark them all as running again. This is similar to the exception
already in place to help long-idle Tor clients realize they should
fetch fresh directory information rather than just refuse requests.
3.3. Bridges as directory guards
In addition to using bridges as the first hop in their circuits, bridge
users also use them to fetch directory updates. Other than initial
bootstrapping to find a working bridge descriptor (see Section 3.4
below), all further non-anonymized directory fetches will be redirected
to the bridge.
This means that bridge relays need to have cached answers for all
questions the bridge user might ask. This makes the upgrade path
tricky --- for example, if we migrate to a v4 directory design, the
bridge user would need to keep using v3 so long as his bridge relays
only knew how to answer v3 queries.
In a future design, for cases where the user has enough information
to build circuits yet the chosen bridge doesn't know how to answer a
given query, we might teach bridge users to make an anonymized request
to a more suitable directory server.
3.4. How bridge users get their bridge descriptor
Bridge users can fetch bridge descriptors in two ways: by going directly
to the bridge and asking for "/tor/server/authority", or by going to
the bridge authority and asking for "/tor/server/fp/ID". By default,
they will only try the direct queries. If the user sets
UpdateBridgesFromAuthority 1
in his config file, then he will try querying the bridge authority
first for bridges where he knows a digest (if he only knows an IP
address and ORPort, then his only option is a direct query).
If the user has at least one working bridge, then he will do further
queries to the bridge authority through a full three-hop Tor circuit.
But when bootstrapping, he will make a direct begin_dir-style connection
to the bridge authority.
As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor
from the bridge authority and it returns a 404 not found, the user
will automatically fall back to trying a direct query. Therefore it is
recommended that bridge users always set UpdateBridgesFromAuthority,
since at worst it will delay their fetches a little bit and notify
the bridge authority of the identity fingerprint (but not location)
of their intended bridges.
3.5. Bridge descriptor retry schedule
Bridge users try to fetch a descriptor for each bridge (using the
steps in Section 3.4 above) on startup. Whenever they receive a
bridge descriptor, they reschedule a new descriptor download for 1
hour from then.
If on the other hand it fails, they try again after 15 minutes for the
first attempt, after 15 minutes for the second attempt, and after 60
minutes for subsequent attempts.
In 0.2.2.x we should come up with some smarter retry schedules.
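The schedule above is simple enough to state as a function (illustrative only; Tor's download-schedule code is C):

```python
def next_retry_delay_minutes(failed_attempts: int) -> int:
    """Delay before the next bridge-descriptor fetch after a failure:
    15 minutes after each of the first two attempts, 60 minutes after
    any subsequent attempt."""
    return 15 if failed_attempts <= 2 else 60
```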
3.6. Implementation note.
Vidalia 0.1.0 has a new checkbox in its Network config window called
"My ISP blocks connections to the Tor network." Users who click that
box change their configuration to:
UseBridges 1
UpdateBridgesFromAuthority 1
and should add at least one bridge identifier.


@@ -1,498 +0,0 @@
TC: A Tor control protocol (Version 0)
-1. Deprecation
THIS PROTOCOL IS DEPRECATED. It is still documented here because Tor
0.1.1.x happens to support much of it; but the support for v0 is not
maintained, so you should expect it to rot in unpredictable ways. Support
for v0 will be removed some time after Tor 0.1.2.
0. Scope
This document describes an implementation-specific protocol that is used
for other programs (such as frontend user-interfaces) to communicate
with a locally running Tor process. It is not part of the Tor onion
routing protocol.
We're trying to be pretty extensible here, but not infinitely
forward-compatible.
1. Protocol outline
TC is a bidirectional message-based protocol. It assumes an underlying
stream for communication between a controlling process (the "client") and
a Tor process (the "server"). The stream may be implemented via TCP,
TLS-over-TCP, a Unix-domain socket, or so on, but it must provide
reliable in-order delivery. For security, the stream should not be
accessible by untrusted parties.
In TC, the client and server send typed variable-length messages to each
other over the underlying stream. By default, all messages from the server
are in response to messages from the client. Some client requests, however,
will cause the server to send messages to the client indefinitely far into
the future.
Servers respond to messages in the order they're received.
2. Message format
The messages take the following format:
Length [2 octets; big-endian]
Type [2 octets; big-endian]
Body [Length octets]
Upon encountering a recognized Type, implementations behave as described in
section 3 below. If the type is not recognized, servers respond with an
"ERROR" message (code UNRECOGNIZED; see 3.1 below), and clients simply ignore
the message.
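The framing can be sketched in a few lines of Python (the helper names are ours, not part of the protocol):

```python
import struct

def pack_message(msg_type: int, body: bytes) -> bytes:
    """Encode one TC v0 message: 2-octet big-endian length,
    2-octet big-endian type, then the body."""
    if len(body) > 0xFFFF:
        raise ValueError("body too long; use FRAGMENTHEADER/FRAGMENT")
    return struct.pack(">HH", len(body), msg_type) + body

def unpack_message(data: bytes):
    """Decode one message from the front of a buffer; returns
    (type, body, remaining bytes)."""
    length, msg_type = struct.unpack(">HH", data[:4])
    return msg_type, data[4:4 + length], data[4 + length:]
```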
2.1. Types and encodings
All numbers are given in big-endian (network) order.
OR identities are given in hexadecimal, in the same format as identity key
fingerprints, but without spaces; see tor-spec.txt for more information.
3. Message types
Message types are drawn from the following ranges:
0x0000-0xEFFF : Reserved for use by official versions of this spec.
0xF000-0xFFFF : Unallocated; usable by unofficial extensions.
3.1. ERROR (Type 0x0000)
Sent in response to a message that could not be processed as requested.
The body of the message begins with a 2-byte error code. The following
values are defined:
0x0000 Unspecified error
[]
0x0001 Internal error
[Something went wrong inside Tor, so that the client's
request couldn't be fulfilled.]
0x0002 Unrecognized message type
[The client sent a message type we don't understand.]
0x0003 Syntax error
[The client sent a message body in a format we can't parse.]
0x0004 Unrecognized configuration key
[The client tried to get or set a configuration option we don't
recognize.]
0x0005 Invalid configuration value
[The client tried to set a configuration option to an
incorrect, ill-formed, or impossible value.]
0x0006 Unrecognized byte code
[The client tried to set a byte code (in the body) that
we don't recognize.]
0x0007 Unauthorized.
[The client tried to send a command that requires
authorization, but it hasn't sent a valid AUTHENTICATE
message.]
0x0008 Failed authentication attempt
[The client sent a well-formed authorization message, but the
authentication attempt failed.]
0x0009 Resource exhausted
[The server didn't have enough of a given resource to
fulfill a given request.]
0x000A No such stream
0x000B No such circuit
0x000C No such OR
The rest of the body should be a human-readable description of the error.
In general, new error codes should only be added when they don't fall under
one of the existing error codes.
3.2. DONE (Type 0x0001)
Sent from server to client in response to a request that was successfully
completed, with no more information needed. The body is usually empty but
may contain a message.
3.3. SETCONF (Type 0x0002)
Change the value of a configuration variable. The body contains a list of
newline-terminated key-value configuration lines. An individual key-value
configuration line consists of the key, followed by a space, followed by
the value. The server behaves as though it had just read the key-value pair
in its configuration file.
The server responds with a DONE message on success, or an ERROR message on
failure.
When a configuration option takes multiple values, or when multiple
configuration keys form a context-sensitive group (see below), then
setting _any_ of the options in a SETCONF command is taken to reset all of
the others. For example, if two ORBindAddress values are configured,
and a SETCONF command arrives containing a single ORBindAddress value, the
new command's value replaces the two old values.
To _remove_ all settings for a given option entirely (and go back to its
default value), send a single line containing the key and no value.
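As a sketch, a SETCONF body following the rules above could be built like this (hypothetical helper):

```python
def setconf_body(pairs):
    """Build a SETCONF body: newline-terminated 'Key Value' lines.
    A value of None emits a bare key line, resetting that option
    to its default."""
    lines = []
    for key, value in pairs:
        lines.append(key if value is None else f"{key} {value}")
    return ("\n".join(lines) + "\n").encode("ascii")
```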
3.4. GETCONF (Type 0x0003)
Request the value of a configuration variable. The body contains one or
more NL-terminated strings for configuration keys. The server replies
with a CONFVALUE message.
If an option appears multiple times in the configuration, all of its
key-value pairs are returned in order.
Some options are context-sensitive, and depend on other options with
different keywords. These cannot be fetched directly. Currently there
is only one such option: clients should use the "HiddenServiceOptions"
virtual keyword to get all HiddenServiceDir, HiddenServicePort,
HiddenServiceNodes, and HiddenServiceExcludeNodes option settings.
3.5. CONFVALUE (Type 0x0004)
Sent in response to a GETCONF message; contains a list of "Key Value\n"
(A non-whitespace keyword, a single space, a non-NL value, a NL)
strings.
3.6. SETEVENTS (Type 0x0005)
Request the server to inform the client about interesting events.
The body contains a list of 2-byte event codes (see "event" below).
Any events *not* listed in the SETEVENTS body are turned off; thus, sending
SETEVENTS with an empty body turns off all event reporting.
The server responds with a DONE message on success, and an ERROR message
if one of the event codes isn't recognized. (On error, the list of active
event codes isn't changed.)
3.7. EVENT (Type 0x0006)
Sent from the server to the client when an event has occurred and the
client has requested that kind of event. The body contains a 2-byte
event code followed by additional event-dependent information. Event
codes are:
0x0001 -- Circuit status changed
Status [1 octet]
0x00 Launched - circuit ID assigned to new circuit
0x01 Built - all hops finished, can now accept streams
0x02 Extended - one more hop has been completed
0x03 Failed - circuit closed (was not built)
0x04 Closed - circuit closed (was built)
Circuit ID [4 octets]
(Must be unique to Tor process/time)
Path [NUL-terminated comma-separated string]
(For extended/failed, is the portion of the path that is
built)
0x0002 -- Stream status changed
Status [1 octet]
(Sent connect=0,sent resolve=1,succeeded=2,failed=3,
closed=4, new connection=5, new resolve request=6,
stream detached from circuit and still retriable=7)
Stream ID [4 octets]
(Must be unique to Tor process/time)
Target (NUL-terminated address-port string]
0x0003 -- OR Connection status changed
Status [1 octet]
(Launched=0,connected=1,failed=2,closed=3)
OR nickname/identity [NUL-terminated]
0x0004 -- Bandwidth used in the last second
Bytes read [4 octets]
Bytes written [4 octets]
0x0005 -- Notice/warning/error occurred
Message [NUL-terminated]
<obsolete: use 0x0007-0x000B instead.>
0x0006 -- New descriptors available
OR List [NUL-terminated, comma-delimited list of
OR identity]
0x0007 -- Debug message occurred
0x0008 -- Info message occurred
0x0009 -- Notice message occurred
0x000A -- Warning message occurred
0x000B -- Error message occurred
Message [NUL-terminated]
3.8. AUTHENTICATE (Type 0x0007)
Sent from the client to the server. Contains a 'magic cookie' to prove
that client is really allowed to control this Tor process. The server
responds with DONE or ERROR.
The format of the 'cookie' is implementation-dependent; see 4.1 below for
information on how the standard Tor implementation handles it.
3.9. SAVECONF (Type 0x0008)
Sent from the client to the server. Instructs the server to write out
its config options into its torrc. Server returns DONE if successful, or
ERROR if it can't write the file or some other error occurs.
3.10. SIGNAL (Type 0x0009)
Sent from the client to the server. The body contains one byte that
indicates the action the client wishes the server to take.
1 (0x01) -- Reload: reload config items, refetch directory.
2 (0x02) -- Controlled shutdown: if server is an OP, exit immediately.
If it's an OR, close listeners and exit after 30 seconds.
10 (0x0A) -- Dump stats: log information about open connections and
circuits.
12 (0x0C) -- Debug: switch all open logs to loglevel debug.
15 (0x0F) -- Immediate shutdown: clean up and exit now.
The server responds with DONE if the signal is recognized (or simply
closes the socket if it was asked to close immediately), else ERROR.
3.11. MAPADDRESS (Type 0x000A)
Sent from the client to the server. The body contains a sequence of
address mappings, each consisting of the address to be mapped, a single
space, the replacement address, and a NL character.
Addresses may be IPv4 addresses, IPv6 addresses, or hostnames.
The client sends this message to the server in order to tell it that future
SOCKS requests for connections to the original address should be replaced
with connections to the specified replacement address. If the addresses
are well-formed, and the server is able to fulfill the request, the server
replies with a single DONE message containing the source and destination
addresses. If the request is malformed, the server replies with a syntax
error message. If the server can't fulfill the request, it replies with
an internal ERROR message.
The client may decline to provide a body for the original address, and
instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or
"." for hostname), signifying that the server should choose the original
address itself, and return that address in the DONE message. The server
should ensure that it returns an element of address space that is unlikely
to be in actual use. If there is already an address mapped to the
destination address, the server may reuse that mapping.
If the original address is already mapped to a different address, the old
mapping is removed. If the original address and the destination address
are the same, the server removes any mapping in place for the original
address.
{Note: This feature is designed to be used to help Tor-ify applications
that need to use SOCKS4 or hostname-less SOCKS5. There are three
approaches to doing this:
1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead.
2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS
feature) to resolve the hostname remotely. This doesn't work
with special addresses like x.onion or x.y.exit.
3. Use MAPADDRESS to map an IP address to the desired hostname, and then
arrange to fool the application into thinking that the hostname
has resolved to that IP.
This functionality is designed to help implement the 3rd approach.}
[XXXX When, if ever, can mappings expire? Should they expire?]
[XXXX What addresses, if any, are safe to use?]
3.12 GETINFO (Type 0x000B)
Sent from the client to the server. The message body is as for GETCONF:
one or more NL-terminated strings. The server replies with an INFOVALUE
message.
Unlike GETCONF, this message is used for data that are not stored in the
Tor configuration file.
Recognized keys and their values include:
"version" -- The version of the server's software, including the name
of the software. (example: "Tor 0.0.9.4")
"desc/id/<OR identity>" or "desc/name/<OR nickname>" -- the latest server
descriptor for a given OR, NUL-terminated. If no such OR is known, the
corresponding value is an empty string.
"network-status" -- a space-separated list of all known OR identities.
This is in the same format as the router-status line in directories;
see tor-spec.txt for details.
"addr-mappings/all"
"addr-mappings/config"
"addr-mappings/cache"
"addr-mappings/control" -- a NL-terminated list of address mappings, each
in the form of "from-address" SP "to-address". The 'config' key
returns those address mappings set in the configuration; the 'cache'
key returns the mappings in the client-side DNS cache; the 'control'
key returns the mappings set via the control interface; the 'all'
target returns the mappings set through any mechanism.
3.13 INFOVALUE (Type 0x000C)
Sent from the server to the client in response to a GETINFO message.
Contains one or more items of the format:
Key [(NUL-terminated string)]
Value [(NUL-terminated string)]
The keys match those given in the GETINFO message.
3.14 EXTENDCIRCUIT (Type 0x000D)
Sent from the client to the server. The message body contains two fields:
Circuit ID [4 octets]
Path [NUL-terminated, comma-delimited string of OR nickname/identity]
This request takes one of two forms: either the Circuit ID is zero, in
which case it is a request for the server to build a new circuit according
to the specified path, or the Circuit ID is nonzero, in which case it is a
request for the server to extend an existing circuit with that ID according
to the specified path.
If the request is successful, the server sends a DONE message containing
a message body consisting of the four-octet Circuit ID of the newly created
circuit.
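A sketch of encoding this request body (illustrative helper name, not part of the protocol):

```python
import struct

def extendcircuit_body(circuit_id: int, path) -> bytes:
    """EXTENDCIRCUIT body: 4-octet big-endian circuit ID (0 requests a
    new circuit), then a NUL-terminated comma-delimited path string of
    OR nicknames/identities."""
    return struct.pack(">I", circuit_id) + ",".join(path).encode("ascii") + b"\x00"
```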
3.15 ATTACHSTREAM (Type 0x000E)
Sent from the client to the server. The message body contains two fields:
Stream ID [4 octets]
Circuit ID [4 octets]
This message informs the server that the specified stream should be
associated with the specified circuit. Each stream may be associated with
at most one circuit, and multiple streams may share the same circuit.
Streams can only be attached to completed circuits (that is, circuits that
have sent a circuit status 'built' event).
If the circuit ID is 0, responsibility for attaching the given stream is
returned to Tor.
{Implementation note: By default, Tor automatically attaches streams to
circuits itself, unless the configuration variable
"__LeaveStreamsUnattached" is set to "1". Attempting to attach streams
via TC when "__LeaveStreamsUnattached" is false may cause a race between
Tor and the controller, as both attempt to attach streams to circuits.}
3.16 POSTDESCRIPTOR (Type 0x000F)
Sent from the client to the server. The message body contains one field:
Descriptor [NUL-terminated string]
This message informs the server about a new descriptor.
The descriptor, when parsed, must contain a number of well-specified
fields, including fields for its nickname and identity.
If there is an error in parsing the descriptor, the server must send an
appropriate error message. If the descriptor is well-formed but the server
chooses not to add it, it must reply with a DONE message whose body
explains why the descriptor was not added.
3.17 FRAGMENTHEADER (Type 0x0010)
Sent in either direction. Used to encapsulate messages longer than 65535
bytes in length.
Underlying type [2 bytes]
Total Length [4 bytes]
Data [Rest of message]
A FRAGMENTHEADER message MUST be followed immediately by a number of
FRAGMENT messages, such that lengths of the "Data" fields of the
FRAGMENTHEADER and FRAGMENT messages add to the "Total Length" field of the
FRAGMENTHEADER message.
Implementations MUST NOT fragment messages of length less than 65536 bytes.
Implementations MUST be able to process fragmented messages that are not
optimally packed.
3.18 FRAGMENT (Type 0x0011)
Data [Entire message]
See FRAGMENTHEADER for more information.
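The fragmentation rules above can be illustrated with a short Python sketch. It assumes the v0 framing of a two-octet body length and two-octet type before each body; note the FRAGMENTHEADER body must itself fit in 65535 bytes, so its data chunk is at most 65535 - 6 bytes.

```python
import struct

MAX_BODY = 65535  # largest body a single v0 message can carry

def fragment(msg_type: int, data: bytes) -> list:
    """Split an oversized body into a FRAGMENTHEADER (0x0010) message
    followed by FRAGMENT (0x0011) messages. Bodies shorter than 65536
    bytes MUST NOT be fragmented, so they are returned as one message."""
    if len(data) < 65536:
        return [struct.pack("!HH", len(data), msg_type) + data]
    msgs = []
    # FRAGMENTHEADER body: underlying type [2], total length [4], first chunk
    first = data[:MAX_BODY - 6]
    body = struct.pack("!HI", msg_type, len(data)) + first
    msgs.append(struct.pack("!HH", len(body), 0x0010) + body)
    # The remaining data goes into FRAGMENT messages, whose Data field
    # lengths sum (with the header's chunk) to the Total Length field.
    rest = data[MAX_BODY - 6:]
    for i in range(0, len(rest), MAX_BODY):
        chunk = rest[i:i + MAX_BODY]
        msgs.append(struct.pack("!HH", len(chunk), 0x0011) + chunk)
    return msgs
```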
3.19 REDIRECTSTREAM (Type 0x0012)
Sent from the client to the server. The message body contains two fields:
Stream ID [4 octets]
Address [variable-length, NUL-terminated.]
Tells the server to change the exit address on the specified stream. No
remapping is performed on the new provided address.
To be sure that the modified address will be used, this message must be sent
after a new stream event is received, and before attaching this stream to
a circuit.
3.20 CLOSESTREAM (Type 0x0013)
Sent from the client to the server. The message body contains three
fields:
Stream ID [4 octets]
Reason [1 octet]
Flags [1 octet]
Tells the server to close the specified stream. The reason should be
one of the Tor RELAY_END reasons given in tor-spec.txt. Flags is not
used currently. Tor may hold the stream open for a while to flush
any data that is pending.
3.21 CLOSECIRCUIT (Type 0x0014)
Sent from the client to the server. The message body contains two
fields:
Circuit ID [4 octets]
Flags [1 octet]
Tells the server to close the specified circuit. If the LSB of the flags
field is nonzero, do not close the circuit unless it is unused.
4. Implementation notes
4.1. Authentication
By default, the current Tor implementation trusts all local users.
If the 'CookieAuthentication' option is true, Tor writes a "magic cookie"
file named "control_auth_cookie" into its data directory. To authenticate,
the controller must send the contents of this file.
If the 'HashedControlPassword' option is set, it must contain the salted
hash of a secret password. The salted hash is computed according to the
S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier.
This is then encoded in hexadecimal, prefixed by the indicator sequence
"16:". Thus, for example, the password 'foo' could encode to:
   16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2
      ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          salt                     hashed value
                    indicator
You can generate the salted hash of a password by calling
'tor --hash-password <password>'
or by using the example code in the Python and Java controller libraries.
To authenticate under this scheme, the controller sends Tor the original
secret that was used to generate the password.
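The iterated-salted S2K computation from RFC 2440 can be sketched in Python. The count specifier byte 0x60 (65536 bytes hashed) is taken from the example encoding above, not stated explicitly in the text; the sketch reproduces that example for the password 'foo'.

```python
import hashlib

def s2k_hash(password: bytes, salt: bytes, c: int = 0x60) -> bytes:
    """RFC 2440 iterated-salted S2K with SHA1, as used for
    HashedControlPassword. 'c' is the one-octet count specifier."""
    count = (16 + (c & 15)) << ((c >> 4) + 6)   # 0x60 -> 65536
    block = salt + password
    # Repeat salt||password until exactly 'count' octets have been hashed.
    data = (block * (count // len(block) + 1))[:count]
    return hashlib.sha1(data).digest()

# Reproduce the example encoding for password 'foo' shown above:
salt = bytes.fromhex("660537E3E1CD4999")
spec = "16:" + (salt + b"\x60" + s2k_hash(b"foo", salt)).hex().upper()
```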
4.2. Don't let the buffer get too big.
If you ask for lots of events, and 16MB of them queue up on the buffer,
the Tor process will close the socket.
Tor Protocol Specification
Roger Dingledine
Nick Mathewson
0. Preliminaries
THIS SPECIFICATION IS OBSOLETE.
This document specifies the Tor directory protocol as used in version
0.1.0.x and earlier. See dir-spec.txt for a current version.
1. Basic operation
There is a small number of directory authorities, and a larger number of
caches. Clients and servers know the public keys for the directory authorities.
Tor servers periodically upload self-signed "router descriptors" to the
directory authorities. Each authority publishes a self-signed "directory"
(containing all the router descriptors it knows, and a statement on which
are running) and a self-signed "running routers" document containing only
the statement on which routers are running.
All Tors periodically download these documents, downloading the directory
less frequently than they do the "running routers" document. Clients
preferentially download from caches rather than authorities.
1.1. Document format
Router descriptors, directories, and running-routers documents all obey the
following lightweight extensible information format.
The highest level object is a Document, which consists of one or more
Items. Every Item begins with a KeywordLine, followed by one or more
Objects. A KeywordLine begins with a Keyword, optionally followed by
whitespace and more non-newline characters, and ends with a newline. A
Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
An Object is a block of encoded data in pseudo-Open-PGP-style
armor. (cf. RFC 2440)
More formally:
Document ::= (Item | NL)+
Item ::= KeywordLine Object*
KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
Keyword ::= KeywordChar+
KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
ArgumentChar ::= any printing ASCII character except NL.
WS ::= (SP | TAB)+
Object ::= BeginLine Base-64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword "-----" NL
EndLine ::= "-----END " Keyword "-----" NL
The BeginLine and EndLine of an Object must use the same keyword.
When interpreting a Document, software MUST reject any document containing a
KeywordLine that starts with a keyword it doesn't recognize.
The "opt" keyword is reserved for non-critical future extensions. All
implementations MUST ignore any item of the form "opt keyword ....." when
they would not recognize "keyword ....."; and MUST treat "opt keyword ....."
as synonymous with "keyword ......" when keyword is recognized.
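A minimal parser for this meta-format can be sketched in Python. It splits a Document into (keyword, arguments, objects) tuples per the grammar above, attaching each base64 Object to the most recent Item; "opt" handling and rejection of unknown keywords are left to the caller.

```python
import re

# KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
KEYWORD = re.compile(r"^([A-Za-z0-9-]+)(?:[ \t]+(.*))?$")

def parse_items(document: str):
    """Split a Document into (keyword, arguments, objects) Items.
    Objects are the base64 payloads between BEGIN/END armor lines."""
    items, lines = [], iter(document.splitlines())
    for line in lines:
        if not line:                       # blank NL between Items is allowed
            continue
        if line.startswith("-----BEGIN "):
            obj = []                       # collect armor body up to END line
            for l in lines:
                if l.startswith("-----END "):
                    break
                obj.append(l)
            items[-1][2].append("".join(obj))
            continue
        m = KEYWORD.match(line)
        if not m:
            raise ValueError("malformed KeywordLine: %r" % line)
        items.append((m.group(1), m.group(2) or "", []))
    return items
```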
2. Router descriptor format.
Every router descriptor MUST start with a "router" Item; MUST end with a
"router-signature" Item and an extra NL; and MUST contain exactly one
instance of each of the following Items: "published" "onion-key" "link-key"
"signing-key" "bandwidth". Additionally, a router descriptor MAY contain
any number of "accept", "reject", "fingerprint", "uptime", and "opt" Items.
Other than "router" and "router-signature", the items may appear in any
order.
The items' formats are as follows:
"router" nickname address ORPort SocksPort DirPort
Indicates the beginning of a router descriptor. "address"
must be an IPv4 address in dotted-quad format. The last
three numbers indicate the TCP ports at which this OR exposes
functionality. ORPort is a port at which this OR accepts TLS
connections for the main OR protocol; SocksPort is deprecated and
should always be 0; and DirPort is the port at which this OR accepts
directory-related HTTP connections. If any port is not supported,
the value 0 is given instead of a port number.
"bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed
Estimated bandwidth for this router, in bytes per second. The
"average" bandwidth is the volume per second that the OR is willing
to sustain over long periods; the "burst" bandwidth is the volume
that the OR is willing to sustain in very short intervals. The
"observed" value is an estimate of the capacity this server can
handle. The server remembers the max bandwidth sustained output
over any ten second period in the past day, and another sustained
input. The "observed" value is the lesser of these two numbers.
"platform" string
A human-readable string describing the system on which this OR is
running. This MAY include the operating system, and SHOULD include
the name and version of the software implementing the Tor protocol.
"published" YYYY-MM-DD HH:MM:SS
The time, in GMT, when this descriptor was generated.
"fingerprint"
A fingerprint (a HASH_LEN-byte SHA1 hash of the ASN.1-encoded public key,
encoded in hex, with a single space after every 4 characters) for this router's
identity key. A descriptor is considered invalid (and MUST be
rejected) if the fingerprint line does not match the public key.
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should
be marked with "opt" until earlier versions of Tor are obsolete.]
"hibernating" 0|1
If the value is 1, then the Tor server was hibernating when the
descriptor was published, and shouldn't be used to build circuits.
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should
be marked with "opt" until earlier versions of Tor are obsolete.]
"uptime"
The number of seconds that this OR process has been running.
"onion-key" NL a public key in PEM format
This key is used to encrypt EXTEND cells for this OR. The key MUST
be accepted for at least XXXX hours after any new key is published in
a subsequent descriptor.
"signing-key" NL a public key in PEM format
The OR's long-term identity key.
"accept" exitpattern
"reject" exitpattern
These lines, in order, describe the rules that an OR follows when
deciding whether to allow a new stream to a given address. The
'exitpattern' syntax is described below.
"router-signature" NL Signature NL
The "SIGNATURE" object contains a signature of the PKCS1-padded
hash of the entire router descriptor, taken from the beginning of the
"router" line, through the newline after the "router-signature" line.
The router descriptor is invalid unless the signature is performed
with the router's identity key.
"contact" info NL
Describes a way to contact the server's administrator, preferably
including an email address and a PGP key fingerprint.
"family" names NL
'Names' is a whitespace-separated list of server nicknames. If two ORs
list one another in their "family" entries, then OPs should treat them
as a single OR for the purpose of path selection.
For example, if node A's descriptor contains "family B", and node B's
descriptor contains "family A", then node A and node B should never
be used on the same circuit.
"read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
"write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
Declare how much bandwidth the OR has used recently. Usage is divided
into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines
the end of the most recent interval. The numbers are the number of
bytes used in the most recent intervals, ordered from oldest to newest.
[We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
be marked with "opt" until earlier versions of Tor are obsolete.]
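The history lines above have a fixed token layout, so a small Python sketch can pull out the fields. It assumes the "opt" prefix, if present, has already been stripped; the timestamp and interval values in the test are hypothetical.

```python
def parse_bw_history(line: str):
    """Parse a read-history/write-history line into (end_time,
    interval_seconds, [oldest..newest byte counts])."""
    keyword, date, time, nsec, _unit, nums = line.split()
    assert keyword in ("read-history", "write-history")
    interval = int(nsec.lstrip("("))          # "(900" -> 900
    values = [int(n) for n in nums.split(",")]
    return date + " " + time, interval, values
```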
2.1. Nonterminals in routerdescriptors
nickname ::= between 1 and 19 alphanumeric characters, case-insensitive.
exitpattern ::= addrspec ":" portspec
portspec ::= "*" | port | port "-" port
port ::= an integer between 1 and 65535, inclusive.
addrspec ::= "*" | ip4spec | ip6spec
ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
ip4 ::= an IPv4 address in dotted-quad format
ip4mask ::= an IPv4 mask in dotted-quad format
num_ip4_bits ::= an integer between 0 and 32
ip6spec ::= ip6 | ip6 "/" num_ip6_bits
ip6 ::= an IPv6 address, surrounded by square brackets.
num_ip6_bits ::= an integer between 0 and 128
Ports are required; if they are not included in the router
line, they must appear in the "ports" lines.
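The exitpattern nonterminals above can be exercised with a small IPv4 matcher in Python. This is a sketch only: it assumes first-match-wins ordering with default-accept when no rule matches (the v2 spec later in this file makes that default explicit), and it omits IPv6 and hostname handling.

```python
import ipaddress

def policy_allows(rules, addr: str, port: int) -> bool:
    """Evaluate ("accept"|"reject", exitpattern) rules in order;
    the first matching rule decides. Default is accept."""
    ip = ipaddress.ip_address(addr)
    for action, pattern in rules:
        addrspec, portspec = pattern.rsplit(":", 1)
        # addrspec: "*", ip4, ip4/num_ip4_bits, or ip4/ip4mask
        if addrspec != "*" and ip not in ipaddress.ip_network(addrspec,
                                                             strict=False):
            continue
        # portspec: "*", a single port, or port "-" port
        if portspec != "*":
            lo, _, hi = portspec.partition("-")
            if not int(lo) <= port <= int(hi or lo):
                continue
        return action == "accept"
    return True
```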
3. Directory format
A Directory begins with a "signed-directory" item, followed by one each of
the following, in any order: "recommended-software", "published",
"router-status", "dir-signing-key". It may include any number of "opt"
items. After these items, a directory includes any number of router
descriptors, and a single "directory-signature" item.
"signed-directory"
Indicates the start of a directory.
"published" YYYY-MM-DD HH:MM:SS
The time at which this directory was generated and signed, in GMT.
"dir-signing-key"
The key used to sign this directory; see "signing-key" for format.
"recommended-software" comma-separated-version-list
A list of which versions of which implementations are currently
believed to be secure and compatible with the network.
"running-routers" whitespace-separated-list
A description of which routers are currently believed to be up or
down. Every entry consists of an optional "!", followed by either an
OR's nickname, or "$" followed by a hexadecimal encoding of the hash
of an OR's identity key. If the "!" is included, the router is
believed not to be running; otherwise, it is believed to be running.
If a router's nickname is given, exactly one router of that nickname
will appear in the directory, and that router is "approved" by the
directory server. If a hashed identity key is given, that OR is not
"approved". [XXXX The 'running-routers' line is only provided for
backward compatibility. New code should parse 'router-status'
instead.]
"router-status" whitespace-separated-list
A description of which routers are currently believed to be up or
down, and which are verified or unverified. Contains one entry for
every router that the directory server knows. Each entry is of the
format:
!name=$digest [Verified router, currently not live.]
name=$digest [Verified router, currently live.]
!$digest [Unverified router, currently not live.]
or $digest [Unverified router, currently live.]
(where 'name' is the router's nickname and 'digest' is a hexadecimal
encoding of the hash of the routers' identity key).
When parsing this line, clients should only mark a router as
'verified' if its nickname AND digest match the one provided.
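The four router-status entry shapes above can be decoded with a few string operations; a Python sketch, using hypothetical nicknames and digests:

```python
def parse_router_status(entry: str):
    """Decode one router-status entry into (nickname_or_None, digest,
    live, verified). A leading '!' marks not-live; entries with a
    nickname and '=' are verified, bare '$digest' entries are not."""
    live = not entry.startswith("!")
    entry = entry.lstrip("!")
    if "=" in entry:
        name, digest = entry.split("=", 1)
        return name, digest.lstrip("$"), live, True
    return None, entry.lstrip("$"), live, False
```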
"directory-signature" nickname-of-dirserver NL Signature
The signature is computed by computing the digest of the
directory, from the characters "signed-directory", through the newline
after "directory-signature". This digest is then padded with PKCS.1,
and signed with the directory server's signing key.
If software encounters an unrecognized keyword in a single router descriptor,
it MUST reject only that router descriptor, and continue using the
others. Because this mechanism is used to add 'critical' extensions to
future versions of the router descriptor format, implementations should treat
it as a normal occurrence and not, for example, report it to the user as an
error. [Versions of Tor prior to 0.1.1 did this.]
If software encounters an unrecognized keyword in the directory header,
it SHOULD reject the entire directory.
4. Network-status descriptor
A "network-status" (a.k.a "running-routers") document is a truncated
directory that contains only the current status of a list of nodes, not
their actual descriptors. It contains exactly one of each of the following
entries.
"network-status"
Must appear first.
"published" YYYY-MM-DD HH:MM:SS
(see section 3 above)
"router-status" list
(see section 3 above)
"directory-signature" NL signature
(see section 3 above)
5. Behavior of a directory server
lists nodes that are connected currently
speaks HTTP on a socket, spits out directory on request
Directory servers listen on a certain port (the DirPort), and speak a
limited version of HTTP 1.0. Clients send either GET or POST commands.
The basic interactions are:
"%s %s HTTP/1.0\r\nContent-Length: %lu\r\nHost: %s\r\n\r\n",
command, url, content-length, host.
Get "/tor/" to fetch a full directory.
Get "/tor/dir.z" to fetch a compressed full directory.
Get "/tor/running-routers" to fetch a network-status descriptor.
Post "/tor/" to post a server descriptor, with the body of the
request containing the descriptor.
"host" is used to specify the address:port of the dirserver, so
the request can survive going through HTTP proxies.
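The printf-style template above ("%lu" is the decimal byte length of the body) can be rendered in Python; the dirserver address in the usage line is hypothetical.

```python
def dir_request(command: str, url: str, host: str, body: bytes = b"") -> bytes:
    """Build the limited HTTP/1.0 request described above. 'host' is
    the dirserver's address:port so the request survives HTTP proxies."""
    head = "%s %s HTTP/1.0\r\nContent-Length: %d\r\nHost: %s\r\n\r\n" % (
        command, url, len(body), host)
    return head.encode("ascii") + body

req = dir_request("GET", "/tor/running-routers", "dirserver.example:9030")
```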
Tor directory protocol, version 2
0. Scope and preliminaries
This directory protocol is used by Tor version 0.1.1.x and 0.1.2.x. See
dir-spec-v1.txt for information on earlier versions, and dir-spec.txt
for information on later versions.
0.1. Goals and motivation
There were several problems with the way Tor handles directory information
in version 0.1.0.x and earlier. Here are the problems we try to fix with
this new design, already implemented in 0.1.1.x:
1. Directories were very large and used up a lot of bandwidth: clients
downloaded descriptors for all routers several times an hour.
2. Every directory authority was a trust bottleneck: if a single
directory authority lied, it could make clients believe for a time an
arbitrarily distorted view of the Tor network.
3. Our current "verified server" system is kind of nonsensical.
4. Getting more directory authorities would add more points of failure
and worsen possible partitioning attacks.
There are two problems that remain unaddressed by this design.
5. Requiring every client to know about every router won't scale.
6. Requiring every directory cache to know every router won't scale.
We attempt to fix 1-4 here, and to build a solution that will work when we
figure out an answer for 5. We haven't thought at all about what to do
about 6.
1. Outline
There is a small set (say, around 10) of semi-trusted directory
authorities. A default list of authorities is shipped with the Tor
software. Users can change this list, but are encouraged not to do so, in
order to avoid partitioning attacks.
Routers periodically upload signed "descriptors" to the directory
authorities describing their keys, capabilities, and other information.
Routers may act as directory mirrors (also called "caches"), to reduce
load on the directory authorities. They announce this in their
descriptors.
Each directory authority periodically generates and signs a compact
"network status" document that lists that authority's view of the current
descriptors and status for known routers, but which does not include the
descriptors themselves.
Directory mirrors download, cache, and re-serve network-status documents
to clients.
Clients, directory mirrors, and directory authorities all use
network-status documents to find out when their list of routers is
out-of-date. If it is, they download any missing router descriptors.
Clients download missing descriptors from mirrors; mirrors and authorities
download from authorities. Descriptors are downloaded by the hash of the
descriptor, not by the server's identity key: this prevents servers from
attacking clients by giving them descriptors nobody else uses.
All directory information is uploaded and downloaded with HTTP.
Coordination among directory authorities is done client-side: clients
compute a vote-like algorithm among the network-status documents they
have, and base their decisions on the result.
1.1. What's different from 0.1.0.x?
Clients used to download a signed concatenated set of router descriptors
(called a "directory") from directory mirrors, regardless of which
descriptors had changed.
Between downloading directories, clients would download "network-status"
documents that would list which servers were supposed to be running.
Clients would always believe the most recently published network-status
document they were served.
Routers used to upload fresh descriptors all the time, whether their keys
and other information had changed or not.
1.2. Document meta-format
Router descriptors, directories, and running-routers documents all obey the
following lightweight extensible information format.
The highest level object is a Document, which consists of one or more
Items. Every Item begins with a KeywordLine, followed by one or more
Objects. A KeywordLine begins with a Keyword, optionally followed by
whitespace and more non-newline characters, and ends with a newline. A
Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
An Object is a block of encoded data in pseudo-Open-PGP-style
armor. (cf. RFC 2440)
More formally:
Document ::= (Item | NL)+
Item ::= KeywordLine Object*
KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
Keyword ::= KeywordChar+
KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
ArgumentChar ::= any printing ASCII character except NL.
WS ::= (SP | TAB)+
Object ::= BeginLine Base-64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword "-----" NL
EndLine ::= "-----END " Keyword "-----" NL
The BeginLine and EndLine of an Object must use the same keyword.
When interpreting a Document, software MUST ignore any KeywordLine that
starts with a keyword it doesn't recognize; future implementations MUST NOT
require current clients to understand any KeywordLine not currently
described.
The "opt" keyword was used until Tor 0.1.2.5-alpha for non-critical future
extensions. All implementations MUST ignore any item of the form "opt
keyword ....." when they would not recognize "keyword ....."; and MUST
treat "opt keyword ....." as synonymous with "keyword ......" when keyword
is recognized.
Implementations before 0.1.2.5-alpha rejected any document with a
KeywordLine that started with a keyword that they didn't recognize.
Implementations MUST prefix items not recognized by older versions of Tor
with an "opt" until those versions of Tor are obsolete.
Other implementations that want to extend Tor's directory format MAY
introduce their own items. The keywords for extension items SHOULD start
with the characters "x-" or "X-", to guarantee that they will not conflict
with keywords used by future versions of Tor.
2. Router operation
ORs SHOULD generate a new router descriptor whenever any of the
following events have occurred:
- A period of time (18 hrs by default) has passed since the last
time a descriptor was generated.
- A descriptor field other than bandwidth or uptime has changed.
- Bandwidth has changed by at least a factor of 2 from the last time a
descriptor was generated, and at least a given interval of time
(20 mins by default) has passed since then.
- Its uptime has been reset (by restarting).
After generating a descriptor, ORs upload it to every directory
authority they know, by posting it to the URL
http://<hostname:port>/tor/
2.1. Router descriptor format
Every router descriptor MUST start with a "router" Item; MUST end with a
"router-signature" Item and an extra NL; and MUST contain exactly one
instance of each of the following Items: "published" "onion-key"
"signing-key" "bandwidth".
A router descriptor MAY have zero or one of each of the following Items,
but MUST NOT have more than one: "contact", "uptime", "fingerprint",
"hibernating", "read-history", "write-history", "eventdns", "platform",
"family".
Additionally, a router descriptor MAY contain any number of "accept",
"reject", and "opt" Items. Other than "router" and "router-signature",
the items may appear in any order.
The items' formats are as follows:
"router" nickname address ORPort SocksPort DirPort
Indicates the beginning of a router descriptor. "address" must be an
IPv4 address in dotted-quad format. The last three numbers indicate
the TCP ports at which this OR exposes functionality. ORPort is a port
at which this OR accepts TLS connections for the main OR protocol;
SocksPort is deprecated and should always be 0; and DirPort is the
port at which this OR accepts directory-related HTTP connections. If
any port is not supported, the value 0 is given instead of a port
number.
"bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed
Estimated bandwidth for this router, in bytes per second. The
"average" bandwidth is the volume per second that the OR is willing to
sustain over long periods; the "burst" bandwidth is the volume that
the OR is willing to sustain in very short intervals. The "observed"
value is an estimate of the capacity this server can handle. The
server remembers the max bandwidth sustained output over any ten
second period in the past day, and another sustained input. The
"observed" value is the lesser of these two numbers.
"platform" string
A human-readable string describing the system on which this OR is
running. This MAY include the operating system, and SHOULD include
the name and version of the software implementing the Tor protocol.
"published" YYYY-MM-DD HH:MM:SS
The time, in GMT, when this descriptor was generated.
"fingerprint"
A fingerprint (a HASH_LEN-byte SHA1 hash of the ASN.1-encoded public key,
encoded in hex, with a single space after every 4 characters) for this router's
identity key. A descriptor is considered invalid (and MUST be
rejected) if the fingerprint line does not match the public key.
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should
be marked with "opt" until earlier versions of Tor are obsolete.]
"hibernating" 0|1
If the value is 1, then the Tor server was hibernating when the
descriptor was published, and shouldn't be used to build circuits.
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should be
marked with "opt" until earlier versions of Tor are obsolete.]
"uptime"
The number of seconds that this OR process has been running.
"onion-key" NL a public key in PEM format
This key is used to encrypt EXTEND cells for this OR. The key MUST be
accepted for at least 1 week after any new key is published in a
subsequent descriptor.
"signing-key" NL a public key in PEM format
The OR's long-term identity key.
"accept" exitpattern
"reject" exitpattern
These lines describe the rules that an OR follows when
deciding whether to allow a new stream to a given address. The
'exitpattern' syntax is described below. The rules are considered in
order; if no rule matches, the address will be accepted. For clarity,
the last such entry SHOULD be accept *:* or reject *:*.
"router-signature" NL Signature NL
The "SIGNATURE" object contains a signature of the PKCS1-padded
hash of the entire router descriptor, taken from the beginning of the
"router" line, through the newline after the "router-signature" line.
The router descriptor is invalid unless the signature is performed
with the router's identity key.
"contact" info NL
Describes a way to contact the server's administrator, preferably
including an email address and a PGP key fingerprint.
"family" names NL
'Names' is a space-separated list of server nicknames or
hexdigests. If two ORs list one another in their "family" entries,
then OPs should treat them as a single OR for the purpose of path
selection.
For example, if node A's descriptor contains "family B", and node B's
descriptor contains "family A", then node A and node B should never
be used on the same circuit.
"read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
"write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
Declare how much bandwidth the OR has used recently. Usage is divided
into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field
defines the end of the most recent interval. The numbers are the
number of bytes used in the most recent intervals, ordered from
oldest to newest.
[We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
be marked with "opt" until earlier versions of Tor are obsolete.]
"eventdns" bool NL
Declare whether this version of Tor is using the newer enhanced
dns logic. Versions of Tor without eventdns SHOULD NOT be used for
reverse hostname lookups.
[All versions of Tor before 0.1.2.2-alpha should be assumed to have
this option set to 0 if it is not present. All Tor versions at
0.1.2.2-alpha or later should be assumed to have this option set to
1 if it is not present. Until 0.1.2.1-alpha-dev, this option was
not generated, even when eventdns was in use. Versions of Tor
before 0.1.2.1-alpha-dev did not parse this option, so it should be
marked "opt". With 0.2.0.1-alpha, the old 'dnsworker' logic has
been removed, rendering this option of historical interest only.]
2.2. Nonterminals in router descriptors
nickname ::= between 1 and 19 alphanumeric characters, case-insensitive.
hexdigest ::= a '$', followed by 20 hexadecimal characters.
[Represents a server by the digest of its identity key.]
exitpattern ::= addrspec ":" portspec
portspec ::= "*" | port | port "-" port
port ::= an integer between 1 and 65535, inclusive.
[Some implementations incorrectly generate ports with value 0.
Implementations SHOULD accept this, and SHOULD NOT generate it.]
addrspec ::= "*" | ip4spec | ip6spec
ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
ip4 ::= an IPv4 address in dotted-quad format
ip4mask ::= an IPv4 mask in dotted-quad format
num_ip4_bits ::= an integer between 0 and 32
ip6spec ::= ip6 | ip6 "/" num_ip6_bits
ip6 ::= an IPv6 address, surrounded by square brackets.
num_ip6_bits ::= an integer between 0 and 128
bool ::= "0" | "1"
Ports are required; if they are not included in the router
line, they must appear in the "ports" lines.
3. Network status format
Directory authorities generate, sign, and compress network-status
documents. Directory servers SHOULD generate a fresh network-status
document when the contents of such a document would be different from the
last one generated, and some time (at least one second, possibly longer)
has passed since the last one was generated.
The network status document contains a preamble, a set of router status
entries, and a signature, in that order.
We use the same meta-format as used for directories and router descriptors
in "tor-spec.txt". Implementations MAY insert blank lines
for clarity between sections; these blank lines are ignored.
Implementations MUST NOT depend on blank lines in any particular location.
As used here, "whitespace" is a sequence of 1 or more tab or space
characters.
The preamble contains:
"network-status-version" -- A document format version. For this
specification, the version is "2".
"dir-source" -- The authority's hostname, current IP address, and
directory port, all separated by whitespace.
"fingerprint" -- A base16-encoded hash of the signing key's
fingerprint, with no additional spaces added.
"contact" -- An arbitrary string describing how to contact the
directory server's administrator. Administrators should include at
least an email address and a PGP fingerprint.
"dir-signing-key" -- The directory server's public signing key.
"client-versions" -- A comma-separated list of recommended client
versions.
"server-versions" -- A comma-separated list of recommended server
versions.
"published" -- The publication time for this network-status object.
"dir-options" -- A set of flags, in any order, separated by whitespace:
"Names" if this directory authority performs name bindings.
"Versions" if this directory authority recommends software versions.
"BadExits" if the directory authority flags nodes that it believes
are performing incorrectly as exit nodes.
"BadDirectories" if the directory authority flags nodes that it
believes are performing incorrectly as directory caches.
The dir-options entry is optional. The "-versions" entries are required if
the "Versions" flag is present. The other entries are required and must
appear exactly once. The "network-status-version" entry must appear first;
the others may appear in any order. Implementations MUST ignore
additional arguments to the items above, and MUST ignore unrecognized
flags.
For each router, the router entry contains: (This format is designed for
conciseness.)
"r" -- followed by the following elements, in order, separated by
whitespace:
- The OR's nickname,
- A hash of its identity key, encoded in base64, with trailing =
signs removed.
- A hash of its most recent descriptor, encoded in base64, with
trailing = signs removed. (The hash is calculated as for
computing the signature of a descriptor.)
- The publication time of its most recent descriptor, in the form
YYYY-MM-DD HH:MM:SS, in GMT.
- An IP address
- An OR port
- A directory port (or "0" for none)
"s" -- A series of whitespace-separated status flags, in any order:
"Authority" if the router is a directory authority.
"BadExit" if the router is believed to be useless as an exit node
(because its ISP censors it, because it is behind a restrictive
proxy, or for some similar reason).
"BadDirectory" if the router is believed to be useless as a
directory cache (because its directory port isn't working,
its bandwidth is always throttled, or for some similar
reason).
"Exit" if the router is useful for building general-purpose exit
circuits.
"Fast" if the router is suitable for high-bandwidth circuits.
"Guard" if the router is suitable for use as an entry guard.
"Named" if the router's identity-nickname mapping is canonical,
and this authority binds names.
"Stable" if the router is suitable for long-lived circuits.
"Running" if the router is currently usable.
"Valid" if the router has been 'validated'.
"V2Dir" if the router implements this protocol.
"v" -- The version of the Tor protocol that this server is running. If
the value begins with "Tor" SP, the rest of the string is a Tor
version number, and the protocol is "The Tor protocol as supported
by the given version of Tor." Otherwise, if the value begins with
some other string, Tor has upgraded to a more sophisticated
protocol versioning system, and the protocol is "a version of the
Tor protocol more recent than any we recognize."
The "r" entry for each router must appear first and is required. The
"s" entry is optional (see Section 3.1 below for how the flags are
decided). Unrecognized flags on the "s" line and extra elements
on the "r" line must be ignored. The "v" line is optional; it was not
supported until 0.1.2.5-alpha, and it must be preceded with an "opt"
until all earlier versions of Tor are obsolete.
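The "r" line format above can be sketched as a small parser. This is an illustrative Python sketch, not part of the spec; the dictionary field names (and the sample nickname and hashes) are made up for the example.

```python
def parse_r_line(line):
    """Parse a network-status "r" line into a dict.

    Field order per the spec: nickname, identity hash (base64,
    trailing '=' removed), descriptor hash (base64, trailing '='
    removed), published date, published time, IP, ORPort, DirPort.
    Field names here are illustrative only.
    """
    parts = line.split()
    if len(parts) != 9 or parts[0] != "r":
        raise ValueError("malformed r line")
    return {
        "nickname": parts[1],
        "identity": parts[2],
        "digest": parts[3],
        "published": parts[4] + " " + parts[5],  # YYYY-MM-DD HH:MM:SS, GMT
        "ip": parts[6],
        "or_port": int(parts[7]),
        "dir_port": int(parts[8]),  # "0" means no directory port
    }

# Hypothetical example entry (nickname and hashes invented):
entry = parse_r_line(
    "r moria1 SEGKfhmOTjlfhamA4Jhj4Qno0vc ZcCHO9LoGGNsJgPS9IkyVpI5E "
    "2007-04-01 12:00:00 128.31.0.34 9101 9131")
```

Per the rules below, a real parser must also tolerate (and ignore) extra elements on the "r" line.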
The signature section contains:
"directory-signature" nickname-of-dirserver NL Signature
Signature is a signature of this network-status document
(the document up until the signature, including the line
"directory-signature <nick>\n"), using the directory authority's
signing key.
We compress the network status list with zlib before transmitting it.
3.1. Establishing server status
(This section describes how directory authorities choose which status
flags to apply to routers, as of Tor 0.1.1.18-rc. Later directory
authorities MAY do things differently, so long as clients keep working
well. Clients MUST NOT depend on the exact behaviors in this section.)
In the below definitions, a router is considered "active" if it is
running, valid, and not hibernating.
"Valid" -- a router is 'Valid' if it is running a version of Tor not
known to be broken, and the directory authority has not blacklisted
it as suspicious.
"Named" -- Directory authority administrators may decide to support name
binding. If they do, then they must maintain a file of
nickname-to-identity-key mappings, and try to keep this file consistent
with other directory authorities. If they don't, they act as clients, and
report bindings made by other directory authorities (name X is bound to
identity Y if at least one binding directory lists it, and no directory
binds X to some other Y'.) A router is called 'Named' if the router
believes the given name should be bound to the given key.
"Running" -- A router is 'Running' if the authority managed to connect to
it successfully within the last 30 minutes.
"Stable" -- A router is 'Stable' if it is active, and either its
uptime is at least the median uptime for known active routers, or
its uptime is at least 30 days. Routers are never called stable if
they are running a version of Tor known to drop circuits stupidly.
(0.1.1.10-alpha through 0.1.1.16-rc are stupid this way.)
"Fast" -- A router is 'Fast' if it is active, and its bandwidth is
in the top 7/8ths for known active routers.
"Guard" -- A router is a possible 'Guard' if it is 'Stable' and its
bandwidth is above median for known active routers. If the total
bandwidth of active non-BadExit Exit servers is less than one third
of the total bandwidth of all active servers, no Exit is listed as
a Guard.
"Authority" -- A router is called an 'Authority' if the authority
generating the network-status document believes it is an authority.
"V2Dir" -- A router supports the v2 directory protocol if it has an open
directory port, and it is running a version of the directory protocol that
supports the functionality clients need. (Currently, this is
0.1.1.9-alpha or later.)
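The threshold rules for 'Stable', 'Fast', and 'Guard' above can be sketched as follows. This is a simplified illustration, assuming each active router is summarized by an uptime (seconds) and a bandwidth value; it omits the broken-version exclusion for Stable and the exit-bandwidth condition for Guard.

```python
def median(xs):
    s = sorted(xs)
    return s[len(s) // 2]  # upper median for even-length lists

def assign_flags(routers):
    """routers: list of dicts with 'uptime' and 'bandwidth' for active
    routers. Returns a flag set per router (simplified sketch)."""
    med_uptime = median([r["uptime"] for r in routers])
    med_bw = median([r["bandwidth"] for r in routers])
    # "top 7/8ths" == at or above the value 1/8th from the bottom
    bws = sorted(r["bandwidth"] for r in routers)
    fast_cutoff = bws[len(bws) // 8]
    flags = []
    for r in routers:
        f = set()
        if r["bandwidth"] >= fast_cutoff:
            f.add("Fast")
        if r["uptime"] >= med_uptime or r["uptime"] >= 30 * 86400:
            f.add("Stable")
        if "Stable" in f and r["bandwidth"] > med_bw:
            f.add("Guard")
        flags.append(f)
    return flags

routers = [{"uptime": u, "bandwidth": b}
           for u, b in zip([100, 200, 300, 400, 500, 600, 700, 800],
                           [1, 2, 3, 4, 5, 6, 7, 8])]
flags = assign_flags(routers)
```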
Directory server administrators may label some servers or IPs as
blacklisted, and elect not to include them in their network-status lists.
Authorities SHOULD 'disable' any servers in excess of 3 on any single IP.
When there are more than 3 to choose from, authorities should first prefer
authorities to non-authorities, then prefer Running to non-Running, and
then prefer high-bandwidth to low-bandwidth. To 'disable' a server, the
authority *should* advertise it without the Running or Valid flag.
Thus, the network-status list includes all non-blacklisted,
non-expired, non-superseded descriptors.
4. Directory server operation
All directory authorities and directory mirrors ("directory servers")
implement this section, except as noted.
4.1. Accepting uploads (authorities only)
When a router posts a signed descriptor to a directory authority, the
authority first checks whether it is well-formed and correctly
self-signed. If it is, the authority next verifies that the nickname
in question is not already assigned to a router with a different
public key.
Finally, the authority MAY check that the router is not blacklisted
because of its key, IP, or another reason.
If the descriptor passes these tests, and the authority does not already
have a descriptor for a router with this public key, it accepts the
descriptor and remembers it.
If the authority _does_ have a descriptor with the same public key, the
newly uploaded descriptor is remembered if its publication time is more
recent than the most recent old descriptor for that router, and either:
- There are non-cosmetic differences between the old descriptor and the
new one.
- Enough time has passed between the descriptors' publication times.
(Currently, 12 hours.)
Differences between router descriptors are "non-cosmetic" if they would be
sufficient to force an upload as described in section 2 above.
Note that the "cosmetic difference" test only applies to uploaded
descriptors, not to descriptors that the authority downloads from other
authorities.
4.2. Downloading network-status documents (authorities and caches)
All directory servers (authorities and mirrors) try to keep a fresh
set of network-status documents from every authority. To do so,
every 5 minutes, each authority asks every other authority for its
most recent network-status document. Every 15 minutes, each mirror
picks a random authority and asks it for the most recent network-status
documents for all the authorities the authority knows about (including
the chosen authority itself).
Directory servers and mirrors remember and serve the most recent
network-status document they have from each authority. Other
network-status documents don't need to be stored. If the most recent
network-status document is over 10 days old, it is discarded anyway.
Mirrors SHOULD store and serve network-status documents from authorities
they don't recognize, but SHOULD NOT use such documents for any other
purpose. Mirrors SHOULD discard network-status documents older than 48
hours.
4.3. Downloading and storing router descriptors (authorities and caches)
Periodically (currently, every 10 seconds), directory servers check
whether there are any specific descriptors (as identified by descriptor
hash in a network-status document) that they do not have and that they
are not currently trying to download.
If so, the directory server launches requests to the authorities for these
descriptors, such that each authority is only asked for descriptors listed
in its most recent network-status. When more than one authority lists the
descriptor, we choose which to ask at random.
If one of these downloads fails, we do not try to download that descriptor
from the authority that failed to serve it again unless we receive a newer
network-status from that authority that lists the same descriptor.
Directory servers must potentially cache multiple descriptors for each
router. Servers must not discard any descriptor listed by any current
network-status document from any authority. If there is enough space to
store additional descriptors, servers SHOULD try to hold those which
clients are likely to download the most. (Currently, this is judged
based on the interval for which each descriptor seemed newest.)
Authorities SHOULD NOT download descriptors for routers that they would
immediately reject for reasons listed in 3.1.
4.4. HTTP URLs
"Fingerprints" in these URLs are base-16-encoded SHA1 hashes.
The authoritative network-status published by a host should be available at:
http://<hostname>/tor/status/authority.z
The network-status published by a host with fingerprint
<F> should be available at:
http://<hostname>/tor/status/fp/<F>.z
The network-status documents published by hosts with fingerprints
<F1>,<F2>,<F3> should be available at:
http://<hostname>/tor/status/fp/<F1>+<F2>+<F3>.z
The most recent network-status documents from all known authorities,
concatenated, should be available at:
http://<hostname>/tor/status/all.z
The most recent descriptor for a server whose identity key has a
fingerprint of <F> should be available at:
http://<hostname>/tor/server/fp/<F>.z
The most recent descriptors for servers with identity fingerprints
<F1>,<F2>,<F3> should be available at:
http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z
(NOTE: Implementations SHOULD NOT download descriptors by identity key
fingerprint. This allows a corrupted server (in collusion with a cache) to
provide a unique descriptor to a client, and thereby partition that client
from the rest of the network.)
The server descriptor with (descriptor) digest <D> (in hex) should be
available at:
http://<hostname>/tor/server/d/<D>.z
The most recent descriptors with digests <D1>,<D2>,<D3> should be
available at:
http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z
The most recent descriptor for this server should be at:
http://<hostname>/tor/server/authority.z
[Nothing in the Tor protocol uses this resource yet, but it is useful
for debugging purposes. Also, the official Tor implementations
(starting at 0.1.1.x) use this resource to test whether a server's
own DirPort is reachable.]
A concatenated set of the most recent descriptors for all known servers
should be available at:
http://<hostname>/tor/server/all.z
For debugging, directories SHOULD expose non-compressed objects at URLs like
the above, but without the final ".z".
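Building the multi-object fetch URLs above is a matter of joining fingerprints or digests with "+". A minimal sketch (the hostname is a placeholder, not from the spec):

```python
def status_url(host, fingerprints):
    """URL for network-status documents by authority fingerprint
    (base16-encoded SHA1, uppercase preferred)."""
    return "http://%s/tor/status/fp/%s.z" % (host, "+".join(fingerprints))

def server_by_digest_url(host, digests):
    """URL for server descriptors by descriptor digest (hex). Preferred
    over fetching by identity fingerprint (see the NOTE above)."""
    return "http://%s/tor/server/d/%s.z" % (host, "+".join(digests))
```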
Clients MUST handle compressed concatenated information in two forms:
- A concatenated list of zlib-compressed objects.
- A zlib-compressed concatenated list of objects.
Directory servers MAY generate either format: the former requires less
CPU, but the latter requires less bandwidth.
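A client that must accept both forms can decompress zlib streams in a loop, using the decompressor's unused trailing data to detect a following stream. A sketch using Python's standard zlib module:

```python
import zlib

def decompress_concatenated(data):
    """Handle both spec forms: a single zlib stream containing
    concatenated objects, or several zlib streams concatenated
    back to back. Returns the concatenated plaintext either way."""
    out = []
    while data:
        d = zlib.decompressobj()
        out.append(d.decompress(data))
        out.append(d.flush())
        data = d.unused_data  # empty unless another stream follows
    return b"".join(out)
```

Either way the client ends up with the same concatenated object list, which is why servers are free to generate whichever format is cheaper for them.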
Clients SHOULD use upper case letters (A-F) when base16-encoding
fingerprints. Servers MUST accept both upper and lower case fingerprints
in requests.
5. Client operation: downloading information
Every Tor instance that is not a directory server (that is, one that does
not have a DirPort set) implements this section.
5.1. Downloading network-status documents
Each client maintains an ordered list of directory authorities.
Insofar as possible, clients SHOULD all use the same ordered list.
For each network-status document a client has, it keeps track of its
publication time *and* the time when the client retrieved it. Clients
consider a network-status document "live" if it was published within the
last 24 hours.
Clients try to have a live network-status document from *every*
authority, and try to periodically get new network-status documents from
each authority in rotation as follows:
If a client is missing a live network-status document for any
authority, it tries to fetch it from a directory cache. On failure,
the client waits briefly, then tries that network-status document
again from another cache. The client does not build circuits until it
has live network-status documents from more than half the authorities
it trusts, and it has descriptors for more than 1/4 of the routers
that it believes are running.
If the most recently _retrieved_ network-status document is over 30
minutes old, the client attempts to download a network-status document.
When choosing which documents to download, clients treat their list of
directory authorities as a circular ring, and begin with the authority
appearing immediately after the authority for their most recently
retrieved network-status document. If this attempt fails (either it
fails to download at all, or the one it gets is not as good as the
one it has), the client retries at other caches several times, before
moving on to the next network-status document in sequence.
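The circular-ring selection above can be sketched in a few lines. This assumes clients share the same ordered authority list, as recommended in 5.1:

```python
def next_authority_to_try(authorities, last_retrieved_from):
    """Treat the ordered authority list as a ring: start with the
    authority immediately after the one whose network-status was
    most recently retrieved."""
    i = authorities.index(last_retrieved_from)
    return authorities[(i + 1) % len(authorities)]
```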
Clients discard all network-status documents over 24 hours old.
If enough mirrors (currently 4) claim not to have a given network status,
we stop trying to download that authority's network-status, until we
download a new network-status that makes us believe that the authority in
question is running. Clients should wait a little longer after each
failure.
Clients SHOULD try to batch as many network-status requests as possible
into each HTTP GET.
(Note: clients can and should pick caches based on the network-status
information they have: once they have first fetched network-status info
from an authority, they should not need to go to the authority directly
again.)
5.2. Downloading and storing router descriptors
Clients try to have the best descriptor for each router. A descriptor is
"best" if:
* It is the most recently published descriptor listed for that router
by at least two network-status documents.
OR,
* No descriptor for that router is listed by two or more
network-status documents, and it is the most recently published
descriptor listed by any network-status document.
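The two-tier "best" rule above can be sketched as follows, assuming each candidate descriptor is summarized by its publication time and the number of network-status documents listing it:

```python
def best_descriptor(listings):
    """listings: dict mapping descriptor digest -> (published_time,
    number of network-status documents listing it). Returns the
    digest of the "best" descriptor per the rules above."""
    # Prefer descriptors listed by at least two network-statuses.
    multi = {d: t for d, (t, n) in listings.items() if n >= 2}
    pool = multi if multi else {d: t for d, (t, n) in listings.items()}
    # Among the eligible pool, take the most recently published.
    return max(pool, key=pool.get)
```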
Periodically (currently every 10 seconds) clients check whether there are
any "downloadable" descriptors. A descriptor is downloadable if:
- It is the "best" descriptor for some router.
- The descriptor was published at least 10 minutes in the past.
(This prevents clients from trying to fetch descriptors that the
mirrors have probably not yet retrieved and cached.)
- The client does not currently have it.
- The client is not currently trying to download it.
- The client would not discard it immediately upon receiving it.
- The client thinks it is running and valid (see 6.1 below).
If at least 16 known routers have downloadable descriptors, or if
enough time (currently 10 minutes) has passed since the last time the
client tried to download descriptors, it launches requests for all
downloadable descriptors, as described in 5.3 below.
When a descriptor download fails, the client notes it, and does not
consider the descriptor downloadable again until a certain amount of time
has passed. (Currently 0 seconds for the first failure, 60 seconds for the
second, 5 minutes for the third, 10 minutes for the fourth, and 1 day
thereafter.) Periodically (currently once an hour) clients reset the
failure count.
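The failure-backoff schedule above (0 seconds, 60 seconds, 5 minutes, 10 minutes, then 1 day) can be written as a lookup:

```python
RETRY_DELAYS = [0, 60, 5 * 60, 10 * 60]  # then 1 day thereafter

def retry_delay(failures):
    """Seconds before a descriptor is considered downloadable again,
    given how many times its download has failed (counts reset
    periodically, currently once an hour)."""
    if failures <= 0:
        return 0
    if failures <= len(RETRY_DELAYS):
        return RETRY_DELAYS[failures - 1]
    return 86400
```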
No descriptors are downloaded until the client has downloaded more than
half of the network-status documents.
Clients retain the most recent descriptor they have downloaded for each
router so long as it is not too old (currently, 48 hours), OR so long as
it is recommended by at least one networkstatus AND no "better"
descriptor has been downloaded. [Versions of Tor before 0.1.2.3-alpha
would discard descriptors simply for being published too far in the past.]
[The code seems to discard descriptors in all cases after they're 5
days old. True? -RD]
5.3. Managing downloads
When a client has no live network-status documents, it downloads
network-status documents from a randomly chosen authority. In all other
cases, the client downloads from mirrors randomly chosen from among those
believed to be V2 directory servers. (This information comes from the
network-status documents; see 6 below.)
When downloading multiple router descriptors, the client chooses multiple
mirrors so that:
- At least 3 different mirrors are used, except when this would result
in more than one request for under 4 descriptors.
- No more than 128 descriptors are requested from a single mirror.
- Otherwise, as few mirrors as possible are used.
After choosing mirrors, the client divides the descriptors among them
randomly.
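One simplified reading of the mirror-count constraints above: use enough mirrors that no mirror is asked for more than 128 descriptors, and prefer at least 3 mirrors unless splitting 3 ways would produce more than one request of fewer than 4 descriptors. This sketch approximates the latter condition by requiring 3 mirrors only when there are at least 12 descriptors:

```python
import math

def num_mirrors(n_descriptors):
    """How many mirrors to spread a descriptor fetch across
    (simplified reading of the constraints, not normative)."""
    if n_descriptors == 0:
        return 0
    # No more than 128 descriptors per mirror.
    needed = math.ceil(n_descriptors / 128)
    # Prefer at least 3 mirrors when each request can still hold
    # at least 4 descriptors; otherwise use as few as possible.
    if n_descriptors >= 3 * 4:
        needed = max(needed, 3)
    return needed
```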
After receiving any response, the client MUST discard any network-status
documents and descriptors that it did not request.
6. Using directory information
Everyone besides directory authorities uses the approaches in this section
to decide which servers to use and what their keys are likely to be.
(Directory authorities just believe their own opinions, as in 3.1 above.)
6.1. Choosing routers for circuits.
Tor implementations only pay attention to "live" network-status documents.
A network status is "live" if it is the most recently downloaded network
status document for a given directory server, and the server is a
directory server trusted by the client, and the network-status document is
no more than 1 day old.
For time-sensitive information, Tor implementations focus on "recent"
network-status documents. A network status is "recent" if it is live, and
if it was published in the last 60 minutes. (If there are fewer
than 3 such documents, the most recently published 3 are "recent." If
there are fewer than 3 in all, all are "recent.")
Circuits SHOULD NOT be built until the client has enough directory
information: network-statuses (or failed attempts to download
network-statuses) for all authorities, network-statuses for more than
half of the authorities, and descriptors for at least 1/4 of the servers
believed to be running.
A server is "listed" if it is included by more than half of the live
network status documents. Clients SHOULD NOT use unlisted servers.
Clients believe the flags "Valid", "Exit", "Fast", "Guard", "Stable", and
"V2Dir" about a given router when they are asserted by more than half of
the live network-status documents. Clients believe the flag "Running" if
it is listed by more than half of the recent network-status documents.
These flags are used as follows:
- Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless
requested to do so.
- Clients SHOULD NOT use non-'Fast' routers for any purpose other than
very-low-bandwidth circuits (such as introduction circuits).
- Clients SHOULD NOT use non-'Stable' routers for circuits that are
likely to need to be open for a very long time (such as those used for
IRC or SSH connections).
- Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard
nodes.
- Clients SHOULD NOT download directory information from non-'V2Dir'
caches.
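The majority-vote rule for believing flags can be sketched as follows. (A real client applies the "Running" vote over recent documents rather than all live ones; that distinction is omitted here.)

```python
def believed_flags(statuses):
    """statuses: one flag set per live network-status document, for a
    single router. A flag is believed when asserted by more than
    half of the documents."""
    counts = {}
    for s in statuses:
        for flag in s:
            counts[flag] = counts.get(flag, 0) + 1
    return {f for f, n in counts.items() if n * 2 > len(statuses)}
```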
6.2. Managing naming
In order to provide human-memorable names for individual server
identities, some directory servers bind names to IDs. Clients handle
names in two ways:
When a client encounters a name it has not mapped before:
If all the live "Naming" network-status documents the client has
claim that the name binds to some identity ID, and the client has at
least three live network-status documents, the client maps the name to
ID.
When a user tries to refer to a router with a name that does not have a
mapping under the above rules, the implementation SHOULD warn the user.
After giving the warning, the implementation MAY use a router that at
least one Naming authority maps the name to, so long as no other naming
authority maps that name to a different router. If no Naming authority
maps the name to a router, the implementation MAY use any router that
advertises the name.
Not every router needs a nickname. When a router doesn't configure a
nickname, it publishes with the default nickname "Unnamed". Authorities
SHOULD NOT ever mark a router with this nickname as Named; client software
SHOULD NOT ever use a router in response to a user request for a router
called "Unnamed".
6.3. Software versions
An implementation of Tor SHOULD warn when it has fetched (or has
attempted to fetch and failed four consecutive times) a network-status
for each authority, and it is running a software version
not listed on more than half of the live "Versioning" network-status
documents.
6.4. Warning about a router's status.
If a router tries to publish its descriptor to a Naming authority
that has its nickname mapped to another key, the router SHOULD
warn the operator that it is either using the wrong key or is using
an already claimed nickname.
If a router has fetched (or attempted to fetch and failed four
consecutive times) a network-status for every authority, and at
least one of the authorities is "Naming", and no live "Naming"
authorities publish a binding for the router's nickname, the
router MAY remind the operator that the chosen nickname is not
bound to this key at the authorities, and suggest contacting the
authority operators.
...
6.5. Router protocol versions
A client should believe that a router supports a given feature if that
feature is supported by the router or protocol versions in more than half
of the live networkstatus's "v" entries for that router. In other words,
if the "v" entries for some router are:
v Tor 0.0.8pre1 (from authority 1)
v Tor 0.1.2.11 (from authority 2)
v FutureProtocolDescription 99 (from authority 3)
then the client should believe that the router supports any feature
supported by 0.1.2.11.
This is currently equivalent to believing the median declared version for
a router in all live networkstatuses.
7. Standards compliance
All clients and servers MUST support HTTP 1.0.
7.1. HTTP headers
Servers MAY set the Content-Length: header. Servers SHOULD set
Content-Encoding to "deflate" or "identity".
Servers MAY include an X-Your-Address-Is: header, whose value is the
apparent IP address of the client connecting to them (as a dotted quad).
For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD
report the IP from which the circuit carrying the BEGIN_DIR stream reached
them. [Servers before version 0.1.2.5-alpha reported 127.0.0.1 for all
BEGIN_DIR-tunneled connections.]
Servers SHOULD disable caching of multiple network statuses or multiple
router descriptors. Servers MAY enable caching of single descriptors,
single network statuses, the list of all router descriptors, a v1
directory, or a v1 running routers document. XXX mention times.
7.2. HTTP status codes
XXX We should write down what return codes dirservers send in what situations.

Tor Path Specification
Roger Dingledine
Nick Mathewson
Note: This is an attempt to specify Tor as currently implemented. Future
versions of Tor will implement improved algorithms.
This document tries to cover how Tor chooses to build circuits and assign
streams to circuits. Other implementations MAY take other approaches, but
implementors should be aware of the anonymity and load-balancing implications
of their choices.
THIS SPEC ISN'T DONE YET.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
RFC 2119.
1. General operation
Tor begins building circuits as soon as it has enough directory
information to do so (see section 5 of dir-spec.txt). Some circuits are
built preemptively because we expect to need them later (for user
traffic), and some are built because of immediate need (for user traffic
that no current circuit can handle, for testing the network or our
reachability, and so on).
When a client application creates a new stream (by opening a SOCKS
connection or launching a resolve request), we attach it to an appropriate
open circuit if one exists, or wait if an appropriate circuit is
in-progress. We launch a new circuit only
if no current circuit can handle the request. We rotate circuits over
time to avoid some profiling attacks.
To build a circuit, we choose all the nodes we want to use, and then
construct the circuit. Sometimes, when we want a circuit that ends at a
given hop, and we have an appropriate unused circuit, we "cannibalize" the
existing circuit and extend it to the new terminus.
These processes are described in more detail below.
This document describes Tor's automatic path selection logic only; path
selection can be overridden by a controller (with the EXTENDCIRCUIT and
ATTACHSTREAM commands). Paths constructed through these means may
violate some constraints given below.
1.1. Terminology
A "path" is an ordered sequence of nodes, not yet built as a circuit.
A "clean" circuit is one that has not yet been used for any traffic.
A "fast" or "stable" or "valid" node is one that has the 'Fast' or
'Stable' or 'Valid' flag
set respectively, based on our current directory information. A "fast"
or "stable" circuit is one consisting only of "fast" or "stable" nodes.
In an "exit" circuit, the final node is chosen based on waiting stream
requests if any, and in any case it avoids nodes with exit policy of
"reject *:*". An "internal" circuit, on the other hand, is one where
the final node is chosen just like a middle node (ignoring its exit
policy).
A "request" is a client-side stream or DNS resolve that needs to be
served by a circuit.
A "pending" circuit is one that we have started to build, but which has
not yet completed.
A circuit or path "supports" a request if it is okay to use the
circuit/path to fulfill the request, according to the rules given below.
A circuit or path "might support" a request if some aspect of the request
is unknown (usually its target IP), but we believe the path probably
supports the request according to the rules given below.
1.2. A server's bandwidth
Old versions of Tor did not report bandwidths in network status
documents, so clients had to learn them from the routers' advertised
server descriptors.
For versions of Tor prior to 0.2.1.17-rc, everywhere below where we
refer to a server's "bandwidth", we mean its clipped advertised
bandwidth, computed by taking the smaller of the 'rate' and
'observed' arguments to the "bandwidth" element in the server's
descriptor. If a router's advertised bandwidth is greater than
MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clip to that
value.
For more recent versions of Tor, we take the bandwidth value declared
in the consensus, and fall back to the clipped advertised bandwidth
only if the consensus does not have bandwidths listed.
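The pre-0.2.1.17-rc clipped advertised bandwidth is simply the minimum of the two descriptor values and the cap. A sketch (assuming "10 MB/s" means 10*2^20 bytes per second; the exact constant is an assumption):

```python
MAX_BELIEVABLE_BANDWIDTH = 10 * 1024 * 1024  # 10 MB/s (assumed 2^20 MB)

def clipped_advertised_bandwidth(rate, observed):
    """The smaller of the 'rate' and 'observed' arguments to the
    descriptor's "bandwidth" element, clipped to the believable
    maximum."""
    return min(rate, observed, MAX_BELIEVABLE_BANDWIDTH)
```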
2. Building circuits
2.1. When we build
2.1.1. Clients build circuits preemptively
When running as a client, Tor tries to maintain at least a certain
number of clean circuits, so that new streams can be handled
quickly. To increase the likelihood of success, Tor tries to
predict what circuits will be useful by choosing from among nodes
that support the ports we have used in the recent past (by default
one hour). Specifically, on startup Tor tries to maintain one clean
fast exit circuit that allows connections to port 80, and at least
two fast clean stable internal circuits in case we get a resolve
request or hidden service request (at least three if we _run_ a
hidden service).
After that, Tor will adapt the circuits that it preemptively builds
based on the requests it sees from the user: it tries to have two fast
clean exit circuits available for every port seen within the past hour
(each circuit can be adequate for many predicted ports -- it doesn't
need two separate circuits for each port), and it tries to have the
above internal circuits available if we've seen resolves or hidden
service activity within the past hour. If there are 12 or more clean
circuits open, it doesn't open more even if it has more predictions.
Only stable circuits can "cover" a port that is listed in the
LongLivedPorts config option. Similarly, hidden service requests
to ports listed in LongLivedPorts make us create stable internal
circuits.
Note that if there are no requests from the user for an hour, Tor
will predict no use and build no preemptive circuits.
The Tor client SHOULD NOT store its list of predicted requests to a
persistent medium.
2.1.2. Clients build circuits on demand
Additionally, when a client request exists that no circuit (built or
pending) might support, we create a new circuit to support the request.
For exit connections, we pick an exit node that will handle the
most pending requests (choosing arbitrarily among ties), launch a
circuit to end there, and repeat until every unattached request
might be supported by a pending or built circuit. For internal
circuits, we pick an arbitrary acceptable path, repeating as needed.
In some cases we can reuse an already established circuit if it's
clean; see Section 2.3 (cannibalizing circuits) for details.
2.1.3. Servers build circuits for testing reachability and bandwidth
Tor servers test reachability of their ORPort once they have
successfully built a circuit (on start and whenever their IP address
changes). They build an ordinary fast internal circuit with themselves
as the last hop. As soon as any testing circuit succeeds, the Tor
server decides it's reachable and is willing to publish a descriptor.
We launch multiple testing circuits (one at a time), until we
have NUM_PARALLEL_TESTING_CIRC (4) such circuits open. Then we
do a "bandwidth test" by sending a certain number of relay drop
cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE
total cells divided across the four circuits, but never more than
CIRCWINDOW_START (1000) cells total. This exercises both outgoing and
incoming bandwidth, and helps to jumpstart the observed bandwidth
(see dir-spec.txt).
Tor servers also test reachability of their DirPort once they have
established a circuit, but they use an ordinary exit circuit for
this purpose.
2.1.4. Hidden-service circuits
See section 4 below.
2.1.5. Rate limiting of failed circuits
If we fail to build a circuit N times in a X second period (see Section
2.3 for how this works), we stop building circuits until the X seconds
have elapsed.
XXXX
2.1.6. When to tear down circuits
XXXX
2.2. Path selection and constraints
We choose the path for each new circuit before we build it. We choose the
exit node first, followed by the other nodes in the circuit. All paths
we generate obey the following constraints:
- We do not choose the same router twice for the same path.
- We do not choose any router in the same family as another in the same
path.
- We do not choose more than one router in a given /16 subnet
(unless EnforceDistinctSubnets is 0).
- We don't choose any non-running or non-valid router unless we have
been configured to do so. By default, we are configured to allow
non-valid routers in "middle" and "rendezvous" positions.
- If we're using Guard nodes, the first node must be a Guard (see 5
below)
- XXXX Choosing the length
For "fast" circuits, we only choose nodes with the Fast flag. For
non-"fast" circuits, all nodes are eligible.
For all circuits, we weight node selection according to router bandwidth.
We also weight the bandwidth of Exit and Guard flagged nodes depending on
the fraction of total bandwidth that they make up and depending upon the
position they are being selected for.
These weights are published in the consensus, and are computed as described
in Section 3.4.3 of dir-spec.txt. They are:
Wgg - Weight for Guard-flagged nodes in the guard position
Wgm - Weight for non-flagged nodes in the guard Position
Wgd - Weight for Guard+Exit-flagged nodes in the guard Position
Wmg - Weight for Guard-flagged nodes in the middle Position
Wmm - Weight for non-flagged nodes in the middle Position
Wme - Weight for Exit-flagged nodes in the middle Position
Wmd - Weight for Guard+Exit flagged nodes in the middle Position
Weg - Weight for Guard flagged nodes in the exit Position
Wem - Weight for non-flagged nodes in the exit Position
Wee - Weight for Exit-flagged nodes in the exit Position
Wed - Weight for Guard+Exit-flagged nodes in the exit Position
Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes
Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests
Wbm - Weight for non-flagged nodes for BEGIN_DIR requests
Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests
Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
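To make the weighting concrete, here is a hedged Python sketch of bandwidth-weighted selection. The weight names follow the table above; flag combinations with no listed weight (for instance an Exit-only node in the guard position) are treated as zero, and the node/weight encodings are invented for this example:

```python
import random

def bw_weight_key(position, flags):
    """Map a position ('g', 'm', or 'e') and a node's flag set to the
    consensus weight name, e.g. ('m', {'Exit'}) -> 'Wme'."""
    if "Guard" in flags and "Exit" in flags:
        suffix = "d"
    elif "Guard" in flags:
        suffix = "g"
    elif "Exit" in flags:
        suffix = "e"
    else:
        suffix = "m"
    return "W%s%s" % (position, suffix)

def choose_node(nodes, position, weights):
    """Weighted random choice over (name, bandwidth, flags) tuples.
    Weight names missing from `weights` count as zero."""
    scored = [(name, bw * weights.get(bw_weight_key(position, flags), 0.0))
              for name, bw, flags in nodes]
    total = sum(w for _, w in scored)
    r = random.uniform(0.0, total)
    for name, w in scored:
        r -= w
        if r <= 0.0:
            return name
    return scored[-1][0]
```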
Additionally, we may be building circuits with one or more requests in
mind. Each kind of request puts certain constraints on paths:
- All service-side introduction circuits and all rendezvous paths
should be Stable.
- All connection requests for connections that we think will need to
stay open a long time require Stable circuits. Currently, Tor decides
this by examining the request's target port, and comparing it to a
list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050,
5190, 5222, 5223, 6667, 6697, 8300.)
- DNS resolves require an exit node whose exit policy is not equivalent
to "reject *:*".
- Reverse DNS resolves require a version of Tor with advertised eventdns
support (available in Tor 0.1.2.1-alpha-dev and later).
- All connection requests require an exit node whose exit policy
supports their target address and port (if known), or which "might
support it" (if the address isn't known). See 2.2.1.
- Rules for Fast? XXXXX
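The long-lived-port constraint above is a simple set membership test. A minimal sketch using the default list:

```python
# Default long-lived ports from the list above.
LONG_LIVED_PORTS = {21, 22, 706, 1863, 5050, 5190, 5222, 5223, 6667, 6697, 8300}

def needs_stable_circuit(target_port: int) -> bool:
    """A connection request requires a Stable circuit if its target
    port is on the long-lived ports list."""
    return target_port in LONG_LIVED_PORTS
```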
2.2.1. Choosing an exit
If we know what IP address we want to connect to or resolve, we can
trivially tell whether a given router will support it by simulating
its declared exit policy.
Because we often connect to addresses of the form hostname:port, we do not
always know the target IP address when we select an exit node. In these
cases, we need to pick an exit node that "might support" connections to a
given port when the target address is unknown. An exit node "might support"
such a connection if any clause that accepts any connections to that port
precedes all clauses (if any) that reject all connections to that port.
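The "might support" test can be sketched as a scan over the ordered policy clauses (illustrative Python; the clause encoding is invented for this example and is not Tor's internal representation):

```python
def might_support(policy, port):
    """Decide whether an exit "might support" a connection to `port`
    when the target address is unknown.  `policy` is an ordered list of
    (action, address_pattern, (low_port, high_port)) clauses.  Returns
    True if a clause accepting some connections to the port precedes
    every clause rejecting all connections to the port."""
    for action, addr, (lo, hi) in policy:
        if lo <= port <= hi:
            if action == "accept":
                return True   # an accepting clause comes first
            if addr == "*":
                return False  # "reject *:port" precedes any accept
    # Real policies end with a catch-all clause, so in practice the
    # loop decides; fall back to reject if none matched.
    return False
```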
Unless requested to do so by the user, we never choose an exit server
flagged as "BadExit" by more than half of the authorities who advertise
themselves as listing bad exits.
2.2.2. User configuration
Users can alter the default behavior for path selection with configuration
options.
- If "ExitNodes" is provided, then every request requires an exit node on
the ExitNodes list. (If a request is supported by no nodes on that list,
and StrictExitNodes is false, then Tor treats that request as if
ExitNodes were not provided.)
- "EntryNodes" and "StrictEntryNodes" behave analogously.
- If a user tries to connect to or resolve a hostname of the form
<target>.<servername>.exit, the request is rewritten to a request for
<target>, and the request is only supported by the exit whose nickname
or fingerprint is <servername>.
2.3. Cannibalizing circuits
If we need a circuit and have a clean one already established, in
some cases we can adapt the clean circuit for our new
purpose. Specifically,
For hidden service interactions, we can "cannibalize" a clean internal
circuit if one is available, so we don't need to build those circuits
from scratch on demand.
We can also cannibalize clean circuits when the client asks to exit
at a given node -- either via the ".exit" notation or because the
destination is running at the same location as an exit node.
2.4. Learning when to give up ("timeout") on circuit construction
Since version 0.2.2.8-alpha, Tor attempts to learn when to give up on
circuits based on network conditions.
2.4.1. Distribution choice and parameter estimation
Based on studies of build times, we found that the distribution of
circuit build times appears to be a Frechet distribution. However,
estimators and quantile functions of the Frechet distribution are
difficult to work with and slow to converge. So instead, since we
are only interested in the accuracy of the tail, we approximate
the tail of the distribution with a Pareto curve.
We calculate the parameters for a Pareto distribution fitting the data
using the estimators in equation 4 from:
http://portal.acm.org/citation.cfm?id=1647962.1648139
This is:
alpha_m = s/(ln(U(X)/Xm^n))
where s is the total number of completed circuits we have seen, and
U(X) = x_max^u * Prod_s{x_i}
with x_i as our i-th completed circuit time, x_max as the longest
completed circuit build time we have yet observed, u as the
number of unobserved timeouts that have no exact value recorded,
and n as u+s, the total number of circuits that either timeout or
complete.
Using log laws, we compute this as the sum of logs to avoid
overflow and ln(1.0+epsilon) precision issues:
alpha_m = s/(u*ln(x_max) + Sum_s{ln(x_i)} - n*ln(Xm))
This estimator is closely related to the parameters present in:
http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation
except they are adjusted to handle the fact that our samples are
right-censored at the timeout cutoff.
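As a sketch of the sum-of-logs form (illustrative Python; build times in the same units as Xm, and the function name is invented for this example):

```python
import math

def estimate_alpha(completed, num_timeouts, xm):
    """Right-censored Pareto shape estimate:
        alpha_m = s/(u*ln(x_max) + Sum_s{ln(x_i)} - n*ln(Xm))
    `completed` holds the s observed build times x_i (each >= xm),
    `num_timeouts` is u, the count of circuits censored at the
    timeout, and n = u + s."""
    s = len(completed)
    u = num_timeouts
    n = u + s
    x_max = max(completed)
    denom = (u * math.log(x_max)
             + sum(math.log(x) for x in completed)
             - n * math.log(xm))
    return s / denom
```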
Additionally, because this is not a true Pareto distribution, we alter
how Xm is computed. The Xm parameter is computed as the midpoint of the most
frequently occurring 50ms histogram bin, until the point where 1000
circuits are recorded. After this point, the weighted average of the top
'cbtnummodes' (default: 3) midpoint modes is used as Xm. All times below
this value are counted as having the midpoint value of this weighted average bin.
The timeout itself is calculated by using the Pareto Quantile function (the
inverted CDF) to give us the value on the CDF such that 80% of the mass
of the distribution is below the timeout value.
Thus, we expect that the Tor client will accept the fastest 80% of
the total number of paths on the network.
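Concretely, the Pareto CDF is F(x) = 1 - (Xm/x)^alpha, so inverting it at quantile q gives the timeout. A minimal sketch:

```python
def pareto_quantile(xm, alpha, q):
    """Inverted Pareto CDF: the build time below which a fraction q of
    circuits is expected to complete.  With the default cbtquantile of
    80, q = 0.80 and the timeout cuts off the slowest 20% of paths."""
    return xm * (1.0 - q) ** (-1.0 / alpha)
```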
2.4.2. How much data to record
From our observations, the minimum number of circuit build times for a
reasonable fit appears to be on the order of 100. However, to keep a
good fit over the long term, we store the 1000 most recent circuit build
times in a circular array.
The Tor client should build test circuits at a rate of one per
minute up until 100 circuits are built. This allows a fresh Tor to have
a CircuitBuildTimeout estimated within 1.5 hours after install,
upgrade, or network change (see below).
Timeouts are stored on disk in a histogram of 50ms bin width, the same
width used to calculate the Xm value above. This histogram must be shuffled
after being read from disk, to preserve a proper expiration of old values
after restart.
2.4.3. How to record timeouts
Circuits that pass the timeout threshold should be allowed to continue
building until a time corresponding to the point 'cbtclosequantile'
(default 95) on the Pareto curve, or 60 seconds, whichever is greater.
The actual completion times for these circuits should be recorded.
Implementations should completely abandon a circuit and record a value
as an 'unknown' timeout if the total build time exceeds this threshold.
The reason for this is that right-censored Pareto estimators begin to lose
their accuracy if more than approximately 5% of the values are censored.
Since we wish to set the cutoff at 20%, we must allow circuits to continue
building past this cutoff point up to the 95th percentile.
2.4.4. Detecting Changing Network Conditions
We attempt to detect both network connectivity loss and drastic
changes in the timeout characteristics.
We assume that we've had network connectivity loss if 3 circuits
time out and we've received no cells or TLS handshakes since those
circuits began. We then temporarily set the timeout to 60 seconds
and stop counting timeouts.
If 3 more circuits time out and the network still has not been
live within this new 60 second timeout window, we then discard
the previous timeouts during this period from our history.
To detect changing network conditions, we keep a history of
the timeout or non-timeout status of the past 20 circuits that
successfully completed at least one hop. If more than 90% of
these circuits time out, we discard all buildtimes history, reset
the timeout to 60, and then begin recomputing the timeout.
If the timeout was already 60 or higher, we double the timeout.
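Using the consensus defaults described in 2.4.5 (20 recent circuits, 18 timeouts), the reset rule can be sketched as follows (illustrative Python; the function name is invented, and the >= 18-of-20 trigger follows the cbtmaxtimeouts semantics rather than a strict "more than 90%"):

```python
def check_timeout_rate(recent, current_timeout_ms,
                       recent_count=20, max_timeouts=18):
    """`recent` is the timeout status (True = timed out) of the last
    circuits that completed at least one hop.  Returns the new timeout
    in milliseconds when a reset is warranted, else None."""
    if len(recent) < recent_count:
        return None
    if sum(recent[-recent_count:]) < max_timeouts:
        return None
    # Discard buildtime history and reset the timeout to 60 seconds,
    # doubling instead if it was already 60 or higher.
    if current_timeout_ms >= 60000:
        return current_timeout_ms * 2
    return 60000
```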
2.4.5. Consensus parameters governing behavior
Clients that implement circuit build timeout learning should obey the
following consensus parameters that govern behavior, in order to allow
us to handle bugs or other emergent behaviors due to client circuit
construction. If these parameters are not present in the consensus,
the listed default values should be used instead.
cbtdisabled
Default: 0
Min: 0
Max: 1
Effect: If 1, all CircuitBuildTime learning code should be
disabled and history should be discarded. For use in
emergency situations only.
cbtnummodes
Default: 3
Min: 1
Max: 20
Effect: This value governs how many modes to use in the weighted
average calculation of Pareto parameter Xm. A value of 3 introduces
some bias (2-5% of CDF) under ideal conditions, but allows for better
performance in the event that a client chooses guard nodes of radically
different performance characteristics.
cbtrecentcount
Default: 20
Min: 3
Max: 1000
Effect: This is the number of circuit build times to keep track of
for the following option.
cbtmaxtimeouts
Default: 18
Min: 3
Max: 10000
Effect: When this many timeouts happen in the last 'cbtrecentcount'
circuit attempts, the client should discard all of its
history and begin learning a fresh timeout value.
cbtmincircs
Default: 100
Min: 1
Max: 10000
Effect: This is the minimum number of circuits to build before
computing a timeout.
cbtquantile
Default: 80
Min: 10
Max: 99
Effect: This is the position on the quantile curve to use to set the
timeout value. It is a percent (10-99).
cbtclosequantile
Default: 95
Min: Value of cbtquantile parameter
Max: 99
Effect: This is the position on the quantile curve to use to set the
timeout value to use to actually close circuits. It is a percent
(0-99).
cbttestfreq
Default: 60
Min: 1
Max: 2147483647 (INT32_MAX)
Effect: Describes how often in seconds to build a test circuit to
gather timeout values. Only applies if less than 'cbtmincircs'
have been recorded.
cbtmintimeout
Default: 2000
Min: 500
Max: 2147483647 (INT32_MAX)
Effect: This is the minimum allowed timeout value in milliseconds.
The minimum is to prevent rounding to 0 (we only check once
per second).
cbtinitialtimeout
Default: 60000
Min: Value of cbtmintimeout
Max: 2147483647 (INT32_MAX)
Effect: This is the timeout value to use before computing a timeout,
in milliseconds.
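The table above can be captured in a small lookup helper that applies the default and clamps to the allowed range. This is a simplification: the true minimum of cbtclosequantile is the current cbtquantile value, and of cbtinitialtimeout the current cbtmintimeout value; the static bounds below stand in for those:

```python
# (default, min, max) per parameter, from the list above.
CBT_PARAMS = {
    "cbtdisabled":       (0,     0,    1),
    "cbtnummodes":       (3,     1,    20),
    "cbtrecentcount":    (20,    3,    1000),
    "cbtmaxtimeouts":    (18,    3,    10000),
    "cbtmincircs":       (100,   1,    10000),
    "cbtquantile":       (80,    10,   99),
    "cbtclosequantile":  (95,    10,   99),          # real min: cbtquantile
    "cbttestfreq":       (60,    1,    2**31 - 1),
    "cbtmintimeout":     (2000,  500,  2**31 - 1),
    "cbtinitialtimeout": (60000, 2000, 2**31 - 1),   # real min: cbtmintimeout
}

def cbt_param(name, consensus):
    """Look up a circuit-build-timeout parameter, falling back to the
    default when absent from the consensus and clamping to its range."""
    default, lo, hi = CBT_PARAMS[name]
    value = consensus.get(name, default)
    return max(lo, min(hi, value))
```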
2.5. Handling failure
If an attempt to extend a circuit fails (either because the first create
failed or a subsequent extend failed) then the circuit is torn down and is
no longer pending. (XXXX really?) Requests that might have been
supported by the pending circuit thus become unsupported, and a new
circuit needs to be constructed.
If a stream "begin" attempt fails with an EXITPOLICY error, we
decide that the exit node's exit policy is not correctly advertised,
so we treat the exit node as if it were a non-exit until we retrieve
a fresh descriptor for it.
XXXX
3. Attaching streams to circuits
When a circuit that might support a request is built, Tor tries to attach
the request's stream to the circuit and sends a BEGIN, BEGIN_DIR,
or RESOLVE relay
cell as appropriate. If the request completes unsuccessfully, Tor
considers the reason given in the CLOSE relay cell. [XXX yes, and?]
After a request has remained unattached for SocksTimeout (2 minutes
by default), Tor abandons the attempt and signals an error to the
client as appropriate (e.g., by closing the SOCKS connection).
XXX Timeouts and when Tor auto-retries.
* What stream-end-reasons are appropriate for retrying.
If no reply to BEGIN/RESOLVE, then the stream will timeout and fail.
4. Hidden-service related circuits
XXX Tracking expected hidden service use (client-side and hidserv-side)
5. Guard nodes
We use Guard nodes (also called "helper nodes" in the literature) to
prevent certain profiling attacks. Here's the risk: if we choose entry and
exit nodes at random, and an attacker controls C out of N servers
(ignoring bandwidth), then the
attacker will control the entry and exit node of any given circuit with
probability (C/N)^2. But as we make many different circuits over time,
then the probability that the attacker will see a sample of about (C/N)^2
of our traffic goes to 1. Since statistical sampling works, the attacker
can be sure of learning a profile of our behavior.
If, on the other hand, we picked an entry node and held it fixed, we would
have probability C/N of choosing a bad entry and being profiled, and
probability (N-C)/N of choosing a good entry and not being profiled.
When guard nodes are enabled, Tor maintains an ordered list of entry nodes
as our chosen guards, and stores this list persistently to disk. If a Guard
node becomes unusable, rather than replacing it, Tor adds new guards to the
end of the list. When choosing the first hop of a circuit, Tor
chooses at
random from among the first NumEntryGuards (default 3) usable guards on the
list. If there are not at least 2 usable guards on the list, Tor adds
routers until there are, or until there are no more usable routers to add.
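A minimal sketch of the selection step (illustrative Python; `is_usable` stands in for the usability checks below, and the "fewer than 2 usable" case is signalled to the caller rather than adding routers here):

```python
import random

def pick_entry_guard(guard_list, is_usable, num_entry_guards=3):
    """Choose the first hop at random from among the first
    `num_entry_guards` usable guards on the ordered, persistent list.
    Returns None when the caller must first extend the list."""
    usable = [g for g in guard_list if is_usable(g)]
    if len(usable) < 2:
        return None
    return random.choice(usable[:num_entry_guards])
```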
A guard is unusable if any of the following hold:
- it is not marked as a Guard by the networkstatuses,
- it is not marked Valid (and the user hasn't set AllowInvalid entry)
- it is not marked Running
- Tor couldn't reach it the last time it tried to connect
A guard is unusable for a particular circuit if any of the rules for path
selection in 2.2 are not met. In particular, if the circuit is "fast"
and the guard is not Fast, or if the circuit is "stable" and the guard is
not Stable, or if the guard has already been chosen as the exit node in
that circuit, Tor can't use it as a guard node for that circuit.
If the guard is excluded because of its status in the networkstatuses for
over 30 days, Tor removes it from the list entirely, preserving order.
If Tor fails to connect to an otherwise usable guard, it retries
periodically: every hour for six hours, every 4 hours for 3 days, every
18 hours for a week, and every 36 hours thereafter. Additionally, Tor
retries unreachable guards the first time it adds a new guard to the list,
since it is possible that the old guards were only marked as unreachable
because the network was unreachable or down.
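Reading the schedule's windows as consecutive (an interpretation, since the text does not state whether the one-week window begins at day 3), the retry interval looks like:

```python
def guard_retry_interval_hours(hours_since_failure):
    """Hours to wait between connection attempts to a failing guard,
    as a function of time since it first became unreachable."""
    if hours_since_failure < 6:
        return 1                      # every hour for six hours
    if hours_since_failure < 3 * 24:
        return 4                      # every 4 hours for 3 days
    if hours_since_failure < (3 + 7) * 24:
        return 18                     # every 18 hours for a week
    return 36                         # every 36 hours thereafter
```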
Tor does not add a guard persistently to the list until the first time we
have connected to it successfully.
6. Router descriptor purposes
There are currently three "purposes" supported for router descriptors:
general, controller, and bridge. Most descriptors are of type general
-- these are the ones listed in the consensus, and the ones fetched
and used in normal cases.
Controller-purpose descriptors are those delivered by the controller
and labelled as such: they will be kept around (and expire like
normal descriptors), and they can be used by the controller in its
CIRCUITEXTEND commands. Otherwise they are ignored by Tor when it
chooses paths.
Bridge-purpose descriptors are for routers that are used as bridges. See
doc/design-paper/blocking.pdf for more design explanation, or proposal
125 for specific details. Currently bridge descriptors are used in place
of normal entry guards, for Tor clients that have UseBridges enabled.
X. Old notes
X.1. Do we actually do this?
How to deal with network down.
- While all helpers are down/unreachable and there are no established
or on-the-way testing circuits, launch a testing circuit. (Do this
periodically in the same way we try to establish normal circuits
when things are working normally.)
(Testing circuits are a special type of circuit, that streams won't
attach to by accident.)
- When a testing circuit succeeds, mark all helpers up and hold
the testing circuit open.
- If a connection to a helper succeeds, close all testing circuits.
Else mark that helper down and try another.
- If the last helper is marked down and we already have a testing
circuit established, then add the first hop of that testing circuit
to the end of our helper node list, close that testing circuit,
and go back to square one. (Actually, rather than closing the
testing circuit, can we get away with converting it to a normal
circuit and beginning to use it immediately?)
[Do we actually do any of the above? If so, let's spec it. If not, let's
remove it. -NM]
X.2. A thing we could do to deal with reachability.
And as a bonus, it leads to an answer to Nick's attack ("If I pick
my helper nodes all on 18.0.0.0:*, then I move, you'll know where I
bootstrapped") -- the answer is to pick your original three helper nodes
without regard for reachability. Then the above algorithm will add some
more that are reachable for you, and if you move somewhere, it's more
likely (though not certain) that some of the originals will become useful.
Is that smart or just complex?
X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm.
It is unlikely for two users to have the same set of entry guards.
Observing a user is sufficient to learn its entry guards. So, as we move
around, entry guards make us linkable. If we want to change guards when
our location (IP? subnet?) changes, we have two bad options. We could
- Drop the old guards. But if we go back to our old location,
we'll not use our old guards. For a laptop that sometimes gets used
from work and sometimes from home, this is pretty fatal.
- Remember the old guards as associated with the old location, and use
them again if we ever go back to the old location. This would be
nasty, since it would force us to record where we've been.
[Do we do any of this now? If not, this should move into 099-misc or
098-todo. -NM]
Filename: 000-index.txt
Title: Index of Tor Proposals
Author: Nick Mathewson
Created: 26-Jan-2007
Status: Meta
Overview:
This document provides an index to Tor proposals.
This is an informational document.
Everything in this document below the line of '=' signs is automatically
generated by reindex.py; do not edit by hand.
============================================================
Proposals by number:
000 Index of Tor Proposals [META]
001 The Tor Proposal Process [META]
098 Proposals that should be written [META]
099 Miscellaneous proposals [META]
100 Tor Unreliable Datagram Extension Proposal [DEAD]
101 Voting on the Tor Directory System [CLOSED]
102 Dropping "opt" from the directory format [CLOSED]
103 Splitting identity key from regularly used signing key [CLOSED]
104 Long and Short Router Descriptors [CLOSED]
105 Version negotiation for the Tor protocol [CLOSED]
106 Checking fewer things during TLS handshakes [CLOSED]
107 Uptime Sanity Checking [CLOSED]
108 Base "Stable" Flag on Mean Time Between Failures [CLOSED]
109 No more than one server per IP address [CLOSED]
110 Avoiding infinite length circuits [ACCEPTED]
111 Prioritizing local traffic over relayed traffic [CLOSED]
112 Bring Back Pathlen Coin Weight [SUPERSEDED]
113 Simplifying directory authority administration [SUPERSEDED]
114 Distributed Storage for Tor Hidden Service Descriptors [CLOSED]
115 Two Hop Paths [DEAD]
116 Two hop paths from entry guards [DEAD]
117 IPv6 exits [ACCEPTED]
118 Advertising multiple ORPorts at once [ACCEPTED]
119 New PROTOCOLINFO command for controllers [CLOSED]
120 Shutdown descriptors when Tor servers stop [DEAD]
121 Hidden Service Authentication [FINISHED]
122 Network status entries need a new Unnamed flag [CLOSED]
123 Naming authorities automatically create bindings [CLOSED]
124 Blocking resistant TLS certificate usage [SUPERSEDED]
125 Behavior for bridge users, bridge relays, and bridge authorities [CLOSED]
126 Getting GeoIP data and publishing usage summaries [CLOSED]
127 Relaying dirport requests to Tor download site / website [DRAFT]
128 Families of private bridges [DEAD]
129 Block Insecure Protocols by Default [CLOSED]
130 Version 2 Tor connection protocol [CLOSED]
131 Help users to verify they are using Tor [NEEDS-REVISION]
132 A Tor Web Service For Verifying Correct Browser Configuration [DRAFT]
133 Incorporate Unreachable ORs into the Tor Network [DRAFT]
134 More robust consensus voting with diverse authority sets [REJECTED]
135 Simplify Configuration of Private Tor Networks [CLOSED]
136 Mass authority migration with legacy keys [CLOSED]
137 Keep controllers informed as Tor bootstraps [CLOSED]
138 Remove routers that are not Running from consensus documents [CLOSED]
139 Download consensus documents only when it will be trusted [CLOSED]
140 Provide diffs between consensuses [ACCEPTED]
141 Download server descriptors on demand [DRAFT]
142 Combine Introduction and Rendezvous Points [DEAD]
143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [OPEN]
144 Increase the diversity of circuits by detecting nodes belonging the same provider [DRAFT]
145 Separate "suitable as a guard" from "suitable as a new guard" [OPEN]
146 Add new flag to reflect long-term stability [OPEN]
147 Eliminate the need for v2 directories in generating v3 directories [ACCEPTED]
148 Stream end reasons from the client side should be uniform [CLOSED]
149 Using data from NETINFO cells [OPEN]
150 Exclude Exit Nodes from a circuit [CLOSED]
151 Improving Tor Path Selection [FINISHED]
152 Optionally allow exit from single-hop circuits [CLOSED]
153 Automatic software update protocol [SUPERSEDED]
154 Automatic Software Update Protocol [SUPERSEDED]
155 Four Improvements of Hidden Service Performance [FINISHED]
156 Tracking blocked ports on the client side [OPEN]
157 Make certificate downloads specific [ACCEPTED]
158 Clients download consensus + microdescriptors [OPEN]
159 Exit Scanning [OPEN]
160 Authorities vote for bandwidth offsets in consensus [FINISHED]
161 Computing Bandwidth Adjustments [FINISHED]
162 Publish the consensus in multiple flavors [OPEN]
163 Detecting whether a connection comes from a client [OPEN]
164 Reporting the status of server votes [OPEN]
165 Easy migration for voting authority sets [OPEN]
166 Including Network Statistics in Extra-Info Documents [ACCEPTED]
167 Vote on network parameters in consensus [CLOSED]
168 Reduce default circuit window [OPEN]
169 Eliminate TLS renegotiation for the Tor connection handshake [DRAFT]
170 Configuration options regarding circuit building [DRAFT]
172 GETINFO controller option for circuit information [ACCEPTED]
173 GETINFO Option Expansion [ACCEPTED]
174 Optimistic Data for Tor: Server Side [OPEN]
Proposals by status:
DRAFT:
127 Relaying dirport requests to Tor download site / website
132 A Tor Web Service For Verifying Correct Browser Configuration
133 Incorporate Unreachable ORs into the Tor Network
141 Download server descriptors on demand
144 Increase the diversity of circuits by detecting nodes belonging the same provider
169 Eliminate TLS renegotiation for the Tor connection handshake [for 0.2.2]
170 Configuration options regarding circuit building
NEEDS-REVISION:
131 Help users to verify they are using Tor
OPEN:
143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [for 0.2.1.x]
145 Separate "suitable as a guard" from "suitable as a new guard" [for 0.2.1.x]
146 Add new flag to reflect long-term stability [for 0.2.1.x]
149 Using data from NETINFO cells [for 0.2.1.x]
156 Tracking blocked ports on the client side [for 0.2.?]
158 Clients download consensus + microdescriptors
159 Exit Scanning
162 Publish the consensus in multiple flavors [for 0.2.2]
163 Detecting whether a connection comes from a client [for 0.2.2]
164 Reporting the status of server votes [for 0.2.2]
165 Easy migration for voting authority sets
168 Reduce default circuit window [for 0.2.2]
174 Optimistic Data for Tor: Server Side
ACCEPTED:
110 Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha]
117 IPv6 exits [for 0.2.1.x]
118 Advertising multiple ORPorts at once [for 0.2.1.x]
140 Provide diffs between consensuses [for 0.2.2.x]
147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.1.x]
157 Make certificate downloads specific [for 0.2.1.x]
166 Including Network Statistics in Extra-Info Documents [for 0.2.2]
172 GETINFO controller option for circuit information
173 GETINFO Option Expansion
META:
000 Index of Tor Proposals
001 The Tor Proposal Process
098 Proposals that should be written
099 Miscellaneous proposals
FINISHED:
121 Hidden Service Authentication [in 0.2.1.x]
151 Improving Tor Path Selection
155 Four Improvements of Hidden Service Performance [in 0.2.1.x]
160 Authorities vote for bandwidth offsets in consensus [for 0.2.2.x]
161 Computing Bandwidth Adjustments [for 0.2.2.x]
CLOSED:
101 Voting on the Tor Directory System [in 0.2.0.x]
102 Dropping "opt" from the directory format [in 0.2.0.x]
103 Splitting identity key from regularly used signing key [in 0.2.0.x]
104 Long and Short Router Descriptors [in 0.2.0.x]
105 Version negotiation for the Tor protocol [in 0.2.0.x]
106 Checking fewer things during TLS handshakes [in 0.2.0.x]
107 Uptime Sanity Checking [in 0.2.0.x]
108 Base "Stable" Flag on Mean Time Between Failures [in 0.2.0.x]
109 No more than one server per IP address [in 0.2.0.x]
111 Prioritizing local traffic over relayed traffic [in 0.2.0.x]
114 Distributed Storage for Tor Hidden Service Descriptors [in 0.2.0.x]
119 New PROTOCOLINFO command for controllers [in 0.2.0.x]
122 Network status entries need a new Unnamed flag [in 0.2.0.x]
123 Naming authorities automatically create bindings [in 0.2.0.x]
125 Behavior for bridge users, bridge relays, and bridge authorities [in 0.2.0.x]
126 Getting GeoIP data and publishing usage summaries [in 0.2.0.x]
129 Block Insecure Protocols by Default [in 0.2.0.x]
130 Version 2 Tor connection protocol [in 0.2.0.x]
135 Simplify Configuration of Private Tor Networks [for 0.2.1.x] [in 0.2.1.2-alpha]
136 Mass authority migration with legacy keys [in 0.2.0.x]
137 Keep controllers informed as Tor bootstraps [in 0.2.1.x]
138 Remove routers that are not Running from consensus documents [in 0.2.1.2-alpha]
139 Download consensus documents only when it will be trusted [in 0.2.1.x]
148 Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha]
150 Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha]
152 Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha]
167 Vote on network parameters in consensus [in 0.2.2]
SUPERSEDED:
112 Bring Back Pathlen Coin Weight
113 Simplifying directory authority administration
124 Blocking resistant TLS certificate usage
153 Automatic software update protocol
154 Automatic Software Update Protocol
DEAD:
100 Tor Unreliable Datagram Extension Proposal
115 Two Hop Paths
116 Two hop paths from entry guards
120 Shutdown descriptors when Tor servers stop
128 Families of private bridges
142 Combine Introduction and Rendezvous Points
REJECTED:
134 More robust consensus voting with diverse authority sets
Filename: 001-process.txt
Title: The Tor Proposal Process
Author: Nick Mathewson
Created: 30-Jan-2007
Status: Meta
Overview:
This document describes how to change the Tor specifications, how Tor
proposals work, and the relationship between Tor proposals and the
specifications.
This is an informational document.
Motivation:
Previously, our process for updating the Tor specifications was maximally
informal: we'd patch the specification (sometimes forking first, and
sometimes not), then discuss the patches, reach consensus, and implement
the changes.
This had a few problems.
First, even at its most efficient, the old process would often have the
spec out of sync with the code. The worst cases were those where
implementation was deferred: the spec and code could stay out of sync for
versions at a time.
Second, it was hard to participate in discussion, since you had to know
which portions of the spec were a proposal, and which were already
implemented.
Third, it littered the specifications with too many inline comments.
[This was a real problem -NM]
[Especially when it went to multiple levels! -NM]
[XXXX especially when they weren't signed and talked about that
thing that you can't remember after a year]
How to change the specs now:
First, somebody writes a proposal document. It should describe the change
that should be made in detail, and give some idea of how to implement it.
Once it's fleshed out enough, it becomes a proposal.
Like an RFC, every proposal gets a number. Unlike RFCs, proposals can
change over time and keep the same number, until they are finally
accepted or rejected. The history for each proposal
will be stored in the Tor repository.
Once a proposal is in the repository, we should discuss and improve it
until we've reached consensus that it's a good idea, and that it's
detailed enough to implement. When this happens, we implement the
proposal and incorporate it into the specifications. Thus, the specs
remain the canonical documentation for the Tor protocol: no proposal is
ever the canonical documentation for an implemented feature.
(This process is pretty similar to the Python Enhancement Process, with
the major exception that Tor proposals get re-integrated into the specs
after implementation, whereas PEPs _become_ the new spec.)
{It's still okay to make small changes directly to the spec if the code
can be
written more or less immediately, or cosmetic changes if no code change is
required. This document reflects the current developers' _intent_, not
a permanent promise to always use this process in the future: we reserve
the right to get really excited and run off and implement something in a
caffeine-or-m&m-fueled all-night hacking session.}
How new proposals get added:
Once an idea has been proposed on the development list, a properly formatted
(see below) draft exists, and rough consensus within the active development
community exists that this idea warrants consideration, the proposal editor
will officially add the proposal.
To get your proposal in, send it to or-dev.
The current proposal editor is Nick Mathewson.
What should go in a proposal:
Every proposal should have a header containing these fields:
Filename, Title, Author, Created, Status.
These fields are optional but recommended:
Target, Implemented-In.
The Target field should describe which version the proposal is hoped to be
implemented in (if it's Open or Accepted). The Implemented-In field
should describe which version the proposal was implemented in (if it's
Finished or Closed).
The body of the proposal should start with an Overview section explaining
what the proposal's about, what it does, and about what state it's in.
After the Overview, the proposal becomes more free-form. Depending on its
length and complexity, the proposal can break into sections as
appropriate, or follow a short discursive format. Every proposal should
contain at least the following information before it is "ACCEPTED",
though the information does not need to be in sections with these names.
Motivation: What problem is the proposal trying to solve? Why does
this problem matter? If several approaches are possible, why take this
one?
Design: A high-level view of what the new or modified features are, how
the new or modified features work, how they interoperate with each
other, and how they interact with the rest of Tor. This is the main
body of the proposal. Some proposals will start out with only a
Motivation and a Design, and wait for a specification until the
Design seems approximately right.
Security implications: What effects the proposed changes might have on
anonymity, how well understood these effects are, and so on.
Specification: A detailed description of what needs to be added to the
Tor specifications in order to implement the proposal. This should
be in about as much detail as the specifications will eventually
contain: it should be possible for independent programmers to write
mutually compatible implementations of the proposal based on its
specifications.
Compatibility: Will versions of Tor that follow the proposal be
compatible with versions that do not? If so, how will compatibility
be achieved? Generally, we try to not drop compatibility if at
all possible; we haven't made a "flag day" change since May 2004,
and we don't want to do another one.
Implementation: If the proposal will be tricky to implement in Tor's
current architecture, the document can contain some discussion of how
to go about making it work. Actual patches should go on public git
branches, or be uploaded to trac.
Performance and scalability notes: If the feature will have an effect
on performance (in RAM, CPU, bandwidth) or scalability, there should
be some analysis on how significant this effect will be, so that we
can avoid really expensive performance regressions, and so we can
avoid wasting time on insignificant gains.
Proposal status:
Open: A proposal under discussion.
Accepted: The proposal is complete, and we intend to implement it.
After this point, substantive changes to the proposal should be
avoided, and regarded as a sign of the process having failed
somewhere.
Finished: The proposal has been accepted and implemented. After this
point, the proposal should not be changed.
Closed: The proposal has been accepted, implemented, and merged into the
main specification documents. The proposal should not be changed after
this point.
Rejected: We're not going to implement the feature as described here,
though we might do some other version. See comments in the document
for details. The proposal should not be changed after this point;
to bring up some other version of the idea, write a new proposal.
Draft: This isn't a complete proposal yet; there are definite missing
pieces. Please don't add any new proposals with this status; put them
in the "ideas" sub-directory instead.
Needs-Revision: The idea for the proposal is a good one, but the proposal
as it stands has serious problems that keep it from being accepted.
See comments in the document for details.
Dead: The proposal hasn't been touched in a long time, and it doesn't look
like anybody is going to complete it soon. It can become "Open" again
if it gets a new proponent.
Needs-Research: There are research problems that need to be solved before
it's clear whether the proposal is a good idea.
Meta: This is not a proposal, but a document about proposals.
The editor maintains the correct status of proposals, based on rough
consensus and his own discretion.
Proposal numbering:
Numbers 000-099 are reserved for special and meta-proposals. 100 and up
are used for actual proposals. Numbers aren't recycled.

Filename: 098-todo.txt
Title: Proposals that should be written
Author: Nick Mathewson, Roger Dingledine
Created: 26-Jan-2007
Status: Meta
Overview:
This document lists ideas that various people have had for improving the
Tor protocol. These should be implemented and specified if they're
trivial, or written up as proposals if they're not.
This is an active document, to be edited as proposals are written and as
we come up with new ideas for proposals. We should take stuff out as it
seems irrelevant.
For some later protocol version.
- It would be great to get smarter about identity and linkability.
It's not crazy to say, "Never use the same circuit for my SSH
connections and my web browsing." How far can/should we take this?
See ideas/xxx-separate-streams-by-port.txt for a start.
- Fix onionskin handshake scheme to be more mainstream, less nutty.
Can we just do
E(HMAC(g^x), g^x) rather than just E(g^x) ?
No, that has the same flaws as before. We should send
E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy).
Better ask Ian; probably Stephen too.
- Length on CREATE and friends
- Versioning on circuits and create cells, so we have a clear path
to improve the circuit protocol.
- SHA1 is showing its age. We should get a design for upgrading our
hash once the AHS competition is done, or even sooner.
- Not being able to upgrade ciphersuites or increase key lengths is
lame.
- Paul has some ideas about circuit creation; read his PET paper once it's
out.
Any time:
- Some ideas for revising the directory protocol:
- Extend the "r" line in network-status to give a set of buckets (say,
comma-separated) for that router.
- Buckets are deterministic based on IP address.
- Then clients can choose a bucket (or set of buckets) to
download and use.
- We need a way for the authorities to declare that nodes are in a
family. Also, it kinda sucks that family declarations use O(N^2) space
in the descriptors.
- REASON_CONNECTFAILED should include an IP.
- Spec should incorporate some prose from tor-design to be more readable.
- Spec when we should rotate which keys
- Spec how to publish descriptors less often
- Describe pros and cons of non-deterministic path lengths
- We should use a variable-length path length by default -- 3 +/- some
distribution. Need to think harder about allowing values less than 3,
and there's a tradeoff between having a wide variance and performance.
- Clients currently use certs during TLS. Is this wise? It does make it
easier for servers to tell which NATted client is which. We could use a
separate set of certs for each guard, I suppose, but generating so many
certs could get expensive. Omitting them entirely would make OP->OR
easier to tell from OR->OR.
Things that should change...
B.1. ... but which will require backward-incompatible change
- Circuit IDs should be longer.
. IPv6 everywhere.
- Maybe, keys should be longer.
- Maybe, key-length should be adjustable. How to do this without
making anonymity suck?
- Drop backward compatibility.
- We should use a 128-bit subgroup of our DH prime.
- Handshake should use HMAC.
- Multiple cell lengths.
- Ability to split circuits across paths (If this is useful.)
- SENDME windows should be dynamic.
- Directory
- Stop ever mentioning socks ports
B.2. ... and that will require no changes
- Advertised outbound IP?
- Migrate streams across circuits.
- Fix bug 469 by limiting the number of simultaneous connections per IP.
B.3. ... and that we have no idea how to do.
- UDP (as transport)
- UDP (as content)
- Use a better AES mode that has built-in integrity checking,
doesn't grow with the number of hops, is not patented, and
is implemented and maintained by smart people.
Let onion keys be not just RSA but maybe DH too, for Paul's reply onion
design.

Filename: 099-misc.txt
Title: Miscellaneous proposals
Author: Various
Created: 26-Jan-2007
Status: Meta
Overview:
This document is for small proposal ideas that are about one paragraph in
length. From here, ideas can be rejected outright, expanded into full
proposals, or specified and implemented as-is.
Proposals
1. Directory compression.
Gzip would be easier to work with than zlib; bzip2 would result in smaller
data lengths. [Concretely, we're looking at about 10-15% space savings at
the expense of 3-5x longer compression time for using bzip2.] Doing
on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib.
Pre-compressing status documents in multiple formats would force us to use
more memory to hold them.
Status: Open
-- Nick Mathewson
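The gzip-versus-bzip2 trade-off described above can be checked with the standard library (a rough sketch; the sample document here is an invented stand-in for a real network-status, so the exact ratios will differ):

```python
import bz2
import zlib

# Hypothetical stand-in for a network-status document:
# repetitive ASCII text, like many similar router entries.
doc = b"r relay0 192.0.2.1 9001 0 9030\ns Fast Running Stable Valid\n" * 500

gz = zlib.compress(doc, 9)  # DEFLATE, as used by gzip
bz = bz2.compress(doc, 9)   # bzip2

# Both shrink the document substantially; bzip2 tends to win on
# large repetitive inputs at the cost of slower compression.
ratio_gz = len(gz) / len(doc)
ratio_bz = len(bz) / len(doc)
assert len(gz) < len(doc)
assert len(bz) < len(doc)
```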

Filename: 100-tor-spec-udp.txt
Title: Tor Unreliable Datagram Extension Proposal
Author: Marc Liberatore
Created: 23 Feb 2006
Status: Dead
Overview:
This is a modified version of the Tor specification written by Marc
Liberatore to add UDP support to Tor. For each TLS link, it adds a
corresponding DTLS link: control messages and TCP data flow over TLS, and
UDP data flows over DTLS.
This proposal is not likely to be accepted as-is; see comments at the end
of the document.
Contents
0. Introduction
Tor is a distributed overlay network designed to anonymize low-latency
TCP-based applications. The current tor specification supports only
TCP-based traffic. This limitation prevents the use of tor to anonymize
other important applications, notably voice over IP software. This document
is a proposal to extend the tor specification to support UDP traffic.
The basic design philosophy of this extension is to add support for
tunneling unreliable datagrams through tor with as few modifications to the
protocol as possible. As currently specified, tor cannot directly support
such tunneling, as connections between nodes are built using transport layer
security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable
to the operation of most UDP-based application level protocols.
Thus, we propose the addition of links between nodes using datagram
transport layer security (DTLS). These links allow packets to traverse a
route through tor quickly, but their unreliable nature requires minor
changes to the tor protocol. This proposal outlines the necessary
additions and changes to the tor specification to support UDP traffic.
We note that a separate set of DTLS links between nodes creates a second
overlay, distinct from that composed of TLS links. This separation and
resulting decrease in each anonymity set's size will make certain attacks
easier. However, it is our belief that VoIP support in tor will
dramatically increase its appeal, and correspondingly, the size of its user
base, number of deployed nodes, and total traffic relayed. These increases
should help offset the loss of anonymity that two distinct networks imply.
1. Overview of Tor-UDP and its complications
As described above, this proposal extends the Tor specification to support
UDP with as few changes as possible. Tor's overlay network is managed
through TLS based connections; we will re-use this control plane to set up
and tear down circuits that relay UDP traffic. These circuits will be built atop
DTLS, in a fashion analogous to how Tor currently sends TCP traffic over
TLS.
The unreliability of DTLS circuits creates problems for Tor at two levels:
1. Tor's encryption of the relay layer does not allow independent
decryption of individual records. If record N is not received, then
record N+1 will not decrypt correctly, as the counter for AES/CTR is
maintained implicitly.
2. Tor's end-to-end integrity checking works under the assumption that
all RELAY cells are delivered. This assumption is invalid when cells
are sent over DTLS.
The fix for the first problem is straightforward: add an explicit sequence
number to each cell. To fix the second problem, we introduce a
system of nonces and hashes to RELAY packets.
In the following sections, we mirror the layout of the Tor Protocol
Specification, presenting the necessary modifications to the Tor protocol as
a series of deltas.
2. Connections
Tor-UDP uses DTLS for encryption of some links. All DTLS links must have
corresponding TLS links, as all control messages are sent over TLS. All
implementations MUST support the DTLS ciphersuite "[TODO]".
DTLS connections are formed using the same protocol as TLS connections.
This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell,
as detailed in section 4.6.
Once a paired TLS/DTLS connection is established, the two sides send cells
to one another. All but two types of cells are sent over TLS links. RELAY
cells containing the commands RELAY_UDP_DATA and RELAY_UDP_DROP, specified
below, are sent over DTLS links. [Should all cells still be 512 bytes long?
Perhaps upon completion of a preliminary implementation, we should do a
performance evaluation for some class of UDP traffic, such as VoIP. - ML]
Cells may be sent embedded in TLS or DTLS records of any size or divided
across such records. The framing of these records MUST NOT leak any more
information than the above differentiation on the basis of cell type. [I am
uncomfortable with this leakage, but don't see any simple, elegant way
around it. -ML]
As with TLS connections, DTLS connections are not permanent.
3. Cell format
Each cell contains the following fields:
CircID [2 bytes]
Command [1 byte]
Sequence Number [2 bytes]
Payload (padded with 0 bytes) [507 bytes]
[Total size: 512 bytes]
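The layout above can be sketched as a pack/unpack pair (illustrative only; network byte order is an assumption, as the proposal does not state one):

```python
import struct

CELL_LEN = 512
PAYLOAD_LEN = 507  # 512 - 2 (CircID) - 1 (Command) - 2 (Sequence Number)

def pack_cell(circ_id: int, command: int, seq: int, payload: bytes) -> bytes:
    """Pack a Tor-UDP cell per the field list above, zero-padding the payload."""
    assert len(payload) <= PAYLOAD_LEN
    return struct.pack(">HBH", circ_id, command, seq) + payload.ljust(PAYLOAD_LEN, b"\x00")

def unpack_cell(cell: bytes):
    """Split a 512-byte cell back into (CircID, Command, SeqNum, payload)."""
    assert len(cell) == CELL_LEN
    circ_id, command, seq = struct.unpack(">HBH", cell[:5])
    return circ_id, command, seq, cell[5:]

cell = pack_cell(0x0102, 3, 40000, b"hello")
assert len(cell) == CELL_LEN
assert unpack_cell(cell)[:3] == (0x0102, 3, 40000)
```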
The 'Command' field holds one of the following values:
0 -- PADDING (Padding) (See Sec 6.2)
1 -- CREATE (Create a circuit) (See Sec 4)
2 -- CREATED (Acknowledge create) (See Sec 4)
3 -- RELAY (End-to-end data) (See Sec 5)
4 -- DESTROY (Stop using a circuit) (See Sec 4)
5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4)
6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4)
7 -- CREATE_UDP (Create a UDP circuit) (See Sec 4)
8 -- CREATED_UDP (Acknowledge UDP create) (See Sec 4)
9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4)
10 -- CREATED_FAST_UDP(UDP circuit created, no PK) (See Sec 4)
The sequence number allows for AES/CTR decryption of RELAY cells
independently of one another; this functionality is required to support
cells sent over DTLS. The sequence number is described in more detail in
section 4.5.
[Should the sequence number only appear in RELAY packets? The overhead is
small, and I'm hesitant to force more code paths on the implementor. -ML]
[There's already a separate relay header that has other material in it,
so it wouldn't be the end of the world to move it there if it's
appropriate. -RD]
[Having separate commands for UDP circuits seems necessary, unless we can
assume a flag day event for a large number of tor nodes. -ML]
4. Circuit management
4.2. Setting circuit keys
Keys are set up for UDP circuits in the same fashion as for TCP circuits.
Each UDP circuit shares keys with its corresponding TCP circuit.
[If the keys are used for both TCP and UDP connections, how does it
work to mix sequence-number-less cells with sequence-numbered cells --
how do you know you have the encryption order right? -RD]
4.3. Creating circuits
UDP circuits are created as TCP circuits, using the *_UDP cells as
appropriate.
4.4. Tearing down circuits
UDP circuits are torn down as TCP circuits, using the *_UDP cells as
appropriate.
4.5. Routing relay cells
When an OR receives a RELAY cell, it checks the cell's circID and
determines whether it has a corresponding circuit along that
connection. If not, the OR drops the RELAY cell.
Otherwise, if the OR is not at the OP edge of the circuit (that is,
either an 'exit node' or a non-edge node), it de/encrypts the payload
with AES/CTR, as follows:
'Forward' relay cell (same direction as CREATE):
Use Kf as key; decrypt, using sequence number to synchronize
ciphertext and keystream.
'Back' relay cell (opposite direction from CREATE):
Use Kb as key; encrypt, using sequence number to synchronize
ciphertext and keystream.
Note that in counter mode, decrypt and encrypt are the same operation.
[Since the sequence number is only 2 bytes, what do you do when it
rolls over? -RD]
Each stream encrypted by a Kf or Kb has a corresponding unique state,
captured by a sequence number; the originator of each such stream chooses
the initial sequence number randomly, and increments it only with RELAY
cells. [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so
there's no need for counting bytes directly. Right? - ML]
[I believe this is true. You'll find out for sure when you try to
build it. ;) -RD]
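The keystream synchronization above can be sketched with a toy counter-mode cipher (SHA-256 stands in for AES here, purely for illustration; the point is that a cell's keystream position depends only on its sequence number, not on earlier cells having arrived):

```python
import hashlib

def keystream(key: bytes, seq: int, length: int) -> bytes:
    """Toy CTR-style keystream: PRF(key, seq, block counter) per block.
    SHA-256 is a stand-in for AES; do not use this for real crypto."""
    out = b""
    block = 0
    while len(out) < length:
        out += hashlib.sha256(
            key + seq.to_bytes(2, "big") + block.to_bytes(4, "big")
        ).digest()
        block += 1
    return out[:length]

def crypt(key: bytes, seq: int, data: bytes) -> bytes:
    # In counter mode, encrypt and decrypt are the same XOR operation.
    ks = keystream(key, seq, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

kf = b"\x01" * 16
cell5 = crypt(kf, 5, b"payload of cell five")
# Cell 5 decrypts on its own even if cells 0-4 were lost in transit.
assert crypt(kf, 5, cell5) == b"payload of cell five"
```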
The OR then decides whether it recognizes the relay cell, by
inspecting the payload as described in section 5.1 below. If the OR
recognizes the cell, it processes the contents of the relay cell.
Otherwise, it passes the decrypted relay cell along the circuit if
the circuit continues. If the OR at the end of the circuit
encounters an unrecognized relay cell, an error has occurred: the OR
sends a DESTROY cell to tear down the circuit.
When a relay cell arrives at an OP, the OP decrypts the payload
with AES/CTR as follows:
OP receives data cell:
For I=N...1,
Decrypt with Kb_I, using the sequence number as above. If the
payload is recognized (see section 5.1), then stop and process
the payload.
For more information, see section 5 below.
4.6. CREATE_UDP and CREATED_UDP cells
Users set up UDP circuits incrementally. The procedure is similar to that
for TCP circuits, as described in section 4.1. In addition to the TLS
connection to the first node, the OP also attempts to open a DTLS
connection. If this succeeds, the OP sends a CREATE_UDP cell, with a
payload in the same format as a CREATE cell. To extend a UDP circuit past
the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which
instructs the last node in the circuit to send a CREATE_UDP cell to extend
the circuit.
The relay payload for an EXTEND_UDP relay cell consists of:
Address [4 bytes]
TCP port [2 bytes]
UDP port [2 bytes]
Onion skin [186 bytes]
Identity fingerprint [20 bytes]
The address field and ports denote the IPV4 address and ports of the next OR
in the circuit.
The payload for a CREATED_UDP cell or the relay payload for a
RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or
RELAY_EXTENDED cell. Both circuits are established using the same key.
Note that the existence of a UDP circuit implies the
existence of a corresponding TCP circuit, sharing keys, sequence numbers,
and any other relevant state.
4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells
As above, the OP must successfully connect using DTLS before attempting to
send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in
section 4.1.1.
5. Application connections and stream management
5.1. Relay cells
Within a circuit, the OP and the exit node use the contents of RELAY cells
to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets
across circuits. End-to-end commands and UDP packets can be initiated by
either edge; streams are initiated by the OP.
The payload of each unencrypted RELAY cell consists of:
Relay command [1 byte]
'Recognized' [2 bytes]
StreamID [2 bytes]
Digest [4 bytes]
Length [2 bytes]
Data [498 bytes]
The relay commands are:
1 -- RELAY_BEGIN [forward]
2 -- RELAY_DATA [forward or backward]
3 -- RELAY_END [forward or backward]
4 -- RELAY_CONNECTED [backward]
5 -- RELAY_SENDME [forward or backward]
6 -- RELAY_EXTEND [forward]
7 -- RELAY_EXTENDED [backward]
8 -- RELAY_TRUNCATE [forward]
9 -- RELAY_TRUNCATED [backward]
10 -- RELAY_DROP [forward or backward]
11 -- RELAY_RESOLVE [forward]
12 -- RELAY_RESOLVED [backward]
13 -- RELAY_BEGIN_UDP [forward]
14 -- RELAY_DATA_UDP [forward or backward]
15 -- RELAY_EXTEND_UDP [forward]
16 -- RELAY_EXTENDED_UDP [backward]
17 -- RELAY_DROP_UDP [forward or backward]
Commands labelled as "forward" must only be sent by the originator
of the circuit. Commands labelled as "backward" must only be sent by
other nodes in the circuit back to the originator. Commands marked
as either can be sent either by the originator or other nodes.
The 'recognized' field in any unencrypted relay payload is always set to
zero.
The 'digest' field can have two meanings. For all cells sent over TLS
connections (that is, all commands and all non-UDP RELAY data), it is
computed as the first four bytes of the running SHA-1 digest of all the
bytes that have been sent reliably and have been destined for this hop of
the circuit or originated from this hop of the circuit, seeded from Df or Db
respectively (obtained in section 4.2 above), and including this RELAY
cell's entire payload (taken with the digest field set to zero). Cells sent
over DTLS connections do not affect this running digest. Each cell sent
over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field
set to the SHA-1 digest of the current RELAY cell's entire payload, with the
digest field set to zero. Coupled with a randomly-chosen streamID, this
provides per-cell integrity checking on UDP cells.
[If you drop malformed UDP relay cells but don't close the circuit,
then this 8 bytes of digest is not as strong as what we get in the
TCP-circuit side. Is this a problem? -RD]
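The per-cell digest for DTLS-carried RELAY cells can be sketched as follows (offsets follow the relay payload field list above: command, 'recognized', and streamID precede the 4-byte digest; the 509-byte payload length is taken from that list):

```python
import hashlib

DIGEST_OFF = 5   # command (1) + recognized (2) + streamID (2)
DIGEST_LEN = 4
PAYLOAD_LEN = 509

def udp_cell_digest(payload: bytes) -> bytes:
    """SHA-1 of the entire relay payload, taken with the digest field zeroed."""
    zeroed = payload[:DIGEST_OFF] + b"\x00" * DIGEST_LEN + payload[DIGEST_OFF + DIGEST_LEN:]
    return hashlib.sha1(zeroed).digest()[:DIGEST_LEN]

def seal(payload: bytes) -> bytes:
    """Fill in the digest field of a payload built with that field zeroed."""
    d = udp_cell_digest(payload)
    return payload[:DIGEST_OFF] + d + payload[DIGEST_OFF + DIGEST_LEN:]

def verify(payload: bytes) -> bool:
    return payload[DIGEST_OFF:DIGEST_OFF + DIGEST_LEN] == udp_cell_digest(payload)

raw = bytes(PAYLOAD_LEN)          # payload with digest field still zero
sealed = seal(raw)
assert verify(sealed)
assert not verify(sealed[:20] + b"\xff" + sealed[21:])  # a flipped byte fails
```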
When the 'recognized' field of a RELAY cell is zero, and the digest
is correct, the cell is considered "recognized" for the purposes of
decryption (see section 4.5 above).
(The digest does not include any bytes from relay cells that do
not start or end at this hop of the circuit. That is, it does not
include forwarded data. Therefore if 'recognized' is zero but the
digest does not match, the running digest at that node should
not be updated, and the cell should be forwarded on.)
All RELAY cells pertaining to the same tunneled TCP stream have the
same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY
cells that affect the entire circuit rather than a particular
stream use a StreamID of zero.
All RELAY cells pertaining to the same UDP tunnel have the same streamID.
This streamID is chosen randomly by the OP, but cannot be zero.
The 'Length' field of a relay cell contains the number of bytes in
the relay payload which contain real payload data. The remainder of
the payload is padded with NUL bytes.
If the RELAY cell is recognized but the relay command is not
understood, the cell must be dropped and ignored. Its contents
still count with respect to the digests, though. [Before
0.1.1.10, Tor closed circuits when it received an unknown relay
command. Perhaps this will be more forward-compatible. -RD]
5.2.1. Opening UDP tunnels and transferring data
To open a new anonymized UDP connection, the OP chooses an open
circuit to an exit that may be able to connect to the destination
address, selects a random streamID not yet used on that circuit,
and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address
and port of the destination host. The payload format is:
ADDRESS | ':' | PORT | [00]
where ADDRESS can be a DNS hostname, or an IPv4 address in
dotted-quad format, or an IPv6 address surrounded by square brackets;
and where PORT is encoded in decimal.
[What is the [00] for? -NM]
[It's so the payload is easy to parse out with string funcs -RD]
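A sketch of building and parsing that NUL-terminated payload with plain string functions, as the comment suggests (function names here are invented for illustration):

```python
def pack_begin_udp(address: str, port: int) -> bytes:
    """ADDRESS ':' PORT NUL -- NUL-terminated so the exit can parse it
    with ordinary string functions."""
    return f"{address}:{port}".encode("ascii") + b"\x00"

def parse_begin_udp(payload: bytes):
    text = payload[: payload.index(b"\x00")].decode("ascii")
    # rsplit on the last ':' so bracketed IPv6 literals parse correctly.
    address, port = text.rsplit(":", 1)
    return address.strip("[]"), int(port)

assert parse_begin_udp(pack_begin_udp("www.example.com", 5060)) == ("www.example.com", 5060)
assert parse_begin_udp(pack_begin_udp("[::1]", 53)) == ("::1", 53)
```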
Upon receiving this cell, the exit node resolves the address as necessary.
If the address cannot be resolved, the exit node replies with a RELAY_END
cell. (See 5.4 below.) Otherwise, the exit node replies with a
RELAY_CONNECTED cell, whose payload is in one of the following formats:
The IPv4 address to which the connection was made [4 octets]
A number of seconds (TTL) for which the address may be cached [4 octets]
or
Four zero-valued octets [4 octets]
An address type (6) [1 octet]
The IPv6 address to which the connection was made [16 octets]
A number of seconds (TTL) for which the address may be cached [4 octets]
[XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
field. No version of Tor currently generates the IPv6 format.]
The OP waits for a RELAY_CONNECTED cell before sending any data.
Once a connection has been established, the OP and exit node
package UDP data in RELAY_DATA_UDP cells, and upon receiving such
cells, echo their contents to the corresponding socket.
RELAY_DATA_UDP cells sent to unrecognized streams are dropped.
Relay RELAY_DROP_UDP cells are long-range dummies; upon receiving such
a cell, the OR or OP must drop it.
5.3. Closing streams
UDP tunnels are closed in a fashion corresponding to TCP connections.
6. Flow Control
UDP streams are not subject to flow control.
7.2. Router descriptor format.
The items' formats are as follows:
"router" nickname address ORPort SocksPort DirPort UDPPort
Indicates the beginning of a router descriptor. "address" must be
an IPv4 address in dotted-quad format. The four numbers that
follow indicate the ports at which this OR exposes
functionality. ORPort is a port at which this OR accepts TLS
connections for the main OR protocol; SocksPort is deprecated and
should always be 0; DirPort is the port at which this OR accepts
directory-related HTTP connections; and UDPPort is a port at which
this OR accepts DTLS connections for UDP data. If any port is not
supported, the value 0 is given instead of a port number.
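Parsing the extended "router" item is a simple field split (a sketch; the relay name and address below are invented examples):

```python
def parse_router_line(line: str):
    """Split the "router" item into its named fields. Ports are decimal,
    with 0 meaning the corresponding functionality is not supported."""
    kw, nickname, address, *ports = line.split()
    assert kw == "router" and len(ports) == 4
    or_port, socks_port, dir_port, udp_port = (int(p) for p in ports)
    return nickname, address, or_port, socks_port, dir_port, udp_port

fields = parse_router_line("router examplerelay 192.0.2.9 9001 0 9030 9050")
assert fields == ("examplerelay", "192.0.2.9", 9001, 0, 9030, 9050)
```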
Other sections:
What changes need to happen to each node's exit policy to support this? -RD
Switching to UDP means managing the queues of incoming packets better,
so we don't miss packets. How does this interact with doing large public
key operations (handshakes) in the same thread? -RD
========================================================================
COMMENTS
========================================================================
[16 May 2006]
I don't favor this approach; it makes packet traffic partitioned from
stream traffic end-to-end. The architecture I'd like to see is:
A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on
TCP/TLS for firewall penetration or something. (This also gives us an
upgrade path for routing through legacy servers.)
B Stream traffic is handled with end-to-end per-stream acks/naks and
retries. On failure, the data is retransmitted in a new RELAY_DATA cell;
a cell isn't retransmitted.
We'll need to do A anyway, to fix our behavior on packet-loss. Once we've
done so, B is more or less inevitable, and we can support end-to-end UDP
traffic "for free".
(Also, there are some details that this draft spec doesn't address. For
example, what happens when a UDP packet doesn't fit in a single cell?)
-NM

Filename: 101-dir-voting.txt
Title: Voting on the Tor Directory System
Author: Nick Mathewson
Created: Nov 2006
Status: Closed
Implemented-In: 0.2.0.x
Overview
This document describes a consensus voting scheme for Tor directories;
instead of publishing different network statuses, directories would vote on
and publish a single "consensus" network status document.
Proposal:
0. Scope and preliminaries
This document describes a consensus voting scheme for Tor directories.
Once it's accepted, it should be merged with dir-spec.txt. Some
preliminaries for authority and caching support should be done during
the 0.1.2.x series; the main deployment should come during the 0.2.0.x
series.
0.1. Goals and motivation: voting.
The current directory system relies on clients downloading separate
network status statements from the caches signed by each directory.
Clients download a new statement every 30 minutes or so, choosing to
replace the oldest statement they currently have.
This creates a partitioning problem: different clients have different
"most recent" networkstatus sources, and different versions of each
(since authorities change their statements often).
It also creates a scaling problem: most of the downloaded networkstatus
are probably quite similar, and the redundancy grows as we add more
authorities.
So if we have clients only download a single multiply signed consensus
network status statement, we can:
- Save bandwidth.
- Reduce client partitioning
- Reduce client-side and cache-side storage
- Simplify client-side voting code (by moving voting away from the
client)
We should try to do this without:
- Assuming that client-side or cache-side clocks are more correct
than we assume now.
- Assuming that authority clocks are perfectly correct.
- Degrading badly if a few authorities die or are offline for a bit.
We do not have to perform well if:
- No clique of more than half the authorities can agree about who
the authorities are.
1. The idea.
Instead of publishing a network status whenever something changes,
each authority instead publishes a fresh network status only once per
"period" (say, 60 minutes). Authorities either upload this network
status (or "vote") to every other authority, or download every other
authority's "vote" (see 3.1 below for discussion on push vs pull).
After an authority has (or has become convinced that it won't be able to
get) every other authority's vote, it deterministically computes a
consensus networkstatus, and signs it. Authorities download one another's
signatures (or have them uploaded; see 3.1), and form a multiply signed
consensus. This multiply-signed consensus is what caches cache and what
clients download.
If an authority is down, authorities vote based on what they *can*
download/get uploaded.
If an authority is "a little" down and only some authorities can reach
it, authorities try to get its info from other authorities.
If an authority computes the vote wrong, its signature isn't included on
the consensus.
Clients use a consensus if it is "trusted": signed by more than half the
authorities they recognize. If clients can't find any such consensus,
they use the most recent trusted consensus they have. If they don't
have any trusted consensus, they warn the user and refuse to operate
(and if DirServers is not the default, beg the user to adapt the list
of authorities).
2. Details.
2.0. Versioning
All documents generated here have version "3" given in their
network-status-version entries.
2.1. Vote specifications
Votes in v3 are similar to v2 network status documents. We add these
fields to the preamble:
"vote-status" -- the word "vote".
"valid-until" -- the time when this authority expects to publish its
next vote.
"known-flags" -- a space-separated list of flags that will sometimes
be included on "s" lines later in the vote.
"dir-source" -- as before, except the "hostname" part MUST be the
authority's nickname, which MUST be unique among authorities, and
MUST match the nickname in the "directory-signature" entry.
Authorities SHOULD cache their most recently generated votes so they
can persist them across restarts. Authorities SHOULD NOT generate
another document until valid-until has passed.
Router entries in the vote MUST be sorted in ascending order by router
identity digest. The flags in "s" lines MUST appear in alphabetical
order.
Votes SHOULD be synchronized to half-hour publication intervals (one
hour? XXX say more; be more precise.)
XXXX some way to request older networkstatus docs?
2.2. Consensus directory specifications
Consensuses are like v3 votes, except for the following fields:
"vote-status" -- the word "consensus".
"published" is the latest of all the published times on the votes.
"valid-until" is the earliest of all the valid-until times on the
votes.
"dir-source" and "fingerprint" and "dir-signing-key" and "contact"
are included for each authority that contributed to the vote.
"vote-digest" for each authority that contributed to the vote,
calculated as for the digest in the signature on the vote. [XXX
re-English this sentence]
"client-versions" and "server-versions" are sorted in ascending
order based on version-spec.txt.
"dir-options" and "known-flags" are not included.
[XXX really? why not list the ones that are used in the consensus?
For example, right now BadExit is in use, but no servers would be
labelled BadExit, and it's still worth knowing that it was considered
by the authorities. -RD]
The fields MUST occur in the following order:
"network-status-version"
"vote-status"
"published"
"valid-until"
For each authority, sorted in ascending order of nickname, case-
insensitively:
"dir-source", "fingerprint", "contact", "dir-signing-key",
"vote-digest".
"client-versions"
"server-versions"
The signatures at the end of the document appear as multiple instances
of directory-signature, sorted in ascending order by nickname,
case-insensitively.
A router entry should be included in the result if it is included by more
than half of the authorities (total authorities, not just those whose votes
we have). A router entry has a flag set if it is included by more than
half of the authorities who care about that flag. [XXXX this creates an
incentive for attackers to DOS authorities whose votes they don't like.
Can we remember what flags people set the last time we saw them? -NM]
[Which 'we' are we talking here? The end-users never learn which
authority sets which flags. So you're thinking the authorities
should record the last vote they saw from each authority and if it's
within a week or so, count all the flags that it advertised as 'no'
votes? Plausible. -RD]
The signature hash covers from the "network-status-version" line through
the characters "directory-signature" in the first "directory-signature"
line.
Consensus directories SHOULD be rejected if they are not signed by more
than half of the known authorities.
2.2.1. Detached signatures
Assuming full connectivity, every authority should compute and sign the
same consensus directory in each period. Therefore, it isn't necessary to
download the consensus computed by each authority; instead, the authorities
only push/fetch each others' signatures. A "detached signature" document
contains a single "consensus-digest" entry and one or more
directory-signature entries. [XXXX specify more.]
2.3. URLs and timelines
2.3.1. URLs and timeline used for agreement
An authority SHOULD publish its vote immediately at the start of each voting
period. It does this by making it available at
http://<hostname>/tor/status-vote/current/authority.z
and sending it in an HTTP POST request to each other authority at the URL
http://<hostname>/tor/post/vote
If, N minutes after the voting period has begun, an authority does not have
a current statement from another authority, the first authority retrieves
the other's statement.
Once an authority has a vote from another authority, it makes it available
at
http://<hostname>/tor/status-vote/current/<fp>.z
where <fp> is the fingerprint of the other authority's identity key.
The consensus network status, along with as many signatures as the server
currently knows, should be available at
http://<hostname>/tor/status-vote/current/consensus.z
All of the detached signatures it knows for consensus status should be
available at:
http://<hostname>/tor/status-vote/current/consensus-signatures.z
Once an authority has computed and signed a consensus network status, it
should send its detached signature to each other authority in an HTTP POST
request to the URL:
http://<hostname>/tor/post/consensus-signature
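As a small illustration, the fetch URLs defined above can be built mechanically from an authority's hostname and, for peer votes, its identity-key fingerprint. The helper name and arguments below are hypothetical; only the path components come from this section.

```python
def vote_urls(hostname, fingerprint=None):
    """Build the status-vote URLs described above for one authority."""
    base = "http://%s/tor/status-vote/current/" % hostname
    urls = {
        "own_vote": base + "authority.z",          # the authority's vote
        "consensus": base + "consensus.z",         # consensus + signatures
        "signatures": base + "consensus-signatures.z",
    }
    if fingerprint is not None:
        # another authority's vote, keyed by its identity fingerprint
        urls["peer_vote"] = base + "%s.z" % fingerprint
    return urls
```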
[XXXX Store votes to disk.]
2.3.2. Serving a consensus directory
Once the authority is done getting signatures on the consensus directory,
it should serve it from:
http://<hostname>/tor/status/consensus.z
Caches SHOULD download consensus directories from an authority and serve
them from the same URL.
2.3.3. Timeline and synchronization
[XXXX]
2.4. Distributing routerdescs between authorities
Consensus will be more meaningful if authorities take steps to make sure
that they all have the same set of descriptors _before_ the voting
starts. This is safe, since all descriptors are self-certified and
timestamped: it's always okay to replace a signed descriptor with a more
recent one signed by the same identity.
In the long run, we might want some kind of sophisticated process here.
For now, since authorities already download one another's networkstatus
documents and use them to determine what descriptors to download from one
another, we can rely on this existing mechanism to keep authorities up to
date.
[We should do a thorough read-through of dir-spec again to make sure
that the authorities converge on which descriptor to "prefer" for
each router. Right now the decision happens at the client, which is
no longer the right place for it. -RD]
3. Questions and concerns
3.1. Push or pull?
The URLs above define a push mechanism for publishing votes and consensus
signatures via HTTP POST requests, and a pull mechanism for downloading
these documents via HTTP GET requests. As specified, every authority will
post to every other. The "download if no copy has been received" mechanism
exists only as a fallback.
4. Migration
* It would be cool if caches could get ready to download consensus
status docs, verify enough signatures, and serve them now. That way
once stuff works all we need to do is upgrade the authorities. Caches
don't need to verify the correctness of the format so long as it's
signed (or maybe multisigned?). We need to make sure that caches back
off very quickly from downloading consensus docs until they're
actually implemented.

Filename: 102-drop-opt.txt
Title: Dropping "opt" from the directory format
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes a change in the format used to transmit router and
directory information.
This proposal has been accepted, implemented, and merged into dir-spec.txt.
Proposal:
The "opt" keyword in Tor's directory formats was originally intended to
mean, "it is okay to ignore this entry if you don't understand it"; the
default behavior has been "discard a routerdesc if it contains entries you
don't recognize."
But so far, every new flag we have added has been marked 'opt'. It would
probably make sense to change the default behavior to "ignore unrecognized
fields", and add the statement that clients SHOULD ignore fields they don't
recognize. As a meta-principle, we should say that clients and servers
MUST NOT have to understand new fields in order to use directory documents
correctly.
Of course, this will make it impossible to say, "The format has changed a
lot; discard this quietly if you don't understand it." We could do that by
adding a version field.
Status:
* We stopped requiring it as of 0.1.2.5-alpha. We'll stop generating it
once earlier formats are obsolete.

Filename: 103-multilevel-keys.txt
Title: Splitting identity key from regularly used signing key.
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes a change in the way identity keys are used, so that
highly sensitive keys can be password-protected and seldom loaded into RAM.
It presents options; it is not yet a complete proposal.
Proposal:
Replacing a directory authority's identity key in the event of a compromise
would be tremendously annoying. We'd need to tell every client to switch
their configuration, or update to a new version with an uploaded list. So
long as some weren't upgraded, they'd be at risk from whoever had
compromised the key.
With this in mind, it's a shame that our current protocol forces us to
store identity keys unencrypted in RAM. We need some kind of signing key
stored unencrypted, since we need to generate new descriptors/directories
and rotate link and onion keys regularly. (And since, of course, we can't
ask server operators to be on-hand to enter a passphrase every time we
want to rotate keys or sign a descriptor.)
The obvious solution seems to be to have a signing-only key that lives
indefinitely (months or longer) and signs descriptors and link keys, and a
separate identity key that's used to sign the signing key. Tor servers
could run in one of several modes:
1. Identity key stored encrypted. You need to pick a passphrase when
you enable this mode, and re-enter this passphrase every time you
rotate the signing key.
1'. Identity key stored separate. You save your identity key to a
floppy, and use the floppy when you need to rotate the signing key.
2. All keys stored unencrypted. In this case, we might not want to even
*have* a separate signing key. (We'll need to support no-separate-
signing-key mode anyway to keep old servers working.)
3. All keys stored encrypted. You need to enter a passphrase to start
Tor.
(Of course, we might not want to implement all of these.)
Case 1 is probably most usable and secure, if we assume that people don't
forget their passphrases or lose their floppies. We could mitigate this a
bit by encouraging people to PGP-encrypt their passphrases to themselves,
or keep a cleartext copy of their secret key secret-split into a few
pieces, or something like that.
Migration presents another difficulty, especially with the authorities. If
we use the current set of identity keys as the new identity keys, we're in
the position of having sensitive keys that have been stored on
media-of-dubious-encryption up to now. Also, we need to keep old clients
(who will expect descriptors to be signed by the identity keys they know
and love, and who will not understand signing keys) happy.
A possible solution:
One thing to consider is that router identity keys are not very sensitive:
if an OR disappears and reappears with a new key, the network treats it as
though an old router had disappeared and a new one had joined the network.
The Tor network continues unharmed; this isn't a disaster.
Thus, the ideas above are mostly relevant for authorities.
The most straightforward solution for the authorities is probably to take
advantage of the protocol transition that will come with proposal 101, and
introduce a new set of signing _and_ identity keys used only to sign votes
and consensus network-status documents. Signing and identity keys could be
delivered to users in a separate, rarely changing "keys" document, so that
the consensus network-status documents wouldn't need to include N signing
keys, N identity keys, and N certifications.
Note also that there is no reason that the identity/signing keys used by
directory authorities would necessarily have to be the same as the identity
keys those authorities use in their capacity as routers. Decoupling these
keys would give directory authorities the following set of keys:
Directory authority identity:
Highly confidential; stored encrypted and/or offline. Used to
identify directory authorities. Shipped with clients. Used to
sign Directory authority signing keys.
Directory authority signing key:
Stored online, accessible to regular Tor process. Used to sign
votes and consensus directories. Downloaded as part of a "keys"
document.
[Administrators SHOULD rotate their signing keys every month or
two, just to keep in practice and keep from forgetting the
password to the authority identity.]
V1-V2 directory authority identity:
Stored online, never changed. Used to sign legacy network-status
and directory documents.
Router identity:
Stored online, seldom changed. Used to sign server descriptors
for this authority in its role as a router. Implicitly certified
by being listed in network-status documents.
Onion key, link key:
As in tor-spec.txt
Extensions to Proposal 101.
Define a new document type, "Key certificate". It contains the
following fields, in order:
"dir-key-certificate-version": As network-status-version. Must be
"3".
"fingerprint": Hex fingerprint, with spaces, based on the directory
authority's identity key.
"dir-identity-key": The long-term identity key for this authority.
"dir-key-published": The time when this directory's signing key was
last changed.
"dir-key-expires": A time after which this key is no longer valid.
"dir-signing-key": As in proposal 101.
"dir-key-certification": A signature of the above fields, in order.
The signed material extends from the beginning of
"dir-key-certificate-version" through the newline after
"dir-key-certification". The identity key is used to generate
this signature.
These elements together constitute a "key certificate". These are
generated offline when starting a v3 authority. Private identity
keys SHOULD be stored offline, encrypted, or both. A running
authority only needs access to the signing key.
Unlike other keys currently used by Tor, the authority identity
keys and directory signing keys MAY be longer than 1024 bits.
(They SHOULD be 2048 bits or longer; they MUST NOT be shorter than
1024.)
Vote documents change as follows:
A key certificate MUST be included in-line in every vote document. With
the exception of "fingerprint", its elements MUST NOT appear in consensus
documents.
Consensus network statuses change as follows:
Remove dir-signing-key.
Change "directory-signature" to take a fingerprint of the authority's
identity key and a fingerprint of the authority's current signing key
rather than the authority's nickname.
Change "dir-source" to take a fingerprint of the authority's
identity key rather than the authority's nickname or hostname.
Add a new document type:
A "keys" document contains all currently known key certificates.
All authorities serve it at
http://<hostname>/tor/status/keys.z
Caches and clients download the keys document whenever they receive a
consensus vote that uses a key they do not recognize. Caches download
from authorities; clients download from caches.
Processing votes:
When receiving a vote, authorities check to see if the key
certificate for the voter is different from the one they have. If
the key certificate _is_ different, and its dir-key-published is
more recent than the most recently known one, and it is
well-formed and correctly signed with the correct identity key,
then authorities remember it as the new canonical key certificate
for that voter.
A key certificate is invalid if any of the following hold:
* The version is unrecognized.
* The fingerprint does not match the identity key.
* The identity key or the signing key is ill-formed.
* The published date is very far in the past or future.
* The signature is not a valid signature of the key certificate
generated with the identity key.
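The invalidity conditions above can be sketched as one predicate. The field names and the one-year skew window are assumptions for illustration; the spec only says "very far in the past or future" and does not fix a threshold.

```python
def certificate_invalid(cert, now, max_skew_days=370):
    """Sketch of the key-certificate invalidity conditions above.

    cert is a hypothetical parsed representation with fields:
    version, fingerprint_matches_identity, keys_well_formed,
    published (unix time), signature_valid.
    """
    skew = max_skew_days * 86400  # assumed bound for "very far"
    return (
        cert["version"] != "3"                       # unrecognized version
        or not cert["fingerprint_matches_identity"]  # fingerprint mismatch
        or not cert["keys_well_formed"]              # ill-formed keys
        or abs(cert["published"] - now) > skew       # far past/future
        or not cert["signature_valid"]               # bad signature
    )
```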
When processing the signatures on consensus, clients and caches act as
follows:
1. Only consider the directory-signature entries whose identity
key hashes match trusted authorities.
2. If any such entries have signing key hashes that match unknown
signing keys, download a new keys document.
3. For every entry with a known (identity key,signing key) pair,
check the signature on the document.
4. If the document has been signed by more than half of the
authorities the client recognizes, treat the consensus as
correctly signed.
If not, but the number of entries with known identity keys but
unknown signing keys might be enough to make the consensus
correctly signed, do not use the consensus, but do not discard
it until we have a new keys document.
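Steps 1 through 4 above can be sketched as follows. The tuple layout, the verify callback, and the string return values are hypothetical illustration, not part of the spec.

```python
def evaluate_consensus(signatures, trusted, known_signing, verify, doc):
    """Sketch of consensus signature processing, steps 1-4 above.

    signatures: list of (identity_hash, signing_hash, sig) tuples.
    trusted: set of identity hashes of recognized authorities.
    known_signing: dict (identity_hash, signing_hash) -> public key.
    verify: callable(key, sig, doc) -> bool (assumed to exist).
    Returns "accept", "fetch-keys", or "reject".
    """
    n_auth = len(trusted)
    good, pending = 0, 0
    for ident, signing, sig in signatures:
        if ident not in trusted:          # step 1: skip unknown identities
            continue
        key = known_signing.get((ident, signing))
        if key is None:                   # step 2: signing key unknown
            pending += 1
        elif verify(key, sig, doc):       # step 3: check the signature
            good += 1
    if good > n_auth // 2:                # step 4: majority signed
        return "accept"
    if good + pending > n_auth // 2:      # could still become a majority
        return "fetch-keys"               # need a new keys document first
    return "reject"
```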

Filename: 104-short-descriptors.txt
Title: Long and Short Router Descriptors
Author: Nick Mathewson
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes moving unused-by-clients information from regular
router descriptors into a new "extra info" router descriptor.
Proposal:
Some of the costliest fields in the current directory protocol are ones
that no client actually uses. In particular, the "read-history" and
"write-history" fields are used only by the authorities for monitoring the
status of the network. If we took them out, the size of a compressed list
of all the routers would fall by about 60%. (No other disposable field
would save much more than 2%.)
We propose to remove these fields from descriptors, and have them
uploaded as a part of a separate signed "extra info" to the authorities.
This document will be signed. A hash of this document will be included in
the regular descriptors.
(We considered another design, where routers would generate and upload a
short-form and a long-form descriptor. Only the short-form descriptor would
ever be used by anybody for routing. The long-form descriptor would be
used only for analytics and other tools. We decided against this because
well-behaved tools would need to download short-form descriptors too (as
these would be the only ones indexed), and hence get redundant info. Badly
behaved tools would download only long-form descriptors, and expose
themselves to partitioning attacks.)
Other disposable fields:
Clients don't need these fields, but removing them doesn't help bandwidth
enough to be worthwhile.
contact (save about 1%)
fingerprint (save about 3%)
We could represent these fields more succinctly, but removing them would
only save 1%. (!)
reject
accept
(Apparently, exit polices are highly compressible.)
[Does size-on-disk matter to anybody? Some clients and servers don't
have much disk, or have really slow disk (e.g. USB). And we don't
store caches compressed right now. -RD]
Specification:
1. Extra Info Format.
An "extra info" descriptor contains the following fields:
"extra-info" Nickname Fingerprint
Identifies what router this is an extra info descriptor for.
Fingerprint is encoded in hex (using upper-case letters), with
no spaces.
"published" As currently documented in dir-spec.txt. It MUST match the
"published" field of the descriptor published at the same time.
"read-history"
"write-history"
As currently documented in dir-spec.txt. Optional.
"router-signature" NL Signature NL
A signature of the PKCS1-padded hash of the entire extra info
document, taken from the beginning of the "extra-info" line, through
the newline after the "router-signature" line. An extra info
document is not valid unless the signature is performed with the
identity key whose digest matches FINGERPRINT.
The "extra-info" field is required and MUST appear first. The
router-signature field is required and MUST appear last. All others are
optional. As for other documents, unrecognized fields must be ignored.
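A minimal parse of the ordering rules above might look like the following sketch. It elides the signature body and all cryptographic checks, and the returned dict shape is a hypothetical convenience, not a spec format.

```python
def parse_extra_info(text):
    """Sketch of extra-info parsing per the ordering rules above:
    "extra-info" must appear first, "router-signature" last, and
    unrecognized keywords are ignored.  The signature body and its
    verification are elided."""
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith("extra-info "):
        raise ValueError('"extra-info" must appear first')
    if not lines[-1].startswith("router-signature"):
        raise ValueError('"router-signature" must appear last')
    _, nickname, fingerprint = lines[0].split()
    doc = {"nickname": nickname, "fingerprint": fingerprint}
    for line in lines[1:-1]:
        keyword, _, rest = line.partition(" ")
        if keyword in ("published", "read-history", "write-history"):
            doc[keyword] = rest
        # any other keyword is an unrecognized field: ignore it
    return doc
```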
2. Existing formats
Implementations that use "read-history" and "write-history" SHOULD
continue accepting router descriptors that contain them. (Prior to
0.2.0.x, this information was encoded in ordinary router descriptors;
in any case they have always been listed as opt, so they should be
accepted anyway.)
Add these fields to router descriptors:
"extra-info-digest" Digest
"Digest" is a hex-encoded digest (using upper-case characters)
of the router's extra-info document, as signed in the router's
extra-info. (If this field is absent, no extra-info-digest
exists.)
"caches-extra-info"
Present if this router is a directory cache that provides
extra-info documents, or an authority that handles extra-info
documents.
(Since implementations before 0.1.2.5-alpha required that the "opt"
keyword precede any unrecognized entry, these keys MUST be preceded
with "opt" until 0.1.2.5-alpha is obsolete.)
3. New communications rules
Servers SHOULD generate and upload one extra-info document after each
descriptor they generate and upload; no more, no less. Servers MUST
upload the new descriptor before they upload the new extra-info.
Authorities receiving an extra-info document SHOULD verify all of the
following:
* They have a router descriptor for some server with a matching
nickname and identity fingerprint.
* That server's identity key has been used to sign the extra-info
document.
* The extra-info-digest field in the router descriptor matches
the digest of the extra-info document.
* The published fields in the two documents match.
Authorities SHOULD drop extra-info documents that do not meet these
criteria.
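The four acceptance criteria above can be sketched as one check. All names here (the parsed-document fields, the key lookup, the signature callback) are hypothetical illustration.

```python
def accept_extra_info(extra, descriptors, identity_key_of, sig_ok):
    """Sketch of the authority-side checks above.

    extra: parsed extra-info with nickname, fingerprint, digest,
           published, and signature fields.
    descriptors: dict fingerprint -> parsed router descriptor with
           nickname, extra_info_digest, and published fields.
    identity_key_of: fingerprint -> identity key.
    sig_ok: callable(key, signature, document) -> bool (assumed).
    """
    desc = descriptors.get(extra["fingerprint"])
    if desc is None or desc["nickname"] != extra["nickname"]:
        return False  # no descriptor with matching nickname/fingerprint
    key = identity_key_of(extra["fingerprint"])
    if not sig_ok(key, extra["signature"], extra):
        return False  # not signed with the server's identity key
    if desc["extra_info_digest"] != extra["digest"]:
        return False  # descriptor's extra-info-digest doesn't match
    return desc["published"] == extra["published"]
```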
Extra-info documents MAY be uploaded as part of the same HTTP post as
the router descriptor, or separately. Authorities MUST accept both
methods.
Authorities SHOULD try to fetch extra-info documents from one another if
they do not have one matching the digest declared in a router
descriptor.
Caches that are running locally with a tool that needs to use extra-info
documents MAY download and store extra-info documents. They should do
so when they notice that the recommended descriptor has an
extra-info-digest not matching any extra-info document they currently
have. (Caches not running on a host that needs to use extra-info
documents SHOULD NOT download or cache them.)
4. New URLs
http://<hostname>/tor/extra/d/...
http://<hostname>/tor/extra/fp/...
http://<hostname>/tor/extra/all[.z]
(As for /tor/server/ URLs: supports fetching extra-info documents
by their digest, by the fingerprint of their servers, or all
at once. When serving by fingerprint, we serve the extra-info
that corresponds to the descriptor we would serve by that
fingerprint. Only directory authorities are guaranteed to support
these URLs.)
http://<hostname>/tor/extra/authority[.z]
(The extra-info document for this router.)
Extra-info documents are uploaded to the same URLs as regular
router descriptors.
Migration:
For extra info approach:
* First:
* Authorities should accept extra info, and support serving it.
* Routers should upload extra info once authorities accept it.
* Caches should support an option to download and cache it, once
authorities serve it.
* Tools should be updated to use locally cached information.
These tools include:
lefkada's exit.py script.
tor26's noreply script and general directory cache.
https://nighteffect.us/tns/ for its graphs
and check with or-talk for the rest, once it's time.
* Set a cutoff time for including bandwidth in router descriptors, so
that tools that use bandwidth info know that they will need to fetch
extra info documents.
* Once tools that want bandwidth info support fetching extra info:
* Have routers stop including bandwidth info in their router
descriptors.

Filename: 105-handshake-revision.txt
Title: Version negotiation for the Tor protocol.
Author: Nick Mathewson, Roger Dingledine
Created: Jan 2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document was extracted from a modified version of tor-spec.txt that we
had written before the proposal system went into place. It adds two new
cells types to the Tor link connection setup handshake: one used for
version negotiation, and another to prevent MITM attacks.
This proposal is partially implemented, and partially superseded by
proposal 130.
Motivation: Tor versions
Our *current* approach to versioning the Tor protocol(s) has been as
follows:
- All changes must be backward compatible.
- It's okay to add new cell types, if they would be ignored by previous
versions of Tor.
- It's okay to add new data elements to cells, if they would be
ignored by previous versions of Tor.
- For forward compatibility, Tor must ignore cell types it doesn't
recognize, and ignore data in those cells it doesn't expect.
- Clients can inspect the version of Tor declared in the platform line
of a router's descriptor, and use that to learn whether a server
supports a given feature. Servers, however, aren't assumed to all
know about each other, and so don't know the version of who they're
talking to.
This system has these problems:
- It's very hard to change fundamental aspects of the protocol, like the
cell format, the link protocol, any of the various encryption schemes,
and so on.
- The router-to-router link protocol has remained more-or-less frozen
for a long time, since we can't easily have an OR use new features
unless it knows the other OR will understand them.
We need to resolve these problems because:
- Our cipher suite is showing its age: SHA1/AES128/RSA1024/DH1024 will
not seem like the best idea for all time.
- There are many ideas circulating for multiple cell sizes; while it's
not obvious whether these are safe, we can't do them at all without a
mechanism to permit them.
- There are many ideas circulating for alternative circuit building and
cell relay rules: they don't work unless they can coexist in the
current network.
- If our protocol changes a lot, it's hard to describe any coherent
version of it: we need to say "the version that Tor versions W through
X use when talking to versions Y through Z". This makes analysis
harder.
Motivation: Preventing MITM attacks
TLS prevents a man-in-the-middle attacker from reading or changing the
contents of a communication. It does not, however, prevent such an
attacker from observing timing information. Since timing attacks are some
of the most effective against low-latency anonymity nets like Tor, we
should take more care to make sure that we're not only talking to who
we think we're talking to, but that we're using the network path we
believe we're using.
Motivation: Signed clock information
It's very useful for Tor instances to know how skewed they are relative
to one another. The only way to find out currently has been to download
directory information, and check the Date header--but this is not
authenticated, and hence subject to modification on the wire. Using
BEGIN_DIR to create an authenticated directory stream through an existing
circuit is better, but that's an extra step and it might be nicer to
learn the information in the course of the regular protocol.
Proposal:
1.0. Version numbers
The node-to-node TLS-based "OR connection" protocol and the multi-hop
"circuit" protocol are versioned quasi-independently.
Of course, some dependencies will continue to exist: Certain versions
of the circuit protocol may require a minimum version of the connection
protocol to be used. The connection protocol affects:
- Initial connection setup, link encryption, transport guarantees,
etc.
- The allowable set of cell commands
- Allowable formats for cells.
The circuit protocol determines:
- How circuits are established and maintained
- How cells are decrypted and relayed
- How streams are established and maintained.
Version numbers are incremented for backward-incompatible protocol changes
only. Backward-compatible changes are generally implemented by adding
additional fields to existing structures; implementations MUST ignore
fields they do not expect. Unused portions of cells MUST be set to zero.
Though versioning the protocol will make it easier to maintain backward
compatibility with older versions of Tor, we will nevertheless continue to
periodically drop support for older protocols,
- to keep the implementation from growing without bound,
- to limit the maintenance burden of patching bugs in obsolete Tors,
- to limit the testing burden of verifying that many old protocol
versions continue to be implemented properly, and
- to limit the exposure of the network to protocol versions that are
expensive to support.
The Tor protocol as implemented through the 0.1.2.x Tor series will be
called "version 1" in its link protocol and "version 1" in its relay
protocol. Versions of the Tor protocol so old as to be incompatible with
Tor 0.1.2.x can be considered to be version 0 of each, and are not
supported.
2.1. VERSIONS cells
When a Tor connection is established, both parties normally send a
VERSIONS cell before sending any other cells. (But see below.)
VersionsLen [2 byte]
Versions [VersionsLen bytes]
"Versions" is a sequence of VersionsLen bytes. Each value between 1 and
127 inclusive represents a single version; current implementations MUST
ignore other bytes. Parties should list all of the versions which they
are able and willing to support. Parties can only communicate if they
have some connection protocol version in common.
Versions 0.2.0.x-alpha and earlier don't understand VERSIONS cells,
and therefore don't support version negotiation. Thus, waiting until
the other side has sent a VERSIONS cell won't work for these servers:
if the other side sends no cells back, it is impossible to tell
whether they have sent a VERSIONS cell that has been stalled, or
whether they have dropped our own VERSIONS cell as unrecognized.
Therefore, we'll
change the TLS negotiation parameters so that old parties can still
negotiate, but new parties can recognize each other. Immediately
after a TLS connection has been established, the parties check
whether the other side negotiated the connection in an "old" way or a
"new" way. If either party negotiated in the "old" way, we assume a
v1 connection. Otherwise, both parties send VERSIONS cells listing
all their supported versions. Upon receiving the other party's
VERSIONS cell, the implementation begins using the highest-valued
version common to both cells. If the first cell from the other party
has a recognized command, and is _not_ a VERSIONS cell, we assume a
v1 protocol.
(For more detail on the TLS protocol change, see forthcoming draft
proposals from Steven Murdoch.)
Implementations MUST discard VERSIONS cells that are not the first
recognized cells sent on a connection.
The VERSIONS cell must be sent as a v1 cell (2 bytes of circuitID, 1
byte of command, 509 bytes of payload).
[NOTE: The VERSIONS cell is assigned the command number 7.]
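The VERSIONS payload parsing and the "highest version in common" rule can be sketched as follows, assuming (as elsewhere in the Tor protocol) that the 2-byte VersionsLen is in network (big-endian) byte order.

```python
import struct

def parse_versions_cell(payload):
    """Parse the body of a VERSIONS cell: a 2-byte count followed by
    one byte per version; byte values outside 1..127 are ignored,
    per the spec text above."""
    (vlen,) = struct.unpack(">H", payload[:2])
    return {b for b in payload[2:2 + vlen] if 1 <= b <= 127}

def negotiate(local_versions, payload):
    """Return the highest version common to both sides, or None if
    the parties share no connection protocol version."""
    common = parse_versions_cell(payload) & set(local_versions)
    return max(common) if common else None
```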
2.2. MITM-prevention and time checking
If we negotiate a v2 connection or higher, the second cell we send SHOULD
be a NETINFO cell. Implementations SHOULD NOT send NETINFO cells at other
times.
A NETINFO cell contains:
Timestamp [4 bytes]
Other OR's address [variable]
Number of addresses [1 byte]
This OR's addresses [variable]
Timestamp is the OR's current Unix time, in seconds since the epoch. If
an implementation receives time values from many ORs that
indicate that its clock is skewed, it SHOULD try to warn the
administrator. (We leave the definition of 'many' intentionally vague
for now.)
Before believing the timestamp in a NETINFO cell, implementations
SHOULD compare the time at which they received the cell to the time
when they sent their VERSIONS cell. If the difference is very large,
it is likely that the cell was delayed long enough that its
contents are out of date.
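The timestamp sanity check above can be sketched as follows. The 60-second threshold is an assumption; the spec only says the difference should not be "very large".

```python
def clock_skew_estimate(netinfo_timestamp, versions_sent_at, received_at,
                        max_delay=60):
    """Sketch of the NETINFO timestamp check above: compare the time
    the cell arrived with the time we sent our own VERSIONS cell;
    if the gap is large, the cell was delayed and its timestamp is
    stale, so report no estimate."""
    if received_at - versions_sent_at > max_delay:
        return None  # cell delayed too long: don't believe its timestamp
    # positive skew means the peer's clock appears ahead of ours
    return netinfo_timestamp - received_at
```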
Each address contains Type/Length/Value as used in Section 6.4 of
tor-spec.txt. The first address is the one that the party sending
the NETINFO cell believes the other has -- it can be used to learn
what your IP address is if you have no other hints.
The rest of the addresses are the advertised addresses of the party
sending the NETINFO cell -- we include them
to block a man-in-the-middle attack on TLS that lets an attacker bounce
traffic through his own computers to enable timing and packet-counting
attacks.
A Tor instance should use the other Tor's reported address
information as part of logic to decide whether to treat a given
connection as suitable for extending circuits to a given address/ID
combination. When we get an extend request, we use an
existing OR connection if the ID matches, and ANY of the following
conditions hold:
- The IP matches the requested IP.
- We know that the IP we're using is canonical because it was
listed in the NETINFO cell.
- We know that the IP we're using is canonical because it was
listed in the server descriptor.
[NOTE: The NETINFO cell is assigned the command number 8.]
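The connection-reuse decision above (identity must match, plus any one of three canonicality conditions) can be sketched like this; the connection record's field names are hypothetical.

```python
def can_reuse_connection(conn, want_ip, want_id):
    """Sketch of the extend-request rule above.  conn is a
    hypothetical record with fields: id, ip, netinfo_ips (addresses
    the peer advertised in its NETINFO cell), descriptor_ip."""
    if conn["id"] != want_id:
        return False  # the identity must always match
    return (
        conn["ip"] == want_ip                   # IP matches the request
        or conn["ip"] in conn["netinfo_ips"]    # canonical per NETINFO
        or conn["ip"] == conn["descriptor_ip"]  # canonical per descriptor
    )
```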
Discussion: Versions versus feature lists
Many protocols negotiate lists of available features instead of (or in
addition to) protocol versions. While it's possible that some amount of
feature negotiation could be supported in a later Tor, we should prefer to
use protocol versions whenever possible, for reasons discussed in
the "Anonymity Loves Company" paper.
Discussion: Bytes per version, versions per cell
This document provides for a one-byte count of how many versions a Tor
supports, and allows one byte per version. Thus, it can support only
254 more versions of the protocol beyond the unallocated v0 and the
current v1. If we ever need to split the protocol into 255 incompatible
versions, we've probably screwed up badly somewhere.
Nevertheless, here are two ways we could support more versions:
- Change the version count to a two-byte field that counts the number of
_bytes_ used, and use a UTF8-style encoding: versions 0 through 127
take one byte to encode, versions 128 through 2047 take two bytes to
encode, and so on. We wouldn't need to parse any version higher than
127 right now, since all bytes used to encode higher versions would
have their high bit set.
We'd still have a limit of 380 simultaneous versions that could be
declared in any one cell. This is probably okay.
- Decide that if we need to support more versions, we can add a
MOREVERSIONS cell that gets sent before the VERSIONS cell. The spec
above requires Tors to ignore unrecognized cell types that they get
before the first VERSIONS cell, and still allows version negotiation
to succeed.
[Resolution: Reserve the high bit and the v0 value for later use. If
we ever have more live versions than we can fit in a cell, we've made a
bad design decision somewhere along the line.]
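For concreteness, the UTF8-style encoding considered (and rejected) above would look roughly like this sketch, reusing UTF-8's one- and two-byte patterns:

```python
def encode_version(v):
    """Sketch of the rejected variable-length scheme above: versions
    0..127 take one byte; 128..2047 take two bytes using UTF-8's
    two-byte pattern, so the lead byte has its high bit set."""
    if v < 0x80:
        return bytes([v])                     # one byte, high bit clear
    if v < 0x800:
        # 110xxxxx 10xxxxxx, exactly as in two-byte UTF-8
        return bytes([0xC0 | (v >> 6), 0x80 | (v & 0x3F)])
    raise ValueError("version too large for this sketch")
```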
Discussion: Reducing round-trips
It might be appealing to see if we can cram more information in the
initial VERSIONS cell. For example, the contents of NETINFO will pretty
soon be sent by everybody before any more information is exchanged, but
decoupling them from the version exchange increases round-trips.
Instead, we could speculatively include handshaking information at
the end of a VERSIONS cell, wrapped in a marker to indicate, "if we wind
up speaking VERSION 2, here's the NETINFO I'll send. Otherwise, ignore
this." This could be extended to opportunistically reduce round trips
when possible for future versions when we guess the versions right.
Of course, we'd need to be careful about using a feature like this:
- We don't want to include things that are expensive to compute,
like PK signatures or proof-of-work.
- We don't want to speculate as a mobile client: it may leak our
experience with the server in question.
Discussion: Advertising versions in routerdescs and networkstatuses.
In network-statuses:
The networkstatus "v" line now has the format:
"v" IMPLEMENTATION IMPL-VERSION "Link" LINK-VERSION-LIST
"Circuit" CIRCUIT-VERSION-LIST NL
LINK-VERSION-LIST and CIRCUIT-VERSION-LIST are comma-separated lists of
supported version numbers. IMPLEMENTATION is the name of the
implementation of the Tor protocol (e.g., "Tor"), and IMPL-VERSION is the
version of the implementation.
Examples:
v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5
v OtherOR 2000+ Link 3 Circuit 5
Implementations that release independently of the Tor codebase SHOULD NOT
use "Tor" as the value of their IMPLEMENTATION.
Additional fields on the "v" line MUST be ignored.
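The grammar above is simple enough to sketch. The following is an illustrative parser (not Tor source; parse_v_line and the returned representation are hypothetical) that honors the MUST-ignore rule for unrecognized fields:

```python
# Illustrative sketch, not Tor source: parse a networkstatus "v" line.
# Unrecognized trailing fields are skipped, as the spec requires.
def parse_v_line(line):
    parts = line.split()
    impl, impl_version = parts[1], parts[2]
    versions = {}
    i = 3
    while i + 1 < len(parts):
        key, value = parts[i], parts[i + 1]
        if key in ("Link", "Circuit"):
            versions[key] = [int(v) for v in value.split(",")]
        i += 2  # unknown key/value pairs are ignored silently
    return impl, impl_version, versions
```

For example, parsing "v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5" yields ('Tor', '0.2.5.1-alpha', {'Link': [1, 2, 3], 'Circuit': [2, 5]}).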
In router descriptors:
The router descriptor should contain a line of the form,
"protocols" "Link" LINK-VERSION-LIST "Circuit" CIRCUIT_VERSION_LIST
Additional fields on the "protocols" line MUST be ignored.
[Versions of Tor before 0.1.2.5-alpha rejected router descriptors with
unrecognized items; the protocols line should be preceded with an "opt"
until these Tors are obsolete.]
Security issues:
Client partitioning is the big danger when we introduce new versions; if a
client supports some very unusual set of protocol versions, it will stand
out from others no matter where it goes. If a server supports an unusual
version, it will get a disproportionate amount of traffic from clients who
prefer that version. We can mitigate this somewhat as follows:
- Do not have clients prefer any protocol version by default until that
version is widespread. (First introduce the new version to servers,
and have clients admit to using it only when configured to do so for
testing. Then, once many servers are running the new protocol
version, enable its use by default.)
- Do not multiply protocol versions needlessly.
- Encourage protocol implementors to implement the same protocol version
sets as some popular version of Tor.
- Disrecommend very old/unpopular versions of Tor via the directory
authorities' RecommendedVersions mechanism, even if it is still
technically possible to use them.

Filename: 106-less-tls-constraint.txt
Title: Checking fewer things during TLS handshakes
Author: Nick Mathewson
Created: 9-Feb-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes that we relax our requirements on the context of
X.509 certificates during initial TLS handshakes.
Motivation:
Later, we want to try harder to avoid protocol fingerprinting attacks.
This means that we'll need to make our connection handshake look closer
to a regular HTTPS connection: one certificate on the server side and
zero certificates on the client side. For now, about the best we
can do is to stop requiring things during handshake that we don't
actually use.
What we check now, and where we check it:
tor_tls_check_lifetime:
peer has certificate
notBefore <= now <= notAfter
tor_tls_verify:
peer has at least one certificate
There is at least one certificate in the chain
At least one of the certificates in the chain is not the one used to
negotiate the connection. (The "identity cert".)
The certificate _not_ used to negotiate the connection has signed the
link cert
tor_tls_get_peer_cert_nickname:
peer has a certificate.
certificate has a subjectName.
subjectName has a commonName.
commonName consists only of characters in LEGAL_NICKNAME_CHARACTERS. [2]
tor_tls_peer_has_cert:
peer has a certificate.
connection_or_check_valid_handshake:
tor_tls_peer_has_cert [1]
tor_tls_get_peer_cert_nickname [1]
tor_tls_verify [1]
If nickname in cert is a known, named router, then its identity digest
must be as expected.
If we initiated the connection, then we got the identity digest we
expected.
USEFUL THINGS WE COULD DO:
[1] We could just not force clients to have any certificate at all, let alone
an identity certificate. Internally to the code, we could assign the
identity_digest field of these or_connections to a random number, or even
not add them to the identity_digest->or_conn map.
[so if somebody connects with no certs, we let them. and mark them as
a client and don't treat them as a server. great. -rd]
[2] Instead of using a restricted nickname character set that makes our
commonName structure look unlike typical SSL certificates, we could treat
the nickname as extending from the start of the commonName up to but not
including the first non-nickname character.
Alternatively, we could stop checking commonNames entirely. We don't
actually _do_ anything based on the nickname in the certificate, so
there's really no harm in letting every router have any commonName it
wants.
[this is the better choice -rd]
[agreed. -nm]
REMAINING WAYS TO RECOGNIZE CLIENT->SERVER CONNECTIONS:
Assuming that we removed the above requirements, if we then (in a later
release) had clients not send certificates, and started making our DNs a
little less formulaic, client->server OR connections would still be
recognizable by:
having a two-certificate chain sent by the server
using a particular set of ciphersuites
traffic patterns
probing the server later
OTHER IMPLICATIONS:
If we stop verifying the above requirements:
It will be slightly (but only slightly) more common to connect to a non-Tor
server running TLS, and believe that you're talking to a Tor server (until
you send the first cell).
It will be far easier for non-Tor SSL clients to accidentally connect to
Tor servers and speak HTTPS or whatever to them.
If, in a later release, we have clients not send certificates, and we make
DNs less recognizable:
If clients don't send certs, servers don't need to verify them: win!
If we remove these restrictions, it will be easier for people to write
clients to fuzz our protocol: sorta win!
If clients don't send certs, they look slightly less like servers.
OTHER SPEC CHANGES:
When a client doesn't give us an identity, we should never extend any
circuits to it (duh), and we should allow it to set circuit ID however it
wants.

Filename: 107-uptime-sanity-checking.txt
Title: Uptime Sanity Checking
Author: Kevin Bauer & Damon McCoy
Created: 8-March-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document describes how to cap the uptime that is used when computing
which routers are marked as stable such that highly stable routers cannot
be displaced by malicious routers that report extremely high uptime
values.
This is similar to how bandwidth is capped at 1.5MB/s.
Motivation:
It has been pointed out that an attacker can displace all stable nodes and
entry guard nodes by reporting high uptimes. This is an easy fix that will
prevent highly stable nodes from being displaced.
Security implications:
It should decrease the effectiveness of routing attacks that report high
uptimes while not impacting the normal routing algorithms.
Specification:
So we could patch Section 3.1 of dir-spec.txt to say:
"Stable" -- A router is 'Stable' if it is running, valid, not
hibernating, and either its uptime is at least the median uptime for
known running, valid, non-hibernating routers, or its uptime is at
least 30 days. Routers are never called stable if they are running
a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha
through 0.1.1.16-rc are stupid this way.)
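As a sketch of the capped computation (illustrative only, not the authorities' actual code; the function name and seconds-based units are assumptions):

```python
# Illustrative sketch: the capped-uptime Stable rule. Uptimes are in
# seconds; reported uptimes are capped at 30 days before taking the
# median, so inflated claims cannot displace genuinely stable routers.
CAP = 30 * 24 * 3600  # 30 days

def is_stable(uptime, running_uptimes):
    capped = sorted(min(u, CAP) for u in running_uptimes)
    median = capped[len(capped) // 2]
    return min(uptime, CAP) >= median
```

A router at or above the 30-day cap is always Stable, since the median of capped values can never exceed the cap.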
Compatibility:
There should be no compatibility issues due to uptime capping.
Implementation:
Implemented and merged into dir-spec in 0.2.0.0-alpha-dev (r9788).
Discussion:
Initially, this proposal set the maximum at 60 days, not 30; the 30 day
limit and spec wording was suggested by Roger in an or-dev post on 9 March
2007.
This proposal also led to 108-mtbf-based-stability.txt

Filename: 108-mtbf-based-stability.txt
Title: Base "Stable" Flag on Mean Time Between Failures
Author: Nick Mathewson
Created: 10-Mar-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document proposes that we change how directory authorities set the
stability flag from inspection of a router's declared Uptime to the
authorities' perceived mean time between failure for the router.
Motivation:
Clients prefer nodes that the authorities call Stable. This flag is (as
of 0.2.0.0-alpha-dev) set entirely based on the node's declared value for
uptime. This creates an opportunity for malicious nodes to declare
falsely high uptimes in order to get more traffic.
Spec changes:
Replace the current rule for setting the Stable flag with:
"Stable" -- A router is 'Stable' if it is active and its observed Stability
for the past month is at or above the median Stability for active routers.
Routers are never called stable if they are running a version of Tor
known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc
are stupid this way.)
Stability shall be defined as the weighted mean length of the runs
observed by a given directory authority. A run begins when an authority
decides that the server is Running, and ends when the authority decides
that the server is not Running. In-progress runs are counted when
measuring Stability. When calculating the mean, runs are weighted by
$\alpha ^ t$, where $t$ is time elapsed since the end of the run, and
$0 < \alpha < 1$. Time when an authority is down does not count toward
the length of the run.
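A minimal sketch of this weighted mean (illustrative, not Tor source; representing each run as a (length, age) pair is an assumption):

```python
# Illustrative sketch: Stability as the weighted mean run length.
# runs is a list of (length, age) pairs, where age is the time
# elapsed since the run ended (0 for an in-progress run).
def stability(runs, alpha=0.9):
    if not runs:
        return 0.0
    num = sum(length * alpha ** age for length, age in runs)
    den = sum(alpha ** age for _, age in runs)
    return num / den
```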
Rejected Alternative:
"A router's Stability shall be defined as the sum of $\alpha ^ d$ for every
$d$ such that the router was considered reachable for the entire day
$d$ days ago."
This allows a simpler implementation: every day, we multiply
yesterday's Stability by alpha, and if the router was observed to be
available every time we looked today, we add 1.
Instead of "day", we could pick an arbitrary time unit. We should
pick alpha to be high enough that long-term stability counts, but low
enough that the distant past is eventually forgotten. Something
between .8 and .95 seems right.
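The simpler daily update described here can be sketched as (illustrative only; the function name is hypothetical):

```python
# Illustrative sketch of the rejected alternative: once per day,
# decay yesterday's Stability by alpha and add 1 if the router was
# observed reachable every time it was checked that day.
def update_stability(prev, up_all_day, alpha=0.9):
    return prev * alpha + (1.0 if up_all_day else 0.0)
```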
(By requiring that routers be up for an entire day to get their
stability increased, instead of counting fractions of a day, we
capture the notion that stability is more like "probability of
staying up for the next hour" than it is like "probability of being
up at some randomly chosen time over the next hour." The former
notion of stability is far more relevant for long-lived circuits.)
Limitations:
Authorities can have false positives and false negatives when trying to
tell whether a router is up or down. So long as these aren't terribly
wrong, and so long as they aren't significantly biased, we should be able
to use them to estimate stability pretty well.
Probing approaches like the above could miss short incidents of
downtime. If we use the router's declared uptime, we could detect
these: but doing so would penalize routers who reported their uptime
accurately.
Implementation:
For now, the easiest way to store this information at authorities
would probably be in some kind of periodically flushed flat file.
Later, we could move to Berkeley db or something if we really had to.
For each router, an authority will need to store:
The router ID.
Whether the router is up.
The time when the current run started, if the router is up.
The weighted sum length of all previous runs.
The time at which the weighted sum length was last weighted down.
Servers should probe at random intervals to test whether servers are
running.

Filename: 109-no-sharing-ips.txt
Title: No more than one server per IP address.
Author: Kevin Bauer & Damon McCoy
Created: 9-March-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document describes a solution to a Sybil attack vulnerability in the
directory servers. Currently, it is possible for a single IP address to
host an arbitrarily high number of Tor routers. We propose that the
directory servers limit the number of Tor routers that may be registered at
a particular IP address to some small (fixed) number, perhaps just one Tor
router per IP address.
While Tor never uses more than one server from a given /16 in the same
circuit, an attacker with multiple servers in the same place is still
dangerous because he can get around the per-server bandwidth cap that is
designed to prevent a single server from attracting too much of the overall
traffic.
Motivation:
Since it is possible for an attacker to register an arbitrarily large
number of Tor routers, it is possible for malicious parties to do this
as part of a traffic analysis attack.
Security implications:
This countermeasure will increase the number of IP addresses that an
attacker must control in order to carry out traffic analysis.
Specification:
For each IP address, each directory authority tracks the number of routers
using that IP address, along with their total observed bandwidth. If there
are more than MAX_SERVERS_PER_IP servers at some IP, the authority should
"disable" all but MAX_SERVERS_PER_IP servers. When choosing which servers
to disable, the authority should first disable non-Running servers in
increasing order of observed bandwidth, and then should disable Running
servers in increasing order of bandwidth.
[[ We don't actually do this part here. -NM
If the total observed
bandwidth of the remaining non-"disabled" servers exceeds MAX_BW_PER_IP,
the authority should "disable" some of the remaining servers until only one
server remains, or until the remaining observed bandwidth of non-"disabled"
servers is under MAX_BW_PER_IP.
]]
Servers that are "disabled" MUST be marked as non-Valid and non-Running.
MAX_SERVERS_PER_IP is 3.
MAX_BW_PER_IP is 8 MB per s.
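The disabling rule above can be sketched as follows (illustrative, not the authorities' actual code; the (name, running, bandwidth) tuple representation is an assumption):

```python
# Illustrative sketch: choose which servers at one IP to disable.
# Non-Running servers are disabled first, each group in increasing
# order of observed bandwidth, until MAX_SERVERS_PER_IP remain.
MAX_SERVERS_PER_IP = 3

def servers_to_disable(servers):
    # servers: list of (name, running, observed_bw) tuples
    excess = len(servers) - MAX_SERVERS_PER_IP
    if excess <= 0:
        return []
    # False (non-Running) sorts before True; then by bandwidth, ascending
    order = sorted(servers, key=lambda s: (s[1], s[2]))
    return [name for name, running, bw in order[:excess]]
```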
Compatibility:
Upon inspection of a directory server, we found that the following IP
addresses have more than one Tor router:
Scruples 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 443
WiseUp 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 9001
Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
aurel 85.180.62.138 e180062138.adsl.alicedsl.de 9001
sokrates 85.180.62.138 e180062138.adsl.alicedsl.de 9001
moria1 18.244.0.188 moria.mit.edu 9001
peacetime 18.244.0.188 moria.mit.edu 9100
There may exist compatibility issues with this proposed fix. Reasons why
more than one server would share an IP address include:
* Testing. moria1, moria2, peacetime, and other morias all run on one
computer at MIT, because that way we get testing. Moria1 and moria2 are
run by Roger, and peacetime is run by Nick.
* NAT. If there are several servers but they port-forward through the same
IP address, ... we can hope that the operators coordinate with each
other. Also, we should recognize that while they help the network in
terms of increased capacity, they don't help as much as they could in
terms of location diversity. But our approach so far has been to take
what we can get.
* People who have more than 1.5MB/s and want to help out more. For
example, for a while Tonga was offering 10MB/s and its Tor server
would only make use of a bit of it. So Roger suggested that he run
two Tor servers, to use more.
[Note Roger's tweak to this behavior, in
http://archives.seul.org/or/cvs/Oct-2007/msg00118.html]

Filename: 110-avoid-infinite-circuits.txt
Title: Avoiding infinite length circuits
Author: Roger Dingledine
Created: 13-Mar-2007
Status: Accepted
Target: 0.2.1.x
Implemented-In: 0.2.1.3-alpha
History:
Revised 28 July 2008 by nickm: set K.
Revised 3 July 2008 by nickm: rename from relay_extend to
relay_early. Revise to current migration plan. Allow K cells
over circuit lifetime, not just at start.
Overview:
Right now, an attacker can add load to the Tor network by extending a
circuit an arbitrary number of times. Every cell that goes down the
circuit then adds N times that amount of load in overall bandwidth
use. This vulnerability arises because servers don't know their position
on the path, so they can't tell how many nodes there are before them
on the path.
We propose a new set of relay cells that are distinguishable by
intermediate hops as permitting extend cells. This approach will allow
us to put an upper bound on circuit length relative to the number of
colluding adversary nodes; but there are some downsides too.
Motivation:
The above attack can be used to generally increase load all across the
network, or it can be used to target specific servers: by building a
circuit back and forth between two victim servers, even a low-bandwidth
attacker can soak up all the bandwidth offered by the fastest Tor
servers.
The general attacks could be used as a demonstration that Tor isn't
perfect (leading to yet more media articles about "breaking" Tor), and
the targeted attacks will come into play once we have a reputation
system -- it will be trivial to DoS a server so it can't pass its
reputation checks, in turn impacting security.
Design:
We should split RELAY cells into two types: RELAY and RELAY_EARLY.
Only K (say, 10) Relay_early cells can be sent across a circuit, and
only relay_early cells are allowed to contain extend requests. We
still support obscuring the length of the circuit (if more research
shows us what to do), because Alice can choose how many of the K to
mark as relay_early. Note that relay_early cells *can* contain any
sort of data cell; so in effect it's actually the relay type cells
that are restricted. By default, she would just send the first K
data cells over the stream as relay_early cells, regardless of their
actual type.
(Note that a circuit that is out of relay_early cells MUST NOT be
cannibalized later, since it can't extend. Note also that it's always okay
to use regular RELAY cells when sending non-EXTEND commands targeted at
the first hop of a circuit, since there is no intermediate hop to try to
learn the relay command type.)
Each intermediate server would pass on the same type of cell that it
received (either relay or relay_early), and the cell's destination
will be able to learn whether it's allowed to contain an Extend request.
If an intermediate server receives more than K relay_early cells, or
if it sees a relay cell that contains an extend request, then it
tears down the circuit (protocol violation).
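The intermediate hop's enforcement reduces to a small per-circuit counter (illustrative sketch, not Tor source; the class and method names are hypothetical):

```python
# Illustrative sketch: per-circuit bookkeeping at an intermediate hop.
K = 8  # the value chosen in the Parameters section

class CircuitEarlyCount:
    def __init__(self):
        self.early_seen = 0

    def must_tear_down(self, cell_type, contains_extend):
        """True if receiving this cell is a protocol violation."""
        if cell_type == "RELAY_EARLY":
            self.early_seen += 1
            return self.early_seen > K
        # an extend request outside a RELAY_EARLY cell is a violation
        return contains_extend
```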
Security implications:
The upside is that this limits the bandwidth amplification factor to
K: for an individual circuit to become arbitrary-length, the attacker
would need an adversary-controlled node every K hops, and at that
point the attack is no worse than if the attacker creates N/K separate
K-hop circuits.
On the other hand, we want to pick a large enough value of K that we
don't mind the cap.
If we ever want to take steps to hide the number of hops in the circuit
or a node's position in the circuit, this design probably makes that
more complex.
Migration:
In 0.2.0, servers speaking v2 or later of the link protocol accept
RELAY_EARLY cells, and pass them on. If the next OR in the circuit
is not speaking the v2 link protocol, the server relays the cell as
a RELAY cell.
In 0.2.1.3-alpha, clients begin using RELAY_EARLY cells on v2
connections. This functionality can be safely backported to
0.2.0.x. Clients should pick a random number between (say) K and
K-2 to send.
In 0.2.1.3-alpha, servers close any circuit in which more than K
relay_early cells are sent.
Once all versions that do not send RELAY_EARLY cells are obsolete,
servers can begin to reject any EXTEND requests not sent in a
RELAY_EARLY cell.
Parameters:
Let K = 8, for no terribly good reason.
Spec:
[We can formalize this part once we think the design is a good one.]
Acknowledgements:
This design has been kicking around since Christian Grothoff and I came
up with it at PET 2004. (Nathan Evans, Christian Grothoff's student,
is working on implementing a fix based on this design in the summer
2007 timeframe.)

Filename: 111-local-traffic-priority.txt
Title: Prioritizing local traffic over relayed traffic
Author: Roger Dingledine
Created: 14-Mar-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
We describe some ways to let Tor users operate as a relay and enforce
rate limiting for relayed traffic without impacting their locally
initiated traffic.
Motivation:
Right now we encourage people who use Tor as a client to configure it
as a relay too ("just click the button in Vidalia"). Most of these users
are on asymmetric links, meaning they have a lot more download capacity
than upload capacity. But if they enable rate limiting too, suddenly
they're limited to the same download capacity as upload capacity. And
they have to enable rate limiting, or their upstream pipe gets filled
up, starts dropping packets, and now their net connection doesn't work
even for non-Tor stuff. So they end up turning off the relaying part
so they can use Tor (and other applications) again.
So far this hasn't mattered that much: most of our fast relays are
being operated only in relay mode, so the rate limiting makes sense
for them. But if we want to be able to attract many more relays in
the future, we need to let ordinary users act as relays too.
Further, as we begin to deploy the blocking-resistance design and we
rely on ordinary users to click the "Tor for Freedom" button, this
limitation will become a serious stumbling block to getting volunteers
to act as bridges.
The problem:
Tor implements its rate limiting on the 'read' side by only reading
a certain number of bytes from the network in each second. If it has
emptied its token bucket, it doesn't read any more from the network;
eventually TCP notices and stalls until we resume reading. But if we
want to have two classes of service, we can't know what class a given
incoming cell will be until we look at it, at which point we've already
read it.
Some options:
Option 1: read when our token bucket is full enough, and if it turns
out that what we read was local traffic, then add the tokens back into
the token bucket. This will work when local traffic load alternates
with relayed traffic load; but it's a poor option in general, because
when we're receiving both local and relayed traffic, there are plenty
of cases where we'll end up with an empty token bucket, and then we're
back where we were before.
More generally, notice that our problem is easy when a given TCP
connection either has entirely local circuits or entirely relayed
circuits. In fact, even if they are both present, if one class is
entirely idle (none of its circuits have sent or received in the past
N seconds), we can ignore that class until it wakes up again. So it
only gets complex when a single connection contains active circuits
of both classes.
Next, notice that local traffic uses only the entry guards, whereas
relayed traffic likely doesn't. So if we're a bridge handling just
a few users, the expected number of overlapping connections would be
almost zero, and even if we're a full relay the number of overlapping
connections will be quite small.
Option 2: build separate TCP connections for local traffic and for
relayed traffic. In practice this will actually only require a few
extra TCP connections: we would only need redundant TCP connections
to at most the number of entry guards in use.
However, this approach has some drawbacks. First, if the remote side
wants to extend a circuit to you, how does it know which TCP connection
to send it on? We would need some extra scheme to label some connections
"client-only" during construction. Perhaps we could do this by seeing
whether any circuit was made via CREATE_FAST; but this still opens
up a race condition where the other side sends a create request
immediately. The only ways I can imagine to avoid the race entirely
are to specify our preference in the VERSIONS cell, or to add some
sort of "nope, not this connection, why don't you try another rather
than failing" response to create cells, or to forbid create cells on
connections that you didn't initiate and on which you haven't seen
any circuit creation requests yet -- this last one would lead to a bit
more connection bloat but doesn't seem so bad. And we already accept
this race for the case where directory authorities establish new TCP
connections periodically to check reachability, and then hope to hang
up on them soon after. (In any case this issue is moot for bridges,
since each destination will be one-way with respect to extend requests:
either receiving extend requests from bridge users or sending extend
requests to the Tor server, never both.)
The second problem with option 2 is that using two TCP connections
reveals that there are two classes of traffic (and probably quickly
reveals which is which, based on throughput). Now, it's unclear whether
this information is already available to the other relay -- he would
easily be able to tell that some circuits are fast and some are rate
limited, after all -- but it would be nice to not add even more ways to
leak that information. Also, it's less clear that an external observer
already has this information if the circuits are all bundled together,
and for this case it's worth trying to protect it.
Option 3: tell the other side about our rate limiting rules. When we
establish the TCP connection, specify the different policy classes we
have configured. Each time we extend a circuit, specify which policy
class that circuit should be part of. Then hope the other side obeys
our wishes. (If he doesn't, hang up on him.) Besides the design and
coordination hassles involved in this approach, there's a big problem:
our rate limiting classes apply to all our connections, not just
pairwise connections. How does one server we're connected to know how
much of our bucket has already been spent by another? I could imagine
a complex and inefficient "ok, now you can send me those two more cells
that you've got queued" protocol. I'm not sure how else we could do it.
(Gosh. How could UDP designs possibly be compatible with rate limiting
with multiple bucket sizes?)
Option 4: put both classes of circuits over a single connection, and
keep track of the last time we read or wrote a high-priority cell. If
it's been less than N seconds, give the whole connection high priority,
else give the whole connection low priority.
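Option 4 reduces to a single timestamp check (illustrative sketch; the value of N is an assumption, since the option leaves it open):

```python
# Illustrative sketch of Option 4: the whole connection is high
# priority iff a high-priority (locally originated) cell was read
# or written within the last N seconds.
N = 30  # assumed value; the proposal does not fix N

def connection_priority(now, last_client_cell):
    return "high" if now - last_client_cell < N else "low"
```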
Option 5: put both classes of circuits over a single connection, and
play a complex juggling game by periodically telling the remote side
what rate limits to set for that connection, so you end up giving
priority to the right connections but still stick to roughly your
intended bandwidthrate and relaybandwidthrate.
Option 6: ?
Prognosis:
Nick really didn't like option 2 because of the partitioning questions.
I've put option 4 into place as of Tor 0.2.0.3-alpha.
In terms of implementation, it will be easy: just add a time_t to
or_connection_t that specifies client_used (used by the initiator
of the connection to rate limit it differently depending on how
recently the time_t was reset). We currently update client_used
in three places:
- command_process_relay_cell() when we receive a relay cell for
an origin circuit.
- relay_send_command_from_edge() when we send a relay cell for
an origin circuit.
- circuit_deliver_create_cell() when we send a create cell.
We could probably remove the third case and it would still work,
but hey.

Filename: 112-bring-back-pathlencoinweight.txt
Title: Bring Back Pathlen Coin Weight
Author: Mike Perry
Created:
Status: Superseded
Superseded-By: 115
Overview:
The idea is that users should be able to choose a weight which
probabilistically chooses their path lengths to be 2 or 3 hops. This
weight will essentially be a biased coin that indicates an
additional hop (beyond 2) with probability P. The user should be
allowed to choose 0 for this weight to always get 2 hops and 1 to
always get 3.
This value should be modifiable from the controller, and should be
available from Vidalia.
Motivation:
The Tor network is slow and overloaded. Increasingly often I hear
stories about friends and friends of friends who are behind firewalls,
annoying censorware, or under surveillance that interferes with their
productivity and Internet usage, or chills their speech. These people
know about Tor, but they choose to put up with the censorship because
Tor is too slow to be usable for them. In fact, to download a fresh,
complete copy of levine-timing.pdf for the Anonymity Implications
section of this proposal over Tor took me 3 tries.
There are many ways to improve the speed problem, and of course we
should and will implement as many as we can. Johannes's GSoC project
and my reputation system are longer term, higher-effort things that
will still provide benefit independent of this proposal.
However, reducing the path length to 2 for those who do not need the
(questionable) extra anonymity 3 hops provide not only improves
their Tor experience but also reduces their load on the Tor network by
33%, and can be done in less than 10 lines of code. That's not just
Win-Win, it's Win-Win-Win.
Furthermore, when blocking resistance measures insert an extra relay
hop into the equation, 4 hops will certainly be completely unusable
for these users, especially since it will be considerably more
difficult to balance the load across a dark relay net than balancing
the load on Tor itself (which today is still not without its flaws).
Anonymity Implications:
It has long been established that timing attacks against mixed
networks are extremely effective, and that regardless of path
length, if the adversary has compromised your first and last
hop of your path, you can assume they have compromised your
identity for that connection.
In [1], it is demonstrated that for all but the slowest, lossiest
networks, error rates for false positives and false negatives were
very near zero. Only for constant streams of traffic over slow and
(more importantly) extremely lossy network links did the error rate
hit 20%. For loss rates typical to the Internet, even the error rate
for slow nodes with constant traffic streams was 13%.
When you take into account that most Tor streams are not constant,
but probably much more like their "HomeIP" dataset, which consists
mostly of web traffic that exists over finite intervals at specific
times, error rates drop to fractions of 1%, even for the "worst"
network nodes.
Therefore, the user has little benefit from the extra hop, assuming
the adversary does timing correlation on their nodes. The real
protection is the probability of getting both the first and last hop,
and this is constant whether the client chooses 2 hops, 3 hops, or 42.
Partitioning attacks form another concern. Since Tor uses telescoping
to build circuits, it is possible to tell a user is constructing only
two hop paths at the entry node. It is questionable if this data is
actually worth anything though, especially if the majority of users
have easy access to this option, and do actually choose their path
lengths semi-randomly.
Nick has postulated that exits may also be able to tell that you are
using only 2 hops by the amount of time between sending their
RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they
see from the OP. I doubt that they will be able to make much use
of this timing pattern, since it will likely vary widely depending
upon the type of node selected for that first hop, and the user's
connection rate to that first hop. It is also questionable if this
data is worth anything, especially if many users are using this
option (and I imagine many will).
Perhaps most seriously, two hop paths do allow malicious guards
to easily fail circuits if they do not extend to their colluding peers
for the exit hop. Since guards can detect the number of hops in a
path, they could always fail the 3 hop circuits and focus on
selectively failing the two hop ones until a peer was chosen.
I believe currently guards are rotated if circuits fail, which does
provide some protection, but this could be changed so that an entry
guard is completely abandoned after a certain ratio of extend or
general circuit failures with respect to non-failed circuits. This
could possibly be gamed to increase guard turnover, but such a game
would be much more noticeable than an individual guard failing circuits,
though, since it would affect all clients, not just those who chose
a particular guard.
Why not fix Pathlen=2?:
The main reason I am not advocating that we always use 2 hops is that
in some situations, timing correlation evidence by itself may not be
considered as solid and convincing as an actual, uninterrupted, fully
traced path. Are these timing attacks as effective on a real network
as they are in simulation? Would an extralegal adversary or authoritarian
government even care? In the face of these situation-dependent unknowns,
it should be up to the user to decide if this is a concern for them or not.
It should probably also be noted that even a false positive
rate of 1% for a 200k concurrent-user network could mean that for a
given node, a given stream could be confused with something like 10
users, assuming ~200 nodes carry most of the traffic (ie 1000 users
each). Though of course to really know for sure, someone needs to do
an attack on a real network, unfortunately.
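The back-of-the-envelope numbers above can be checked directly. All the
inputs here are the proposal's own rough assumptions (200k concurrent
users, ~200 busy nodes), not measured values:

```python
# Rough anonymity-set arithmetic from the paragraph above. Every input
# is an assumption taken from the text, not a measurement.
total_users = 200_000        # assumed concurrent Tor users
busy_nodes = 200             # nodes assumed to carry most of the traffic
users_per_node = total_users // busy_nodes   # ~1000 users per node
false_positive_rate = 0.01   # assumed 1% timing-attack false positive rate

# Expected number of users a given stream could be confused with:
confusable = users_per_node * false_positive_rate
print(users_per_node, confusable)   # 1000 10.0
```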
Implementation:
new_route_len() can be modified directly with a check of the
PathlenCoinWeight option (converted to percent) and a call to
crypto_rand_int(0,100) for the weighted coin.
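A minimal sketch of that weighted coin, with Python standing in for the
C change to new_route_len(). The function shape and the 0.0-1.0 weight
handling are illustrative guesses; the real implementation would use
Tor's crypto_rand_int(0,100) as described:

```python
import random

DEFAULT_ROUTE_LEN = 3

def new_route_len(pathlen_coin_weight):
    """Pick a path length of 2 or 3 hops.

    pathlen_coin_weight is the probability (0.0-1.0) of choosing the
    shorter two-hop path, mirroring the proposed PathlenCoinWeight
    option converted to a percentage.
    """
    # crypto_rand_int(0, 100) in Tor returns a uniform int in [0, 100);
    # random.randrange(100) plays the same role here.
    if random.randrange(100) < pathlen_coin_weight * 100:
        return 2
    return DEFAULT_ROUTE_LEN
```

With a weight of 0.0 the function always returns the default three-hop
length, and with 1.0 it always returns two hops.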
The entry_guard_t structure could have num_circ_failed and
num_circ_succeeded members such that if it exceeds N% circuit
extend failure rate to a second hop, it is removed from the entry list.
N should be sufficiently high to avoid churn from normal Tor circuit
failure as determined by TorFlow scans.
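The bookkeeping for that could look roughly like the following sketch.
The class mirrors the proposed entry_guard_t members in Python form; the
30% threshold and the minimum-sample guard are arbitrary placeholders
for the TorFlow-derived N, not values from the proposal:

```python
class EntryGuard:
    """Per-guard circuit-extend statistics (illustrative sketch only)."""

    def __init__(self, nickname):
        self.nickname = nickname
        self.num_circ_failed = 0
        self.num_circ_succeeded = 0

    def note_circ(self, succeeded):
        """Record the outcome of one circuit extend through this guard."""
        if succeeded:
            self.num_circ_succeeded += 1
        else:
            self.num_circ_failed += 1

    def should_drop(self, max_fail_ratio=0.30, min_samples=20):
        """True once the extend-failure ratio exceeds N% (here 30%).

        min_samples avoids expiring a guard after only a handful of
        circuits, which would otherwise amplify normal churn.
        """
        total = self.num_circ_failed + self.num_circ_succeeded
        if total < min_samples:
            return False
        return self.num_circ_failed / total > max_fail_ratio
```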
The Vidalia option should be presented as a boolean, to minimize confusion
for the user. Something like a radiobutton with:
* "I use Tor for Censorship Resistance, not Anonymity. Speed is more
important to me than Anonymity."
* "I use Tor for Anonymity. I need extra protection at the cost of speed."
and then some explanation in the help for exactly what this means, and
the risks involved with eliminating the adversary's need for timing attacks
with respect to false positives, etc.
Migration:
Phase one: Experiment with the proper ratio of circuit failures
used to expire garbage or malicious guards via TorFlow.
Phase two: Re-enable config and modify new_route_len() to add an
extra hop if coin comes up "heads".
Phase three: Make radiobutton in Vidalia, along with help entry
that explains in layman's terms the risks involved.
[1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
Filename: 113-fast-authority-interface.txt
Title: Simplifying directory authority administration
Author: Nick Mathewson
Created:
Status: Superseded
Overview
The problem:
Administering a directory authority is a pain: you need to go through
emails and manually add new nodes as "named". When bad things come up,
you need to mark nodes (or whole regions) as invalid, badexit, etc.
This means that mostly, authority admins don't: only 2/4 current authority
admins actually bind names or list bad exits, and those two have often
complained about how annoying it is to do so.
Worse, name binding is a common path, but it's a pain in the neck: nobody
has done it for a couple of months.
Digression: who knows what?
It's trivial for Tor to automatically keep track of all of the
following information about a server:
name, fingerprint, IP, last-seen time, first-seen time, declared
contact.
All we need to have the administrator set is:
- Is this name/fingerprint pair bound?
- Is this fingerprint/IP a bad exit?
- Is this fingerprint/IP an invalid node?
- Is this fingerprint/IP to be rejected?
The workflow for authority admins has two parts:
- Periodically, go through tor-ops and add new names. This doesn't
need to be done urgently.
- Less often, mark badly behaved servers as badly behaved. This is more
urgent.
Possible solution #1: Web-interface for name binding.
Deprecate use of the tor-ops mailing list; instead, have operators go to a
webform and enter their server info. This would put the information in a
standardized format, thus allowing quick, nearly-automated approval and
reply.
Possible solution #2: Self-binding names.
Peter Palfrader has proposed that names be assigned automatically to nodes
that have been up and running and valid for a while.
Possible solution #3: Self-maintaining approved-routers file
Mixminion alpha has a neat feature where whenever a new server is seen,
a stub line gets added to a configuration file. For Tor, it could look
something like this:
## First seen with this key on 2007-04-21 13:13:14
## Stayed up for at least 12 hours on IP 192.168.10.10
#RouterName AAAABBBBCCCCDDDDEFEF
(Note that the implementation needs to parse commented lines to make sure
that it doesn't add duplicates, but that's not so hard.)
To add a router as named, administrators would only need to uncomment the
entry. This automatically maintained file could be kept separately from a
manually maintained one.
This could be combined with solution #2, such that Tor would do the hard
work of uncommenting entries for routers that should get Named, but
operators could override its decisions.
Possible solution #4: A separate mailing list for authority operators.
Right now, the tor-ops list is very high volume. There should be another
list that's only for dealing with problems that need prompt action, like
marking a router as !badexit.
Resolution:
Solution #2 is described in "Proposal 123: Naming authorities
automatically create bindings", and that approach is implemented.
There are remaining issues in the problem statement above that need
their own solutions.
Filename: 114-distributed-storage.txt
Title: Distributed Storage for Tor Hidden Service Descriptors
Author: Karsten Loesing
Created: 13-May-2007
Status: Closed
Implemented-In: 0.2.0.x
Change history:
13-May-2007 Initial proposal
14-May-2007 Added changes suggested by Lasse Øverlier
30-May-2007 Changed descriptor format, key length discussion, typos
09-Jul-2007 Incorporated suggestions by Roger, added status of specification
and implementation for upcoming GSoC mid-term evaluation
11-Aug-2007 Updated implementation statuses, included non-consecutive
replication to descriptor format
20-Aug-2007 Renamed config option HSDir as HidServDirectoryV2
02-Dec-2007 Closed proposal
Overview:
The basic idea of this proposal is to distribute the tasks of storing and
serving hidden service descriptors from currently three authoritative
directory nodes among a large subset of all onion routers. The three
reasons to do this are better robustness (availability), better
scalability, and improved security properties. Further,
this proposal suggests changes to the hidden service descriptor format to
prevent new security threats coming from decentralization and to gain even
better security properties.
Status:
As of December 2007, the new hidden service descriptor format is implemented
and usable. However, servers and clients do not yet make use of descriptor
cookies, because there are open usability issues of this feature that might
be resolved in proposal 121. Further, hidden service directories do not
perform replication by themselves, because (unauthorized) replica fetch
requests would allow any attacker to fetch all hidden service descriptors in
the system. As neither issue is critical to the functioning of v2
descriptors and their distribution, this proposal is considered as Closed.
Motivation:
The current design of hidden services exhibits the following performance and
security problems:
First, the three hidden service authoritative directories constitute a
performance bottleneck in the system. The directory nodes are responsible for
storing and serving all hidden service descriptors. As of May 2007 there are
about 1000 descriptors at a time, but this number is assumed to increase in
the future. Further, there is no replication protocol for descriptors between
the three directory nodes, so that hidden services must ensure the
availability of their descriptors by manually publishing them on all
directory nodes. Whenever a fourth or fifth hidden service authoritative
directory is added, hidden services will need to maintain an equally
increasing number of replicas. These scalability issues have an impact on the
current usage of hidden services and put an even higher burden on the
development of new kinds of applications for hidden services that might
require storing even more descriptors.
Second, besides posing a limitation to scalability, storing all hidden
service descriptors on three directory nodes also constitutes a security
risk. The directory node operators could easily analyze the publish and fetch
requests to derive information on service activity and usage and read the
descriptor contents to determine which onion routers work as introduction
points for a given hidden service and need to be attacked or threatened to
shut it down. Furthermore, the contents of a hidden service descriptor offer
only minimal security properties to the hidden service. Whoever gets aware of
the service ID can easily find out whether the service is active at the
moment and which introduction points it has. This applies to (former)
clients, (former) introduction points, and of course to the directory nodes.
It requires only to request the descriptor for the given service ID, which
can be performed by anyone anonymously.
This proposal suggests two major changes to approach the described
performance and security problems:
The first change affects the storage location for hidden service descriptors.
Descriptors are distributed among a large subset of all onion routers instead
of three fixed directory nodes. Each storing node is responsible for a subset
of descriptors for a limited time only. It is not able to choose which
descriptors it stores at a certain time, because this is determined by its
onion ID which is hard to change frequently and in time (only routers which
are stable for a given time are accepted as storing nodes). In order to
resist single node failures and untrustworthy nodes, descriptors are
replicated among a certain number of storing nodes. A first replication
protocol makes sure that descriptors don't get lost when the node population
changes; therefore, a storing node periodically requests the descriptors from
its siblings. A second replication protocol distributes descriptors among
non-consecutive nodes of the ID ring to prevent a group of adversaries from
generating new onion keys until they have consecutive IDs to create a 'black
hole' in the ring and make random services unavailable. Connections to
storing nodes are established by extending existing circuits by one hop to
the storing node. This also ensures that contents are encrypted. The effect
of this first change is that the probability that a single node operator
learns about a certain hidden service is very small and that it is very hard
to track a service over time, even when it collaborates with other node
operators.
The second change concerns the content of hidden service descriptors.
Obviously, security problems cannot be solved only by decentralizing storage;
in fact, they could also get worse if done without caution. At first, a
descriptor ID needs to change periodically in order to be stored on changing
nodes over time. Next, the descriptor ID needs to be computable only for the
service's clients, but should be unpredictable for all other nodes. Further,
the storing node needs to be able to verify that the hidden service is the
true originator of the descriptor with the given ID even though it is not a
client. Finally, a storing node should learn as little information as
necessary by storing a descriptor, because it might not be as trustworthy as
a directory node; for example it does not need to know the list of
introduction points. Therefore, a second key is applied that is only known to
the hidden service provider and its clients and that is not included in the
descriptor. It is used to calculate descriptor IDs and to encrypt the
introduction points. This second key can either be given to all clients
together with the hidden service ID, or to a group or a single client as
an authentication token. In the future this second key could be the result of
some key agreement protocol between the hidden service and one or more
clients. A new text-based format is proposed for descriptors instead of an
extension of the existing binary format for reasons of future extensibility.
Design:
The proposed design is described by the required changes to the current
design. These requirements are grouped by content, rather than by affected
specification documents or code files, and numbered for reference below.
Hidden service clients, servers, and directories:
/1/ Create routing list
All participants can filter the consensus status document received from the
directory authorities to one routing list containing only those servers
that store and serve hidden service descriptors and which are running for
at least 24 hours. A participant only trusts its own routing list and never
learns about routing information from other parties.
/2/ Determine responsible hidden service directory
All participants can determine the hidden service directory that is
responsible for storing and serving a given ID, as well as the hidden
service directories that replicate its content. Every hidden service
directory is responsible for the descriptor IDs in the interval from
its predecessor, exclusive, to its own ID, inclusive. Further, a hidden
service directory holds replicas for its n predecessors, where n denotes
the number of consecutive replicas. (requires /1/)
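The interval rule in /2/ can be sketched on a sorted circular identifier
ring. This is a hypothetical helper, not Tor code: real Tor compares
160-bit identifiers, and small integers stand in for them here. Since
each directory holds replicas of its predecessors' intervals, a
descriptor's copies land on the responsible node and its successors:

```python
import bisect

def responsible_directories(desc_id, ring_ids, n_replicas):
    """Return the directory responsible for desc_id plus the successors
    that hold its replicas, on a circular ID ring.

    Each directory is responsible for the interval from its predecessor
    (exclusive) up to its own ID (inclusive), so the responsible node is
    the first ring member with ID >= desc_id, wrapping past the highest
    ID back to the lowest.
    """
    ring = sorted(ring_ids)
    i = bisect.bisect_left(ring, desc_id) % len(ring)
    return [ring[(i + k) % len(ring)] for k in range(n_replicas)]
```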
[/3/ and /4/ were requirements to use BEGIN_DIR cells for directory
requests which have not been fulfilled in the course of the implementation
of this proposal, but elsewhere.]
Hidden service directory nodes:
/5/ Advertise hidden service directory functionality
Every onion router that has its directory port open can decide whether it
wants to store and serve hidden service descriptors by setting a new config
option "HidServDirectoryV2" 0|1 to 1. An onion router with this config
option being set includes the flag "hidden-service-dir" in its router
descriptors that it sends to directory authorities.
/6/ Accept v2 publish requests, parse and store v2 descriptors
Hidden service directory nodes accept publish requests for hidden service
descriptors and store them to their local memory. (It is not necessary to
make descriptors persistent, because after disconnecting, the onion router
would not be accepted as storing node anyway, because it has not been
running for at least 24 hours.) All requests and replies are formatted as
HTTP messages. Requests are directed to the router's directory port and are
contained within BEGIN_DIR cells. A hidden service directory node stores a
descriptor only when it thinks that it is responsible for storing that
descriptor based on its own routing table. Every hidden service directory
node is responsible for the descriptor IDs in the interval of its n-th
predecessor in the ID circle up to its own ID (n denotes the number of
consecutive replicas). (requires /1/)
/7/ Accept v2 fetch requests
Same as /6/, but with fetch requests for hidden service descriptors.
(requires /2/)
/8/ Replicate descriptors with neighbors
A hidden service directory node replicates descriptors from its two
predecessors by downloading them once an hour. Further, it checks its
routing table periodically for changes. Whenever it realizes that a
predecessor has left the network, it establishes a connection to the new
n-th predecessor and requests its stored descriptors in the interval of its
(n+1)-th predecessor and the requested n-th predecessor. Whenever it
realizes that a new onion router has joined with an ID higher than its
former n-th predecessor, it adds it to its predecessors and discards all
descriptors in the interval of its (n+1)-th and its n-th predecessor.
(requires /1/)
[Dec 02: This function has not been implemented, because arbitrary nodes
would have been able to download the entire set of v2 descriptors. An
authorized replication request would be necessary. For the moment, the
system runs without any directory-side replication. -KL]
Authoritative directory nodes:
/9/ Confirm a router's hidden service directory functionality
Directory nodes include a new flag "HSDir" for routers that decided to
provide storage for hidden service descriptors and that are running for at
least 24 hours. The last requirement prevents a node from frequently
changing its onion key to become responsible for an identifier it wants to
target.
Hidden service provider:
/10/ Configure v2 hidden service
Each hidden service provider that has set the config option
"PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2
descriptors and conform to the v2 connection establishment protocol. When
configuring a hidden service, a hidden service provider checks if it has
already created a random secret_cookie and a hostname2 file; if not, it
creates both of them. (requires /2/)
/11/ Establish introduction points with fresh key
If configured to publish only v2 descriptors and no v0/v1 descriptors any
more, a hidden service provider that is setting up the hidden service at
introduction points does not pass its own public key, but the public key
of a freshly generated key pair. It also includes these fresh public keys
in the hidden service descriptor together with the other introduction point
information. The reason is that the introduction point does not need to and
therefore should not know for which hidden service it works, so as to
prevent it from tracking the hidden service's activity. (If a hidden
service provider supports both v0/v1 and v2 descriptors, v0/v1 clients
rely on the fact that all introduction points accept the same public key,
so that this new feature cannot be used.)
/12/ Encode v2 descriptors and send v2 publish requests
If configured to publish v2 descriptors, a hidden service provider
publishes a new descriptor whenever its content changes or a new
publication period starts for this descriptor. If the current publication
period would only last for less than 60 minutes (= 2 x 30 minutes to allow
the server to be 30 minutes behind and the client 30 minutes ahead), the
hidden service provider publishes both a current descriptor and one for
the next period. Publication is performed by sending the descriptor to all
hidden service directories that are responsible for keeping replicas for
the descriptor ID. This includes two non-consecutive replicas that are
stored at 3 consecutive nodes each. (requires /1/ and /2/)
Hidden service client:
/13/ Send v2 fetch requests
A hidden service client that has set the config option
"FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion
addresses by requesting a v2 descriptor from a randomly chosen hidden
service directory that is responsible for keeping replica for the
descriptor ID. In total there are six replicas of which the first and the
last three are stored on consecutive nodes. The probability of picking one
of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the
fact that the availability will be the highest on the node with next higher
ID. A hidden service client relies on the hidden service provider to store
two sets of descriptors to compensate clock skew between service and
client. (requires /1/ and /2/)
/14/ Process v2 fetch reply and parse v2 descriptors
A hidden service client that has sent a request for a v2 descriptor can
parse it and store it to the local cache of rendezvous service descriptors.
/15/ Establish connection to v2 hidden service
A hidden service client can establish a connection to a hidden service
using a v2 descriptor. This includes using the secret cookie for decrypting
the introduction points contained in the descriptor. When contacting an
introduction point, the client does not use the public key of the hidden
service provider, but the freshly-generated public key that is included in
the hidden service descriptor. Whether or not a fresh key is used instead
of the key of the hidden service depends on the available protocol versions
that are included in the descriptor; by this, connection establishment is
to a certain extent decoupled from fetching the descriptor.
Hidden service descriptor:
(Requirements concerning the descriptor format are contained in /6/ and /7/.)
The new v2 hidden service descriptor format looks like this:
onion-address = h(public-key) + cookie
descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
descriptor-content = {
descriptor-id,
version,
public-key,
h(time-period + cookie + replica),
timestamp,
protocol-versions,
{ introduction-points } encrypted with cookie
} signed with private-key
The "descriptor-id" needs to change periodically in order for the
descriptor to be stored on changing nodes over time. It may only be
computable by a hidden service provider and all of his clients to prevent
unauthorized nodes from tracking the service activity by periodically
checking whether there is a descriptor for this service. Finally, the
hidden service directory needs to be able to verify that the hidden service
provider is the true originator of the descriptor with the given ID.
Therefore, "descriptor-id" is derived from the "public-key" of the hidden
service provider, the current "time-period" which changes every 24 hours,
a secret "cookie" shared between hidden service provider and clients, and
a "replica" denoting the number of this non-consecutive replica. (The
"time-period" is constructed in a way that time periods do not change at
the same moment for all descriptors by deriving a value between 0:00 and
23:59 hours from h(public-key) and making the descriptors of this hidden
service provider expire at that time of the day.) The "descriptor-id" is
defined to be 160 bits long. [extending the "descriptor-id" length
suggested by LØ]
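The derivation above reads, in sketch form, as follows. SHA-1 supplies
the 160-bit h() used throughout the proposal; the concrete byte
encodings of time-period and replica here are illustrative choices, not
the wire format the specification later fixes:

```python
import hashlib

def h(data: bytes) -> bytes:
    """160-bit hash, as used throughout the proposal (SHA-1)."""
    return hashlib.sha1(data).digest()

def descriptor_id(public_key: bytes, time_period: int,
                  cookie: bytes, replica: int) -> bytes:
    """descriptor-id = h(h(public-key) + h(time-period + cookie + replica)).

    Only parties knowing the secret cookie can compute the inner hash,
    so only the provider and its clients can predict future IDs.
    """
    secret_part = h(time_period.to_bytes(4, "big") + cookie +
                    replica.to_bytes(1, "big"))
    return h(h(public_key) + secret_part)
```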
Only the hidden service provider and the clients are able to generate
future "descriptor-ID"s. Hence, the "onion-address" is extended from
being just the hash value of "public-key" to that hash plus the secret
"cookie". The hash of the "public-key" is truncated to 80 bits, whereas
the "cookie" is dimensioned to be 120 bits long. This makes a total of
200 bits or 40 base32 chars, which is
quite a lot to handle for a human, but necessary to provide sufficient
protection against an adversary from generating a key pair with same
"public-key" hash or guessing the "cookie".
A hidden service directory can verify that a descriptor was created by the
hidden service provider by checking if the "descriptor-id" corresponds to
the "public-key" and if the signature can be verified with the
"public-key".
The "introduction-points" that are included in the descriptor are encrypted
using the same "cookie" that is shared between hidden service provider and
clients. [correction to use another key than h(time-period + cookie) as
encryption key for introduction points made by LØ]
A new text-based format is proposed for descriptors instead of an extension
of the existing binary format for reasons of future extensibility.
Security implications:
The security implications of the proposed changes are grouped by the roles of
nodes that could perform attacks or on which attacks could be performed.
Attacks by authoritative directory nodes
Authoritative directory nodes are no longer the single places in the
network that know about a hidden service's activity and introduction
points. Thus, they cannot perform attacks using this information, e.g.
track a hidden service's activity or usage pattern or attack its
introduction points. Formerly, it would only require a single corrupted
authoritative directory operator to perform such an attack.
Attacks by hidden service directory nodes
A hidden service directory node could misuse a stored descriptor to track a
hidden service's activity and usage pattern by clients. Though there is no
countermeasure against this kind of attack, it is very expensive to track a
certain hidden service over time. An attacker would need to run a large
number of stable onion routers that work as hidden service directory nodes
to have a good probability to become responsible for its changing
descriptor IDs. For each period, the probability is:
1 - (N-c choose r)/(N choose r) for N-c >= r, and 1 otherwise,
with N as the total number of hidden service directories, c as the
number of compromised nodes, and r as the number of replicas.
The hidden service directory nodes could try to make a certain hidden
service unavailable to its clients. Therefore, they could discard all
stored descriptors for that hidden service and reply to clients that there
is no descriptor for the given ID or return an old or false descriptor
content. The client would detect a false descriptor, because it could not
contain a correct signature. But an old content or an empty reply could
confuse the client. Therefore, the countermeasure is to replicate
descriptors among a small number of hidden service directories, e.g. 5.
The probability of a group of collaborating nodes to make a hidden service
completely unavailable is in each period:
(c choose r)/(N choose r) for c >= r and N >= r, and 0 otherwise,
with N as the total number of hidden service directories, c as the
number of compromised nodes, and r as the number of replicas.
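Both per-period probabilities above can be evaluated directly with
binomial coefficients. The functions below transcribe the two formulas;
any concrete N, c, r values plugged in are made-up examples, since the
proposal fixes none:

```python
from math import comb

def p_track_one_period(N, c, r):
    """Probability that at least one of the r replicas lands on a
    compromised directory: 1 - C(N-c, r)/C(N, r), or 1 if N-c < r."""
    if N - c < r:
        return 1.0
    return 1 - comb(N - c, r) / comb(N, r)

def p_block_one_period(N, c, r):
    """Probability that ALL r replicas land on compromised
    directories: C(c, r)/C(N, r), or 0 if c < r or N < r."""
    if c < r or N < r:
        return 0.0
    return comb(c, r) / comb(N, r)
```

As the formulas suggest, even a modest replica count makes complete
blocking far less likely than occasional tracking.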
A hidden service directory could try to find out which introduction points
are working on behalf of a hidden service. In contrast to the previous
design, this is not possible anymore, because this information is encrypted
to the clients of a hidden service.
Attacks on hidden service directory nodes
An anonymous attacker could try to swamp a hidden service directory with
false descriptors for a given descriptor ID. This is prevented by requiring
that descriptors are signed.
Anonymous attackers could swamp a hidden service directory with correct
descriptors for non-existing hidden services. There is no countermeasure
against this attack. However, the creation of valid descriptors is more
expensive than verification and storage in local memory. This should make
this kind of attack unattractive.
Attacks by introduction points
Current or former introduction points could try to gain information on the
hidden service they serve. But due to the fresh key pair that is used by
the hidden service, this attack is not possible anymore.
Attacks by clients
Current or former clients could track a hidden service's activity, attack
its introduction points, or determine the responsible hidden service
directory nodes and attack them. There is nothing that could prevent them
from doing so, because honest clients need the full descriptor content to
establish a connection to the hidden service. At the moment, the only
countermeasure against dishonest clients is to change the secret cookie and
pass it only to the honest clients.
Compatibility:
The proposed design is meant to replace the current design for hidden service
descriptors and their storage in the long run.
There should be a first transition phase in which both the current design
and the proposed design are served in parallel. Onion routers should start
serving as hidden service directories, and hidden service providers and
clients should make use of the new design if both sides support it. Hidden
service providers should be allowed to publish descriptors of the current
format in parallel, and authoritative directories should continue storing and
serving these descriptors.
After the first transition phase, hidden service providers should stop
publishing descriptors on authoritative directories, and hidden service
clients should not try to fetch descriptors from the authoritative
directories. However, the authoritative directories should continue serving
hidden service descriptors for a second transition phase. As of this point,
all v2 config options should be set to a default value of 1.
After the second transition phase, the authoritative directories should stop
serving hidden service descriptors.
Filename: 115-two-hop-paths.txt
Title: Two Hop Paths
Author: Mike Perry
Created:
Status: Dead
Supersedes: 112
Overview:
The idea is that users should be able to choose if they would like
to have either two or three hop paths through the tor network.
Let us be clear: the users who would choose this option should be
those that are concerned with IP obfuscation only: ie they would not be
targets of a resource-intensive multi-node attack. It is sometimes said
that these users should find some other network to use other than Tor.
This is a foolish suggestion: more users improves security of everyone,
and the current small userbase size is a critical hindrance to
anonymity, as is discussed below and in [1].
This value should be modifiable from the controller, and should be
available from Vidalia.
Motivation:
The Tor network is slow and overloaded. Increasingly often I hear
stories about friends and friends of friends who are behind firewalls,
annoying censorware, or under surveillance that interferes with their
productivity and Internet usage, or chills their speech. These people
know about Tor, but they choose to put up with the censorship because
Tor is too slow to be usable for them. In fact, to download a fresh,
complete copy of levine-timing.pdf for the Theoretical Argument
section of this proposal over Tor took me 3 tries.
Furthermore, the biggest current problem with Tor's anonymity for
those who really need it is not someone attacking the network to
discover who they are. It's instead the extreme danger that so few
people use Tor because it's so slow, that those who do use it have
essentially no confusion set.
The recent case where the professor and the rogue Tor user were the
only Tor users on campus, and thus suspected in an incident involving
Tor and that University underscores this point: "That was why the police
had come to see me. They told me that only two people on our campus were
using Tor: me and someone they suspected of engaging in an online scam.
The detectives wanted to know whether the other user was a former
student of mine, and why I was using Tor"[1].
Not only does Tor provide no anonymity if you use it to be anonymous
but are obviously from a certain institution, location or circumstance,
it is also dangerous to use Tor because of the risk of being accused of
having something significant enough to hide that you are willing to put
up with the horrible performance as opposed to using some weaker
alternative.
There are many ways to improve the speed problem, and of course we
should and will implement as many as we can. Johannes's GSoC project
and my reputation system are longer term, higher-effort things that
will still provide benefit independent of this proposal.
However, reducing the path length to 2 for those who do not need the
extra anonymity 3 hops provide not only improves their Tor experience
but also reduces their load on the Tor network by 33%, and should
increase adoption of Tor by a good deal. That's not just Win-Win, it's
Win-Win-Win.
Who will enable this option?
This is the crux of the proposal. Admittedly, there is some anonymity
loss and some degree of decreased investment required on the part of
the adversary to attack 2 hop users versus 3 hop users, even if it is
minimal and limited mostly to up-front costs and false positives.
The key questions are:
1. Are these users in a class such that their risk is significantly
less than the amount of this anonymity loss?
2. Are these users able to identify themselves?
Many many users of Tor are not at risk for an adversary capturing c/n
nodes of the network just to see what they do. These users use Tor to
circumvent aggressive content filters, or simply to keep their IP out of
marketing and search engine databases. Most content filters have no
interest in running Tor nodes to catch violators, and marketers
certainly would never consider such a thing, both on a cost basis and a
legal one.
In a sense, this represents an alternate threat model against these
users who are not at risk for Tor's normal threat model.
It should be evident to these users that they fall into this class. All
that should be needed is a radio button
* "I use Tor for local content filter circumvention and/or IP obfuscation,
not anonymity. Speed is more important to me than high anonymity.
No one will make considerable efforts to determine my real IP."
* "I use Tor for anonymity and/or to circumvent national-level, legally
enforced censorship. It is possible effort will be taken to identify
me, including but not limited to network surveillance. I need more
protection."
and then some explanation in the help for exactly what this means, and
the risks involved with eliminating the adversary's need for timing
attacks with respect to false positives. Ultimately, the decision is a
simple one that can be made without this information, however. The user
does not need Paul Syverson to instruct them on the deep magic of Onion
Routing to make this decision. They just need to know why they use Tor.
If they use it just to stay out of marketing databases and/or bypass a
local content filter, two hops is plenty. This is likely the vast
majority of Tor users, and many non-users we would like to bring on
board.
So, having established this class of users, let us now go on to
examine theoretical and practical risks we place them at, and determine
if these risks violate the users' needs, or introduce additional risk
to node operators who may be subject to requests from law enforcement
to track users who need 3 hops, but use 2 because they enjoy the
thrill of Russian roulette.
Theoretical Argument:
It has long been established that timing attacks against mixed
and onion networks are extremely effective, and that regardless
of path length, if the adversary has compromised your first and
last hop of your path, you can assume they have compromised your
identity for that connection.
In fact, it was demonstrated that for all but the slowest, lossiest
networks, error rates for false positives and false negatives were
very near zero[2]. Only for constant streams of traffic over slow and
(more importantly) extremely lossy network links did the error rate
hit 20%. For loss rates typical to the Internet, even the error rate
for slow nodes with constant traffic streams was 13%.
When you take into account that most Tor streams are not constant,
but probably much more like their "HomeIP" dataset, which consists
mostly of web traffic that exists over finite intervals at specific
times, error rates drop to fractions of 1%, even for the "worst"
network nodes.
Therefore, the user has little benefit from the extra hop, assuming
the adversary does timing correlation on their nodes. Since timing
correlation is simply an implementation issue and is most likely
a single up-front cost (and one that is likely quite a bit cheaper
than the cost of the machines purchased to host the nodes to mount
an attack), the real protection is the low probability of getting
both the first and last hop of a client's stream.
Practical Issues:
Theoretical issues aside, there are several practical issues with the
implementation of Tor that need to be addressed to ensure that
identity information is not leaked by the implementation.
Exit policy issues:
If a client chooses an exit with a very restrictive exit policy
(such as an IP or IP range), the first hop then knows a good deal
about the destination. For this reason, clients should not select
exits that match their destination IP with anything other than "*".
Partitioning:
Partitioning attacks form another concern. Since Tor uses telescoping
to build circuits, it is possible to tell a user is constructing only
two hop paths at the entry node and on the local network. An external
adversary can potentially differentiate 2 and 3 hop users, and decide
that all IP addresses connecting to Tor and using 3 hops have something
to hide, and should be scrutinized more closely or outright apprehended.
One solution to this is to use the "leaky-circuit" method of attaching
streams: The user always creates 3-hop circuits, but if the option
is enabled, they always exit from their 2nd hop. The ideal solution
would be to create a RELAY_SHISHKABOB cell which contains onion
skins for every host along the path, but this requires protocol
changes at the nodes to support.
Guard nodes:
Since guard nodes can rotate due to client relocation, network
failure, node upgrades and other issues, if you amortize the risk a
mobile, dialup, or otherwise intermittently connected user is exposed to
over any reasonable duration of Tor usage (on the order of a year), it
is the same with or without guard nodes. Assuming an adversary has
c%/n% of network bandwidth, and guards rotate on average with period R,
statistically speaking, it's merely a question of if the user wishes
their risk to be concentrated with probability c/n over an expected
period of R*c, and probability 0 over an expected period of R*(n-c),
versus a continuous risk of (c/n)^2. So statistically speaking, guards
only create a time-tradeoff of risk over the long run for normal Tor
usage. Rotating guards do not reduce risk for normal client usage long
term.[3]
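The time-tradeoff claim can be checked with a little arithmetic. The
numbers below are purely illustrative (c and n are assumed, not
measured), and the model is the simplest possible one: each circuit is
compromised when both its entry and its exit are the adversary's.

```python
from fractions import Fraction

# Assumed illustrative numbers: adversary runs c of n equal-weight nodes.
c, n = 10, 100
p = Fraction(c, n)

# Without guards, every circuit independently risks a compromised entry
# and a compromised exit: per-circuit risk is (c/n)^2.
risk_no_guard = p * p

# With guards: with probability c/n the chosen guard is compromised, and
# each circuit built through it is then exposed with probability c/n
# (the exit); with probability 1 - c/n the risk is zero for the whole
# guard period.  Averaged over guard choices, the long-run per-circuit
# risk is the same -- only its distribution over time changes.
risk_with_guard = p * p + (1 - p) * 0

assert risk_no_guard == risk_with_guard == Fraction(1, 100)
```

The equality is exactly the point: guards concentrate risk in time
rather than reducing it.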
On the other hand, assuming a more stable method of guard selection
and preservation is devised, or a more stable client side network than
my own is typical (which rotates guards frequently due to network issues
and moving about), guard nodes provide a tradeoff in the form of c/n% of
the users being "sacrificial users" who are exposed to high risk O(c/n)
of identification, while the rest of the network is exposed to zero
risk.
The nature of Tor makes it likely an adversary will take a "shock and
awe" approach to suppressing Tor by rounding up a few users whose
browsing activity has been observed to be made into examples, in an
attempt to prove that Tor is not perfect.
Since this "shock and awe" attack can be applied with or without guard
nodes, stable guard nodes do offer a measure of accountability of sorts.
If a user was using a small set of guard nodes and knows them well, and
then is suddenly apprehended as a result of Tor usage, having a fixed
set of entry points to suspect is a lot better than suspecting the whole
network. Conversely, it can also give non-apprehended users comfort
that they are likely to remain safe indefinitely with their set of (now
presumably trusted) guards. This is probably the most beneficial
property of reliable guards: they deter the adversary from mounting
"shock and awe" attacks because the surviving users will not
be intimidated, but instead made more confident. Of course, guards need to
be made much more stable and users need to be encouraged to know their
guards for this property to really take effect.
This beneficial property of client vigilance also carries over to an
active adversary, except in this case instead of relying on the user
to remember their guard nodes and somehow communicate them after
apprehension, the code can alert them to the presence of an active
adversary before they are apprehended. But only if they use guard nodes.
So let's consider the active adversary: Two hop paths allow malicious
guards to get considerably more benefit from failing circuits if they do
not extend to their colluding peers for the exit hop. Since guards can
detect the number of hops in a path via either timing or statistical
analysis of the exit policy of the 2nd hop, they can perform this attack
predominantly against 2 hop users.
This can be addressed by completely abandoning an entry guard after a
certain ratio of extend or general circuit failures with respect to
non-failed circuits. The proper value for this ratio can be determined
experimentally with TorFlow. There is the possibility that the local
network can abuse this feature to cause certain guards to be dropped,
but they can do that anyways with the current Tor by just making guards
they don't like unreachable. With this mechanism, Tor will complain
loudly if any guard failure rate exceeds the expected in any failure
case, local or remote.
Eliminating guards entirely would actually not address this issue due
to the time-tradeoff nature of risk. In fact, it would just make it
worse. Without guard nodes, it becomes much more difficult for clients
to become alerted to Tor entry points that are failing circuits to make
sure that they only devote bandwidth to carry traffic for streams which
they observe both ends. Yet the rogue entry points are still able to
significantly increase their success rates by failing circuits.
For this reason, guard nodes should remain enabled for 2 hop users,
at least until an IP-independent, undetectable guard scanner can
be created. TorFlow can scan for failing guards, but after a while,
its unique behavior gives away the fact that its IP is a scanner and
it can be given selective service.
Consideration of risks for node operators:
There is a serious risk for two hop users in the form of guard
profiling. If an adversary running an exit node notices that a
particular site is always visited from a fixed previous hop, it is
likely that this is a two hop user using a certain guard, which could be
monitored to determine their identity. Thus, for the protection of both
2 hop users and node operators, 2 hop users should limit their guard
duration to a sufficient number of days to verify reliability of a node,
but not much more. This duration can be determined experimentally by
TorFlow.
Considering a Tor client builds on average 144 circuits/day (10
minutes per circuit), if the adversary owns c/n% of exits on the
network, they can expect to see 144*c/n circuits from this user, or
about 14 minutes of usage per day per percentage of network penetration.
Since it will take several occurrences of user-linkable exit content
from the same predecessor hop for the adversary to have any confidence
this is a 2 hop user, it is very unlikely that any sort of demands made
upon the predecessor node would be guaranteed to be effective (i.e. it
actually was a guard), let alone be executed in time to apprehend the
user before they rotated guards.
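The arithmetic above works out as follows (the adversary's exit share
is an assumed parameter; integer math keeps the estimate honest):

```python
# A client at ~10 minutes per circuit builds about 144 circuits a day.
circuits_per_day = 24 * 60 // 10

# Assumed: the adversary runs 1% of exit capacity.
exit_share_pct = 1

# Minutes of this user's traffic the adversary's exits expect to carry
# per day, per percentage point of network penetration.
observed_minutes = circuits_per_day * 10 * exit_share_pct // 100

assert circuits_per_day == 144
assert observed_minutes == 14
```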
The reverse risk also warrants consideration. If a malicious guard has
orders to surveil Mike Perry, it can determine Mike Perry is using two
hops by observing his tendency to choose a 2nd hop with a viable exit
policy. This can be done relatively quickly, unfortunately, and
indicates Mike Perry should spend some of his time building real 3 hop
circuits through the same guards, to require them to at least wait for
him to actually use Tor to determine his style of operation, rather than
collect this information from his passive building patterns.
However, to actively determine where Mike Perry is going, the guard
will need to require logging ahead of time at multiple exit nodes that
he may use over the course of the few days while he is at that guard,
and correlate the usage times of the exit node with Mike Perry's
activity at that guard for the few days he uses it. At this point, the
adversary is mounting a scale and method of attack (widespread logging,
timing attacks) that works pretty much just as effectively against 3
hops, so exit node operators are exposed to no additional danger than
they otherwise normally are.
Why not fix Pathlen=2?:
The main reason I am not advocating that we always use 2 hops is that
in some situations, timing correlation evidence by itself may not be
considered as solid and convincing as an actual, uninterrupted, fully
traced path. Are these timing attacks as effective on a real network as
they are in simulation? Maybe the circuit multiplexing of Tor can serve
to frustrate them to a degree? Would an extralegal adversary or
authoritarian government even care? In the face of these
situation-dependent unknowns, it should be up to the user to decide if this is
a concern for them or not.
It should probably also be noted that even a false positive
rate of 1% for a 200k concurrent-user network could mean that for a
given node, a given stream could be confused with something like 10
users, assuming ~200 nodes carry most of the traffic (i.e. 1000 users
each). Though of course to really know for sure, someone needs to do
an attack on a real network, unfortunately.
Additionally, at some point cover traffic schemes may be implemented to
frustrate timing attacks on the first hop. It is possible some expert
users may do this ad-hoc already, and may wish to continue using 3 hops
for this reason.
Implementation:
new_route_len() can be modified directly with a check of the
Pathlen option. However, circuit construction logic should be
altered so that both 2 hop and 3 hop users build the same types of
circuits, and the option should ultimately govern circuit selection,
not construction. This improves coverage against guard nodes being
able to passively profile users who aren't even using Tor.
PathlenCoinWeight, anyone? :)
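As a sketch of the selection logic (Python pseudocode for what would be
a C change; the function name, its arguments, and the coin semantics
are illustrative assumptions, not Tor's API):

```python
import random

def choose_path_len(pathlen_opt=None, coin_weight=0.0, rng=random.random):
    """Pick 2 or 3 hops for the next circuit.

    pathlen_opt mirrors a hypothetical explicit Pathlen option;
    coin_weight mirrors the suggested PathlenCoinWeight: the probability
    of building a 2-hop circuit when no explicit length is configured,
    so that both classes of user build both kinds of circuit and cannot
    be passively profiled by construction patterns alone.
    """
    if pathlen_opt in (2, 3):
        return pathlen_opt
    return 2 if rng() < coin_weight else 3
```

Selection (which circuit a stream is attached to) rather than
construction would then be governed by the option, as the text argues.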
The exit policy hack is a bit more tricky. compare_addr_to_addr_policy
needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or
ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in
circuit_is_acceptable.
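The wildcard/specific distinction might be modeled like this (an
illustrative Python model, not the C code; the (action, pattern) policy
representation is an assumption):

```python
WILDCARD, SPECIFIC, REJECTED = "wildcard", "specific", "rejected"

def match_exit_policy(policy, dest_ip):
    """First-match evaluation of an ordered (action, pattern) policy.

    Returns whether an accepting rule matched via "*" or via a specific
    address, mirroring the proposed ADDR_POLICY_ACCEPTED_WILDCARD /
    ADDR_POLICY_ACCEPTED_SPECIFIC split.
    """
    for action, pattern in policy:
        if pattern == "*" or pattern == dest_ip:
            if action != "accept":
                return REJECTED
            return WILDCARD if pattern == "*" else SPECIFIC
    return REJECTED
```

A 2-hop client would then refuse exits whose policy matched its
destination as SPECIFIC, since such a match tells the first hop too
much about the destination.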
The leaky exit is trickier still. handle_control_attachstream
does allow paths to exit at a given hop. Presumably something similar
can be done in connection_ap_handshake_process_socks, and elsewhere?
Circuit construction would also have to be performed such that the
2nd hop's exit policy is what is considered, not the 3rd's.
The entry_guard_t structure could have num_circ_failed and
num_circ_succeeded members such that if it exceeds F% circuit
extend failure rate to a second hop, it is removed from the entry list.
F should be sufficiently high to avoid churn from normal Tor circuit
failure as determined by TorFlow scans.
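A sketch of that bookkeeping (a Python model of the proposed
entry_guard_t fields; the threshold standing in for F and the minimum
sample count are placeholders to be tuned experimentally by TorFlow):

```python
class EntryGuard:
    def __init__(self, max_fail_ratio=0.3, min_samples=20):
        # Both numbers are assumptions; max_fail_ratio plays the role of F.
        self.num_circ_failed = 0
        self.num_circ_succeeded = 0
        self.max_fail_ratio = max_fail_ratio
        self.min_samples = min_samples

    def record_extend(self, succeeded):
        """Count the outcome of one circuit extend through this guard."""
        if succeeded:
            self.num_circ_succeeded += 1
        else:
            self.num_circ_failed += 1

    def should_remove(self):
        """True once the observed failure ratio exceeds the threshold."""
        total = self.num_circ_failed + self.num_circ_succeeded
        if total < self.min_samples:
            return False  # not enough data to judge this guard yet
        return self.num_circ_failed / total > self.max_fail_ratio
```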
The Vidalia option should be presented as a radio button.
Migration:
Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky
circuit ability, and 2-3 hop circuit selection logic governed by
Pathlen.
Phase 2: Experiment to determine the proper ratio of circuit
failures used to expire garbage or malicious guards via TorFlow
(pending Bug #440 backport+adoption).
Phase 3: Implement guard expiration code to kick off failure-prone
guards and warn the user. Cap 2 hop guard duration to a proper number
of days determined sufficient to establish guard reliability (to be
determined by TorFlow).
Phase 4: Make radiobutton in Vidalia, along with help entry
that explains in layman's terms the risks involved.
Phase 5: Allow user to specify path length by HTTP URL suffix.
[1] http://p2pnet.net/story/11279
[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
[3] Proof available upon request ;)
Filename: 116-two-hop-paths-from-guard.txt
Title: Two hop paths from entry guards
Author: Michael Lieberman
Created: 26-Jun-2007
Status: Dead
This proposal is related to (but different from) Mike Perry's proposal 115
"Two Hop Paths."
Overview:
Volunteers who run entry guards should have the option of using only 2
additional tor nodes when constructing their own tor circuits.
While the option of two hop paths should perhaps be extended to every client
(as discussed in Mike Perry's thread), I believe the anonymity properties of
two hop paths are particularly well-suited to client computers that are also
serving as entry guards.
First I will describe the details of the strategy, as well as possible
avenues of attack. Then I will list advantages and disadvantages. Then, I
will discuss some possibly safer variations of the strategy, and finally
some implementation issues.
Details:
Suppose Alice is an entry guard, and wants to construct a two hop circuit.
Alice chooses a middle node at random (not using the entry guard strategy),
and gains anonymity by having her traffic look just like traffic from
someone else using her as an entry guard.
Can Alice's middle node figure out that she is initiator of the traffic? I
can think of four possible approaches for distinguishing traffic from Alice
with traffic through Alice:
1) Notice that communication from Alice comes too fast: Experimentation is
needed to determine if traffic from Alice can be distinguished from traffic
from a computer with a decent link to Alice.
2) Monitor Alice's network traffic to discover the lack of incoming packets
at the appropriate times. If an adversary has this ability, then Alice
already has problems in the current system, because the adversary can run a
standard timing attack on Alice's traffic.
3) Notice that traffic from Alice is unique in some way such that if Alice
was just one of 3 entry guards for this traffic, then the traffic should be
coming from two other entry guards as well. An example of "unique traffic"
could be always sending 117 packets every 3 minutes to an exit node that
exits to port 4661. However, if such patterns existed with sufficient
precision, then it seems to me that Tor already has a problem. (This "unique
traffic" may not be a problem if clients often end up choosing a single
entry guard because their other two are down. Does anyone know if this is
the case?)
4) First, control the middle node *and* some other part of the traffic,
using standard attacks on a two hop circuit without entry nodes (my recent
paper on Browser-Based Attacks would work well for this
http://petworkshop.org/2007/papers/PET2007_preproc_Browser_based.pdf). With
control of the circuit, we can now cause "unique traffic" as in 3).
Alternatively, if we know something about Alice independently, and we can
see what websites are being visited, we might be able to guess that she is
the kind of person that would visit those websites.
Anonymity Advantages:
-Alice never has the problem of choosing a malicious entry guard. In some
sense, Alice acts as her own entry guard.
Anonymity Disadvantages:
-If Alice's traffic is identified as originating from herself (see above for
how hard that might be), then she has the anonymity of a 2 hop circuit
without entry guards.
Additional advantages:
-A discussion of the latency advantages of two hop circuits is going on in
Mike Perry's thread already.
-Also, we can advertise this change as "Run an entry guard and decrease your
own Tor latency." This incentive has the potential to add nodes to the
network, improving the network as a whole.
Safer variations:
To solve the "unique traffic" problem, Alice could use two hop paths only
1/3 of the time, and choose 2 other entry guards for the other 2/3 of the
time. All the advantages are now 1/3 as useful (possibly more, if the other
2 entry guards are not always up).
To solve the problem that Alice's responses are too fast, Alice could delay
her responses (ideally based on some real data of response time when Alice
is used as an entry guard). This loses most of the speed advantages of the two
hop path, but if Alice is a fast entry guard, it doesn't lose everything. It
also still has the (arguable) anonymity advantage that Alice doesn't have to
worry about having a malicious entry guard.
Implementation details:
For Alice to remain anonymous using this strategy, she has to actually be
acting as an entry guard for other nodes. This means the two hop option can
only be available to whatever high-performance threshold is currently set on
entry guards. Alice may need to somehow check her own current status as an
entry guard before choosing this two hop strategy.
Another thing to consider: suppose Alice is also an exit node. If the
fraction of exit nodes in existence is too small, she may rarely or never be
chosen as an entry guard. It would be sad if we offered an incentive to run
an entry guard that didn't extend to exit nodes. I suppose clients of Exit
nodes could pull the same trick, and bypass using Tor altogether (zero hop
paths), though that has additional issues.*
Mike Lieberman
MIT
*Why we shouldn't recommend Exit nodes pull the same trick:
1) Exit nodes would suffer heavily from the problem of "unique traffic"
mentioned above.
2) It would give governments an incentive to confiscate exit nodes to see if
they are pulling this trick.
Filename: 117-ipv6-exits.txt
Title: IPv6 exits
Author: coderman
Created: 10-Jul-2007
Status: Accepted
Target: 0.2.1.x
Overview
Extend Tor for TCP exit via IPv6 transport and DNS resolution of IPv6
addresses. This proposal does not imply any IPv6 support for OR
traffic, only exit and name resolution.
Contents
0. Motivation
As the IPv4 address space becomes more scarce there is increasing
effort to provide Internet services via the IPv6 protocol. Many
hosts are available at IPv6 endpoints which are currently
inaccessible for Tor users.
Extending Tor to support IPv6 exit streams and IPv6 DNS name
resolution will allow users of the Tor network to access these hosts.
This capability would be present for those who do not currently have
IPv6 access, thus increasing the utility of Tor and furthering
adoption of IPv6.
1. Design
1.1. General design overview
There are three main components to this proposal. The first is a
method for routers to advertise their ability to exit IPv6 traffic.
The second is the manner in which routers resolve names to IPv6
addresses. Last but not least is the method in which clients
communicate with Tor to resolve and connect to IPv6 endpoints
anonymously.
1.2. Router IPv6 exit support
In order to specify exit policies and IPv6 capability new directives
in the Tor configuration will be needed. If a router advertises IPv6
exit policies in its descriptor this will signal the ability to
provide IPv6 exit. There are a number of additional default deny
rules associated with this new address space which are detailed in
the addendum.
When Tor is started on a host it should check for the presence of a
global unicast IPv6 address and if present include the default IPv6
exit policies and any user specified IPv6 exit policies.
If a user provides IPv6 exit policies but no global unicast IPv6
address is available Tor should generate a warning and not publish the
IPv6 policies in the router descriptor.
It should be noted that IPv4 mapped IPv6 addresses are not valid exit
destinations. This mechanism is mainly used to interoperate with
both IPv4 and IPv6 clients on the same socket. Any attempts to use
an IPv4 mapped IPv6 address, perhaps to circumvent exit policy for
IPv4, must be refused.
1.3. DNS name resolution of IPv6 addresses (AAAA records)
In addition to exit support for IPv6 TCP connections, a method to
resolve domain names to their respective IPv6 addresses is also
needed. This is accomplished in the existing DNS system via AAAA
records. Routers will perform both A and AAAA requests when
resolving a name so that the client can utilize an IPv6 endpoint when
available or preferred.
To avoid potential problems with caching DNS servers that behave
poorly all NXDOMAIN responses to AAAA requests should be ignored if a
successful response is received for an A request. This implies that
both AAAA and A requests will always be performed for each name
resolution.
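The rule can be modeled as follows (a Python sketch; the lookup
callback and its None-for-NXDOMAIN convention are assumed stubs, not
Tor's resolver interface):

```python
def resolve_name(name, lookup):
    """Resolve both A and AAAA, tolerating misbehaving caches.

    lookup(name, rrtype) returns a list of address strings, or None to
    signal NXDOMAIN (assumed stub interface).
    """
    a = lookup(name, "A")
    aaaa = lookup(name, "AAAA")
    # Ignore an NXDOMAIN on the AAAA query when the A query succeeded:
    # a broken cache's "no such name" must not mask real A records.
    if aaaa is None and a is not None:
        aaaa = []
    if a is None and aaaa is None:
        return None  # the name genuinely does not exist
    return (a or []) + aaaa
```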
For reverse lookups on IPv6 addresses, like that used for
RESOLVE_PTR, Tor will perform the necessary PTR requests via
IP6.ARPA.
All routers which perform DNS resolution on behalf of clients
(RELAY_RESOLVE) should perform and respond with both A and AAAA
resources.
[NOTE: In a future version, when we extend the behavior of RESOLVE to
encapsulate more of real DNS, it will make sense to allow more
flexibility here. -nickm]
1.4. Client interaction with IPv6 exit capability
1.4.1. Usability goals
There are a number of behaviors which Tor can provide when
interacting with clients that will improve the usability of IPv6 exit
capability. These behaviors are designed to make it simple for
clients to express a preference for IPv6 transport and utilize IPv6
host services.
1.4.2. SOCKSv5 IPv6 client behavior
The SOCKS version 5 protocol supports IPv6 connections. When using
SOCKSv5 with hostnames it is difficult to determine if a client
wishes to use an IPv4 or IPv6 address to connect to the desired host
if it resolves to both address types.
In order to make this more intuitive the SOCKSv5 protocol can be
supported on a local IPv6 endpoint, [::1] port 9050 for example.
When a client requests a connection to the desired host via an IPv6
SOCKS connection Tor will prefer IPv6 addresses when resolving the
host name and connecting to the host.
Likewise, RESOLVE and RESOLVE_PTR requests from an IPv6 SOCKS
connection will return IPv6 addresses when available, and fall back
to IPv4 addresses if not.
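Sketched as an ordering rule (illustrative only; a real implementation
would classify parsed addresses by family rather than by string):

```python
def order_answers(addrs, via_ipv6_listener):
    """Order resolved addresses by the family the client's SOCKS
    listener implies: IPv6 first for clients on the IPv6 endpoint,
    IPv4 first otherwise, with the other family kept as fallback."""
    v6 = [a for a in addrs if ":" in a]
    v4 = [a for a in addrs if ":" not in a]
    return v6 + v4 if via_ipv6_listener else v4 + v6
```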
[NOTE: This means that SocksListenAddress and DNSListenAddress should
support IPv6 addresses. Perhaps there should also be a general option
to have listeners that default to 127.0.0.1 and 0.0.0.0 listen
additionally or instead on ::1 and :: -nickm]
1.4.3. MAPADDRESS behavior
The MAPADDRESS capability supports clients that may not be able to
use the SOCKSv4a or SOCKSv5 hostname support to resolve names via
Tor. This ability should be extended to IPv6 addresses in SOCKSv5 as
well.
When a client requests an address mapping from the wildcard IPv6
address, [::0], the server will respond with a unique local IPv6
address on success. It is important to note that there may be two
mappings for the same name if both an IPv4 and IPv6 address are
associated with the host. In this case a CONNECT to a mapped IPv6
address should prefer IPv6 for the connection to the host, if
available, while CONNECT to a mapped IPv4 address will prefer IPv4.
It should be noted that IPv6 does not provide the concept of a host
local subnet, like 127.0.0.0/8 in IPv4. For this reason integration
of Tor with IPv6 clients should consider a firewall or filter rule to
drop unique local addresses to or from the network when possible.
These packets should not be routed, however, keeping them off the
subnet entirely is worthwhile.
1.4.3.1. Generating unique local IPv6 addresses
The usual manner of generating a unique local IPv6 address is to
select a Global ID part randomly, along with a Subnet ID, and sharing
this prefix among the communicating parties who each have their own
distinct Interface ID. In this style a given Tor instance might
select a random Global and Subnet ID and provide MAPADDRESS
assignments with a random Interface ID as needed. This has the
potential to associate unique Global/Subnet identifiers with a given
Tor instance and may expose attacks against the anonymity of Tor
users.
To avoid this potential problem entirely MAPADDRESS must always
generate the Global, Subnet, and Interface IDs randomly for each
request. It is also highly suggested that explicitly specifying an
IPv6 source address instead of the wildcard address not be supported
to ensure that a good random address is used.
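A sketch of fully random mapping generation (assuming the fd00::/8 half
of the unique-local range; the function name is illustrative):

```python
import ipaddress
import secrets

def random_unique_local_v6():
    # Draw the Global ID (40 bits), Subnet ID (16 bits), and Interface
    # ID (64 bits) fresh for every mapping, so that no stable prefix
    # ties a set of mappings back to one Tor instance.
    low_bits = secrets.randbits(120)
    return ipaddress.IPv6Address((0xFD << 120) | low_bits)
```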
1.4.4. DNSProxy IPv6 client behavior
A new capability in recent Tor versions is the transparent DNS proxy.
This feature will need to return both A and AAAA resource records
when responding to client name resolution requests.
The transparent DNS proxy should also support reverse lookups for
IPv6 addresses. It is suggested that any such requests to the
deprecated IP6.INT domain should be translated to IP6.ARPA instead.
This translation is not likely to be used and is of low priority.
It would be nice to support DNS over IPv6 transport as well, however,
this is not likely to be used and is of low priority.
1.4.5. TransPort IPv6 client behavior
Tor also provides transparent TCP proxy support via the Trans*
directives in the configuration. The TransListenAddress directive
should accept an IPv6 address in addition to IPv4 so that IPv6 TCP
connections can be transparently proxied.
1.5. Additional changes
The RedirectExit option should be deprecated rather than extending
this feature to IPv6.
2. Spec changes
2.1. Tor specification
In '6.2. Opening streams and transferring data' the following should
be changed to indicate IPv6 exit capability:
"No version of Tor currently generates the IPv6 format."
In '6.4. Remote hostname lookup' the following should be updated to
reflect use of ip6.arpa in addition to in-addr.arpa.
"For a reverse lookup, the OP sends a RELAY_RESOLVE cell containing an
in-addr.arpa address."
In 'A.1. Differences between spec and implementation' the following
should be updated to indicate IPv6 exit capability:
"The current codebase has no IPv6 support at all."
[NOTE: the EXITPOLICY end-cell reason says that it can hold an ipv4 or an
ipv6 address, but doesn't say how. We may want a separate EXITPOLICY2
type that can hold an ipv6 address, since the way we encode ipv6
addresses elsewhere ("0.0.0.0 indicates that the next 16 bytes are ipv6")
is a bit dumb. -nickm]
[Actually, the length field lets us distinguish EXITPOLICY. -nickm]
2.2. Directory specification
In '2.1. Router descriptor format' a new set of directives is needed
for IPv6 exit policy. The existing accept/reject directives should
be clarified to indicate IPv4 or wildcard address relevance. The new
IPv6 directives will be in the form of:
"accept6" exitpattern NL
"reject6" exitpattern NL
The section describing accept6/reject6 should explain that the
presence of accept6 or reject6 exit policies in a router descriptor
signals the ability of that router to exit IPv6 traffic (according to
IPv6 exit policies).
The "[::]/0" notation is used to represent "all IPv6 addresses".
"[::0]/0" may also be used for this representation.
If a user specifies a 'reject6 [::]/0:*' policy in the Tor
configuration this will be interpreted as forcing no IPv6 exit
support and no accept6/reject6 policies will be included in the
published descriptor. This will prevent IPv6 exit if the router host
has a global unicast IPv6 address present.
It is important to note that a wildcard address in an accept or
reject policy applies to both IPv4 and IPv6 addresses.
2.3. Control specification
In '3.8. MAPADDRESS' the potential to have two addresses for a given
name should be explained. The method for generating unique local
addresses for IPv6 mappings needs explanation as described above.
When IPv6 addresses are used in this document they should include the
brackets for consistency. For example, the null IPv6 address should
be written as "[::0]" and not "::0". The control commands will
expect the same syntax as well.
In '3.9. GETINFO' the "address" command should return both public
IPv4 and IPv6 addresses if present. These addresses should be
separated via \r\n.
2.4. Tor SOCKS extensions
In '2. Name lookup' a description of IPv6 address resolution is
needed for SOCKSv5 as described above. IPv6 addresses should be
supported in both the RESOLVE and RESOLVE_PTR extensions.
A new section describing the ability to accept SOCKSv5 clients on a
local IPv6 address to indicate a preference for IPv6 transport as
described above is also needed. The behavior of Tor SOCKSv5 proxy
with an IPv6 preference should be explained, for example, preferring
IPv6 transport to a named host with both IPv4 and IPv6 addresses
available (A and AAAA records).
3. Questions and concerns
3.1. DNS A6 records
A6 is explicitly avoided in this document. There are potential
reasons for implementing this, however, the inherent complexity of
the protocol and resolvers make this unappealing. Is there a
compelling reason to consider A6 as part of IPv6 exit support?
[IMO not till anybody needs it. -nickm]
3.2. IPv4 and IPv6 preference
The design above tries to infer a preference for IPv4 or IPv6
transport based on client interactions with Tor. It might be useful
to provide more explicit control over this preference. For example,
an IPv4 SOCKSv5 client may want to use IPv6 transport to named hosts
in CONNECT requests while the current implementation would assume an
IPv4 preference. Should more explicit control be available, through
either configuration directives or control commands?
Many applications support an inet6-only or prefer-family type option
that provides the user manual control over address preference. This
could be provided as a Tor configuration option.
An explicit preference is still possible by resolving names and then
CONNECTing to an IPv4 or IPv6 address as desired, however, not all
client applications may have this option available.
3.3. Support for IPv6-only transparent proxy clients
It may be useful to support IPv6-only transparent proxy clients using
IPv4-mapped-IPv6-like addresses. This would require a transparent DNS
proxy using IPv6 transport and the ability to map A record responses
into IPv4-mapped-IPv6-like addresses in the manner described in the
"NAT-PT" RFC for a traditional Basic-NAT-PT with DNS-ALG. The
transparent TCP proxy would then need to detect these mapped addresses
and connect to the desired IPv4 host.
The IPv6 prefix used for this purpose must not be the actual IPv4
mapped IPv6 address prefix, though the manner in which IPv4 addresses
are embedded in IPv6 addresses would be the same.
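The mapping described above can be sketched in Python. The fd00:6464::/96
prefix below is an arbitrary placeholder, chosen only because the text
requires a prefix that differs from the well-known IPv4-mapped one; the
embedding of the IPv4 address in the low 32 bits is the same.

```python
import ipaddress

# Hypothetical translation prefix for illustration; the proposal only
# requires that it NOT be the real ::ffff:0:0/96 IPv4-mapped prefix.
NAT_PREFIX = ipaddress.IPv6Network("fd00:6464::/96")

def map_a_record(ipv4_str):
    """Embed an IPv4 address (from an A record) into the NAT prefix,
    as a DNS-ALG would when answering an AAAA query."""
    v4 = ipaddress.IPv4Address(ipv4_str)
    return ipaddress.IPv6Address(int(NAT_PREFIX.network_address) | int(v4))

def unmap(ipv6_addr):
    """Recover the embedded IPv4 address, as the transparent TCP proxy
    would before connecting to the real IPv4 host."""
    return ipaddress.IPv4Address(int(ipv6_addr) & 0xFFFFFFFF)
```

The transparent TCP proxy would call unmap() on any destination inside
NAT_PREFIX and open an ordinary IPv4 connection instead.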
The lack of any IPv6-only hosts that would use this transparent proxy
method makes this a lot of work for very little gain. Is there a
compelling reason to support this NAT-PT-like capability?
3.4. IPv6 DNS and older Tor routers
It is expected that many routers will continue to run with older
versions of Tor when the IPv6 exit capability is released. Clients
who wish to use IPv6 will need to route RELAY_RESOLVE requests to the
newer routers which will respond with both A and AAAA resource
records when possible.
One way to do this is to route RELAY_RESOLVE requests to routers that
publish IPv6 exit policies; however, this would not make use of current
routers that can resolve IPv6 addresses even if they cannot exit such
traffic.
There was also concern expressed about the ability of existing clients
to cope with new RELAY_RESOLVE responses that contain IPv6 addresses.
If this breaks backward compatibility, a new request type may be
necessary, like RELAY_RESOLVE6, or some other mechanism of indicating
the ability to parse IPv6 responses when making the request.
3.5. IPv4 and IPv6 bindings in MAPADDRESS
It may be troublesome to support two distinct address mappings
for the same name in the existing MAPADDRESS implementation. If this
cannot be accommodated, the behavior should be to replace any existing
mapping with the new address, regardless of family. A warning when
this occurs would help clients that encounter problems
when an IPv4 and an IPv6 application use MAPADDRESS for the
same names concurrently, causing lost connections for one of them.
4. Addendum
4.1. Sample IPv6 default exit policy
reject 0.0.0.0/8
reject 169.254.0.0/16
reject 127.0.0.0/8
reject 192.168.0.0/16
reject 10.0.0.0/8
reject 172.16.0.0/12
reject6 [0000::]/8
reject6 [0100::]/8
reject6 [0200::]/7
reject6 [0400::]/6
reject6 [0800::]/5
reject6 [1000::]/4
reject6 [4000::]/3
reject6 [6000::]/3
reject6 [8000::]/3
reject6 [A000::]/3
reject6 [C000::]/3
reject6 [E000::]/4
reject6 [F000::]/5
reject6 [F800::]/6
reject6 [FC00::]/7
reject6 [FE00::]/9
reject6 [FE80::]/10
reject6 [FEC0::]/10
reject6 [FF00::]/8
reject *:25
reject *:119
reject *:135-139
reject *:445
reject *:1214
reject *:4661-4666
reject *:6346-6429
reject *:6699
reject *:6881-6999
accept *:*
# accept6 [2000::]/3:* is implied
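A minimal sketch of how a combined IPv4/IPv6 policy like the one above
might be evaluated, with the first matching rule winning as in Tor's exit
policies. The rule encoding here is invented for illustration and omits
per-rule port ranges and other details of real Tor policy handling.

```python
import ipaddress

# Abbreviated policy: (action, network-or-None for "*", port-or-None for "*").
POLICY = [
    ("reject", "127.0.0.0/8", None),   # reject 127.0.0.0/8
    ("reject", "::/8", None),          # reject6 [0000::]/8
    ("reject", None, 25),              # reject *:25
    ("accept", None, None),            # accept *:*
]

def allows(addr, port):
    """Return True iff the first matching rule accepts (addr, port)."""
    ip = ipaddress.ip_address(addr)
    for action, net, p in POLICY:
        # An address of the other family never matches a network rule.
        if net is not None and ip not in ipaddress.ip_network(net):
            continue
        if p is not None and port != p:
            continue
        return action == "accept"
    return False
</```

Note that an IPv4 address is never contained in an IPv6 network (and vice
versa), so family-specific reject lines simply fall through for the other
family, mirroring the reject/reject6 split above.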
4.2. Additional resources
'DNS Extensions to Support IP Version 6'
http://www.ietf.org/rfc/rfc3596.txt
'DNS Extensions to Support IPv6 Address Aggregation and Renumbering'
http://www.ietf.org/rfc/rfc2874.txt
'SOCKS Protocol Version 5'
http://www.ietf.org/rfc/rfc1928.txt
'Unique Local IPv6 Unicast Addresses'
http://www.ietf.org/rfc/rfc4193.txt
'INTERNET PROTOCOL VERSION 6 ADDRESS SPACE'
http://www.iana.org/assignments/ipv6-address-space
'Network Address Translation - Protocol Translation (NAT-PT)'
http://www.ietf.org/rfc/rfc2766.txt
@ -1,84 +0,0 @@
Filename: 118-multiple-orports.txt
Title: Advertising multiple ORPorts at once
Author: Nick Mathewson
Created: 09-Jul-2007
Status: Accepted
Target: 0.2.1.x
Overview:
This document is a proposal for servers to advertise multiple
address/port combinations for their ORPort.
Motivation:
Sometimes servers want to support multiple ports for incoming
connections, either in order to support multiple address families, to
better use multiple interfaces, or to support a variety of
FascistFirewallPorts settings. This is easy to set up now, but
there's no way to advertise it to clients.
New descriptor syntax:
We add a new line in the router descriptor, "or-address". This line
can occur zero, one, or multiple times. Its format is:
or-address SP ADDRESS ":" PORTLIST NL
ADDRESS = IPV6ADDR / IPV4ADDR
IPV6ADDR = an ipv6 address, surrounded by square brackets.
IPV4ADDR = an ipv4 address, represented as a dotted quad.
PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
PORTSPEC = PORT | PORT "-" PORT
[This is the regular format for specifying sets of addresses and
ports in Tor.]
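The grammar above can be exercised with a small parser sketch. This is
illustrative only, not the parser Tor itself uses, and the address
pattern is simplified (it does not fully validate IPv6 syntax).

```python
import re

# Matches: or-address ADDRESS ":" PORTLIST, where ADDRESS is a bracketed
# IPv6 address or a dotted quad, and PORTLIST is comma-separated
# ports/port ranges. Simplified relative to the full grammar.
OR_ADDRESS_RE = re.compile(
    r"^or-address (\[[0-9A-Fa-f:]+\]|\d+\.\d+\.\d+\.\d+):([\d,-]+)$")

def parse_or_address(line):
    """Parse an 'or-address' line into (address, [(lo, hi), ...])."""
    m = OR_ADDRESS_RE.match(line.strip())
    if m is None:
        raise ValueError("not an or-address line")
    addr, portlist = m.groups()
    ranges = []
    for spec in portlist.split(","):
        lo, _, hi = spec.partition("-")   # PORT or PORT "-" PORT
        ranges.append((int(lo), int(hi or lo)))
    return addr, ranges
```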
New OR behavior:
We add two more options to supplement ORListenAddress:
ORPublishedListenAddress, and ORPublishAddressSet. The former
listens on an address-port combination and publishes it in addition
to the regular address. The latter advertises a set of address-port
combinations, but does not listen on them. [To use this option, the
server operator should set up port forwarding to the regular ORPort,
as for example with firewall rules.]
Servers should extend their testing to include advertised addresses
and ports. No address or port should be advertised until it's been
tested. [This might get expensive in practice.]
New authority behavior:
Authorities should spot-test descriptors, and reject any where a
substantial part of the addresses can't be reached.
New client behavior:
When connecting to another server, clients SHOULD pick an
address-port combination at random, as permitted by their
ReachableAddresses configuration. If a client has a connection to a server at one
address, it SHOULD use that address for any simultaneous connections
to that server. Clients SHOULD use the canonical address for any
server when generating extend cells.
Not addressed here:
* There's no reason to listen on multiple dirports; current Tors
mostly don't connect directly to the dirport anyway.
* It could be advantageous to list something about extra addresses in
the network-status document. This would, however, eat space there.
More analysis is needed, particularly in light of proposal 141
("Download server descriptors on demand")
Dependencies:
Testing for canonical connections needs to be implemented before it's
safe to use this proposal.
Notes 3 July:
- Write up the simple version of this. No ranges needed yet. No
networkstatus changes yet.
@ -1,140 +0,0 @@
Filename: 119-controlport-auth.txt
Title: New PROTOCOLINFO command for controllers
Author: Roger Dingledine
Created: 14-Aug-2007
Status: Closed
Implemented-In: 0.2.0.x
Overview:
Here we describe how to help controllers locate the cookie
authentication file when authenticating to Tor, so we can a) require
authentication by default for Tor controllers and b) still keep
things usable. Also, we propose an extensible, general-purpose mechanism
for controllers to learn about a Tor instance's protocol and
authentication requirements before authenticating.
The Problem:
When we first added the controller protocol, we wanted to make it
easy for people to play with it, so by default we didn't require any
authentication from controller programs. We allowed requests only from
localhost as a stopgap measure for security.
Due to an increasing number of vulnerabilities based on this approach,
it's time to add authentication in default configurations.
We have a number of goals:
- We want the default Vidalia bundles to transparently work. That
means we don't want the users to have to type in or know a password.
- We want to allow multiple controller applications to connect to the
control port. So if Vidalia is launching Tor, it can't just keep the
secrets to itself.
Right now there are three authentication approaches supported
by the control protocol: NULL, CookieAuthentication, and
HashedControlPassword. See Sec 5.1 in control-spec.txt for details.
There are a couple of challenges here. The first is: if the controller
launches Tor, how should we teach Tor what authentication approach
it should require, and the secret that goes along with it? Next is:
how should this work when the controller attaches to an existing Tor,
rather than launching Tor itself?
Cookie authentication seems most amenable to letting multiple controller
applications interact with Tor. But that brings in yet another question:
how does the controller guess where to look for the cookie file,
without first knowing what DataDirectory Tor is using?
Design:
We should add a new controller command PROTOCOLINFO that can be sent
as a valid first command (the others being AUTHENTICATE and QUIT). If
PROTOCOLINFO is sent as the first command, the second command must be
either a successful AUTHENTICATE or a QUIT.
If the initial command sequence is not valid, Tor closes the connection.
Spec:
C: "PROTOCOLINFO" *(SP PIVERSION) CRLF
S: "250+PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF
InfoLine = AuthLine / VersionLine / OtherLine
AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *("," AuthMethod)
*(SP "COOKIEFILE=" AuthCookieFile) CRLF
VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF
AuthMethod =
"NULL" / ; No authentication is required
"HASHEDPASSWORD" / ; A controller must supply the original password
"COOKIE" / ; A controller must supply the contents of a cookie
AuthCookieFile = QuotedString
TorVersion = QuotedString
OtherLine = "250-" Keyword [SP Arguments] CRLF
For example:
C: PROTOCOLINFO CRLF
S: "250+PROTOCOLINFO 1" CRLF
S: "250-AUTH METHODS=HASHEDPASSWORD,COOKIE COOKIEFILE="/tor/cookie"" CRLF
S: "250-VERSION Tor=0.2.0.5-alpha" CRLF
S: "250 OK" CRLF
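A controller-side sketch of parsing the 250-AUTH reply line shown in the
example above. The escape handling is deliberately simplified; real
controllers must fully unescape the QuotedString, as the spec requires.

```python
import re

def parse_auth_line(line):
    """Extract (methods, cookie_path) from a 250-AUTH InfoLine,
    or return None if the line is not an AUTH line."""
    m = re.match(r'250-AUTH METHODS=([A-Z,]+)'
                 r'(?: COOKIEFILE=("[^"]*"))?', line)
    if m is None:
        return None
    methods = m.group(1).split(",")
    cookie = m.group(2)
    if cookie is not None:
        # Strip surrounding quotes only; a real controller MUST also
        # process escape sequences inside the QuotedString.
        cookie = cookie[1:-1]
    return methods, cookie
```

Unrecognized InfoLines would simply yield None here, matching the rule
that controllers ignore keywords they do not recognize.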
Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines
with keywords they do not recognize. Controllers MUST ignore extraneous
data on any InfoLine.
PIVERSION is there in case we drastically change the syntax one day. For
now it should always be "1", for the controller protocol. Controllers MAY
provide a list of the protocol versions they support; Tor MAY select a
version that the controller does not support.
Right now only two "topics" (AUTH and VERSION) are included, but more
may be included in the future. Controllers must accept lines with
unexpected topics.
AuthMethod is used to specify one or more control authentication
methods that Tor currently accepts.
AuthCookieFile specifies the absolute path and filename of the
authentication cookie that Tor is expecting and is provided iff
the METHODS field contains the method "COOKIE". Controllers MUST handle
escape sequences inside this string.
The VERSION line contains the Tor version.
[What else might we want to include that could be useful? -RD]
Compatibility:
Tor 0.1.2.16 and 0.2.0.4-alpha hang up after the first failed
command. Earlier Tors don't know about this command but don't hang
up. That means controllers will need a mechanism for distinguishing
whether they're talking to a Tor that speaks PROTOCOLINFO or not.
I suggest that the controllers attempt a PROTOCOLINFO. Then:
- If it works, great. Authenticate as required.
- If they get hung up on, reconnect and do a NULL AUTHENTICATE.
- If it's unrecognized but they're not hung up on, do a NULL
AUTHENTICATE.
Unsolved problems:
If Torbutton wants to be a Tor controller one day... talking TCP is
bad enough, but reading from the filesystem is even harder. Is there
a way to let simple programs work with the controller port without
needing all the auth infrastructure?
Once we put this approach in place, the next vulnerability we see will
involve an attacker somehow getting read access to the victim's files
--- and then we're back where we started. This means we still need
to think about how to demand password-based authentication without
bothering the user about it.
@ -1,83 +0,0 @@
Filename: 120-shutdown-descriptors.txt
Title: Shutdown descriptors when Tor servers stop
Author: Roger Dingledine
Created: 15-Aug-2007
Status: Dead
[Proposal dead as of 11 Jul 2008. The point of this proposal was to give
routers a good way to get out of the networkstatus early, but proposal
138 (already implemented) has achieved this.]
Overview:
Tor servers should publish a last descriptor whenever they shut down,
to let others know that they are no longer offering service.
The Problem:
The main reason for this is in reaction to Internet services that want
to treat connections from the Tor network differently. Right now,
if a user experiments with turning on the "relay" functionality, he
is punished by being locked out of some websites, some IRC networks,
etc --- and this lockout persists for several days even after he turns
the server off.
Design:
During the "slow shutdown" period if exiting, or shortly after the
user sets his ORPort back to 0 if not exiting, Tor should publish a
final descriptor with the following characteristics:
1) Exit policy is listed as "reject *:*"
2) It includes a new entry called "opt shutdown 1"
The first step is so current blacklists will no longer list this node
as exiting to whatever the service is.
The second step is so directory authorities can avoid wasting time
doing reachability testing. Authorities should automatically not list
as Running any router whose latest descriptor says it shut down.
[I originally had in mind a third step --- Advertised bandwidth capacity
is listed as "0" --- so current Tor clients will skip over this node
when building most circuits. But since clients won't fetch descriptors
from nodes not listed as Running, this step seems pointless. -RD]
Spec:
TBD but should be pretty straightforward.
Security issues:
Now external people can learn exactly when a node stopped offering
relay service. How bad is this? I can see a few minor attacks based
on this knowledge, but on the other hand as it is we don't really take
any steps to keep this information secret.
Overhead issues:
We are creating more descriptors that want to be remembered. However,
since the router won't be marked as Running, ordinary clients won't
fetch the shutdown descriptors. Caches will, though. I hope this is ok.
Implementation:
To make things easy, we should publish the shutdown descriptor only
on controlled shutdown (SIGINT as opposed to SIGTERM). That would
leave enough time for publishing that we probably wouldn't need any
extra synchronization code.
If that turns out to be too unintuitive for users, I could imagine doing
it on SIGTERMs too, and just delaying exit until we had successfully
published to at least one authority, at which point we'd hope that it
propagated from there.
Acknowledgements:
tup suggested this idea.
Comments:
2) Maybe add a rule "Don't do this for hibernation if we expect to wake
up before the next consensus is published"?
- NM 9 Oct 2007
@ -1,776 +0,0 @@
Filename: 121-hidden-service-authentication.txt
Title: Hidden Service Authentication
Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger,
Christoph Weingarten
Created: 10-Sep-2007
Status: Finished
Implemented-In: 0.2.1.x
Change history:
26-Sep-2007 Initial proposal for or-dev
08-Dec-2007 Incorporated comments by Nick posted to or-dev on 10-Oct-2007
15-Dec-2007 Rewrote complete proposal for better readability, modified
authentication protocol, merged in personal notes
24-Dec-2007 Replaced misleading term "authentication" by "authorization"
and added some clarifications (comments by Sven Kaffille)
28-Apr-2008 Updated most parts of the concrete authorization protocol
04-Jul-2008 Add a simple algorithm to delay descriptor publication for
different clients of a hidden service
19-Jul-2008 Added INTRODUCE1V cell type (1.2), improved replay
protection for INTRODUCE2 cells (1.3), described limitations
for auth protocols (1.6), improved hidden service protocol
without client authorization (2.1), added second, more
scalable authorization protocol (2.2), rewrote existing
authorization protocol (2.3); changes based on discussion
with Nick
31-Jul-2008 Limit maximum descriptor size to 20 kilobytes to prevent
abuse.
01-Aug-2008 Use first part of Diffie-Hellman handshake for replay
protection instead of rendezvous cookie.
01-Aug-2008 Remove improved hidden service protocol without client
authorization (2.1). It might get implemented in proposal
142.
Overview:
This proposal deals with a general infrastructure for performing
authorization (not necessarily implying authentication) of requests to
hidden services at three points: (1) when downloading and decrypting
parts of the hidden service descriptor, (2) at the introduction point,
and (3) at Bob's Tor client before contacting the rendezvous point. A
service provider will be able to restrict access to his service at these
three points to authorized clients only. Further, the proposal contains
specific authorization protocols as instances that implement the
presented authorization infrastructure.
This proposal is based on v2 hidden service descriptors as described in
proposal 114 and introduced in version 0.2.0.10-alpha.
The proposal is structured as follows: The next section motivates the
integration of authorization mechanisms in the hidden service protocol.
Then we describe a general infrastructure for authorization in hidden
services, followed by specific authorization protocols for this
infrastructure. At the end we discuss a number of attacks and non-attacks
as well as compatibility issues.
Motivation:
Most hidden services do not require client authorization
now and won't in the future. On the contrary, many clients would
not want to be (pseudonymously) identifiable by the service (though this
is unavoidable to some extent), but rather use the service
anonymously. These services are not addressed by this proposal.
However, there may be certain services which are intended to be accessed
by a limited set of clients only. A possible application might be a
wiki or forum that should only be accessible for a closed user group.
Another, less intuitive example might be a real-time communication
service, where someone provides a presence and messaging service only to
his buddies. Finally, a possible application would be a personal home
server that should be remotely accessed by its owner.
Performing authorization for a hidden service within the Tor network, as
proposed here, offers a range of advantages compared to allowing all
client connections in the first instance and deferring authorization to
the transported protocol:
(1) Reduced traffic: Unauthorized requests would be rejected as early as
possible, thereby reducing the overall traffic in the network generated
by establishing circuits and sending cells.
(2) Better protection of service location: Unauthorized clients could not
force Bob to create circuits to their rendezvous points, thus preventing
the attack described by Øverlier and Syverson in their paper "Locating
Hidden Servers" even without the need for guards.
(3) Hiding activity: Apart from performing the actual authorization, a
service provider could also hide the mere presence of his service from
unauthorized clients when not providing hidden service descriptors to
them, rejecting unauthorized requests already at the introduction
point (ideally without leaking presence information at any of these
points), or not answering unauthorized introduction requests.
(4) Better protection of introduction points: When providing hidden
service descriptors to authorized clients only and encrypting the
introduction points as described in proposal 114, the introduction points
would be unknown to unauthorized clients and thereby protected from DoS
attacks.
(5) Protocol independence: Authorization could be performed for all
transported protocols, regardless of their own capabilities to do so.
(6) Ease of administration: A service provider running multiple hidden
services would be able to configure access at a single place uniformly
instead of doing so for all services separately.
(7) Optional QoS support: Bob could adapt his node selection algorithm
for building the circuit to Alice's rendezvous point depending on a
previously guaranteed QoS level, thus providing better latency or
bandwidth for selected clients.
A disadvantage of performing authorization within the Tor network is
that a hidden service cannot make use of authorization data in
the transported protocol. Tor hidden services were designed to be
independent of the transported protocol. Therefore it's only possible to
either grant or deny access to the whole service, but not to specific
resources of the service.
Authorization often implies authentication, i.e. proving one's identity.
However, when performing authorization within the Tor network, untrusted
points should not gain any useful information about the identities of
communicating parties, neither server nor client. A crucial challenge is
to remain anonymous towards directory servers and introduction points.
However, trying to hide identity from the hidden service is a futile
task, because a client would never know if he is the only authorized
client and therefore perfectly identifiable. Therefore, hiding client
identity from the hidden service is not an aim of this proposal.
The current implementation of hidden services does not provide any kind
of authorization. The hidden service descriptor version 2, introduced by
proposal 114, was designed to use a descriptor cookie for downloading and
decrypting parts of the descriptor content, but this feature is not yet
in use. Further, most relevant cell formats specified in rend-spec
contain fields for authorization data, but those fields are neither
implemented nor entirely sufficient.
Details:
1. General infrastructure for authorization to hidden services
We spotted three possible authorization points in the hidden service
protocol:
(1) when downloading and decrypting parts of the hidden service
descriptor,
(2) at the introduction point, and
(3) at Bob's Tor client before contacting the rendezvous point.
The general idea of this proposal is to allow service providers to
restrict access to some or all of these points to authorized clients
only.
1.1. Client authorization at directory
Since the implementation of proposal 114 it is possible to combine a
hidden service descriptor with a so-called descriptor cookie. If done so,
the descriptor cookie becomes part of the descriptor ID, thus having an
effect on the storage location of the descriptor. Someone who has learned
about a service, but is not aware of the descriptor cookie, won't be able
to determine the descriptor ID and download the current hidden service
descriptor; he won't even know whether the service has uploaded a
descriptor recently. Descriptor IDs are calculated as follows (see
section 1.2 of rend-spec for the complete specification of v2 hidden
service descriptors):
descriptor-id =
H(service-id | H(time-period | descriptor-cookie | replica))
Currently, service-id is equivalent to permanent-id which is calculated
as in the following formula. But in principle it could be any public
key.
permanent-id = H(permanent-key)[:10]
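The two formulas can be sketched in Python using SHA-1, the hash used by
v2 hidden service descriptors. The byte encodings chosen here for
time-period (4 octets, big-endian) and replica (1 octet) are assumptions
for illustration; rend-spec is authoritative on the exact widths.

```python
import hashlib

def H(data):
    """SHA-1, the hash function used by v2 hidden service descriptors."""
    return hashlib.sha1(data).digest()

def descriptor_id(permanent_key, time_period, descriptor_cookie, replica):
    """descriptor-id = H(service-id | H(time-period | cookie | replica)),
    with service-id taken to be permanent-id = H(permanent-key)[:10]."""
    permanent_id = H(permanent_key)[:10]
    secret_part = H(time_period.to_bytes(4, "big")       # assumed width
                    + descriptor_cookie
                    + replica.to_bytes(1, "big"))        # assumed width
    return H(permanent_id + secret_part)
```

Without the descriptor cookie, an observer cannot recompute secret_part
and therefore cannot derive the descriptor ID, which is exactly the
lookup-hiding property described above.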
The second purpose of the descriptor cookie is to encrypt the list of
introduction points, including optional authorization data. Hence, the
hidden service directories won't learn any introduction information from
storing a hidden service descriptor. This feature is implemented but
unused at the moment. So this proposal will harness the advantages
of proposal 114.
The descriptor cookie can be used for authorization by keeping it secret
from everyone but authorized clients. A service could then decide whether
to publish hidden service descriptors using that descriptor cookie later
on. An authorized client being aware of the descriptor cookie would be
able to download and decrypt the hidden service descriptor.
The number of concurrently used descriptor cookies for one hidden service
is not restricted. A service could use a single descriptor cookie for all
users, a distinct cookie per user, or something in between, like one
cookie per group of users. It is up to the specific protocol and how it
is applied by a service provider.
Two or more hidden service descriptors for different groups or users
should not be uploaded at the same time. A directory node could conclude
easily that the descriptors were issued by the same hidden service, thus
being able to link the two groups or users. Therefore, descriptors for
different users or clients that ought to be stored on the same directory
are delayed, so that only one descriptor is uploaded to a directory at a
time. The remaining descriptors are uploaded with a delay of up to
30 seconds.
Further, descriptors for different groups or users that are to be stored
on different directories are delayed for a random time of up to 30
seconds to hide relations from colluding directories. Certainly, this
does not prevent linking entirely, but it makes it somewhat harder.
There is a conflict between hiding links between clients and making a
service available in a timely manner.
Although this part of the proposal is meant to describe a general
infrastructure for authorization, changing the way of using the
descriptor cookie to look up hidden service descriptors, e.g. applying
some sort of asymmetric crypto system, would require in-depth changes
that would be incompatible to v2 hidden service descriptors. On the
contrary, using another key for en-/decrypting the introduction point
part of a hidden service descriptor, e.g. a different symmetric key or
asymmetric encryption, would be easy to implement and compatible to v2
hidden service descriptors as understood by hidden service directories
(clients and services would have to be upgraded anyway for using the new
features).
An adversary could try to abuse the fact that introduction points can be
encrypted by storing arbitrary, unrelated data in the hidden service
directory. This abuse can be limited by setting a hard descriptor size
limit, forcing the adversary to split data into multiple chunks. There
are some limitations that make splitting data across multiple descriptors
unattractive: 1) The adversary would not be able to choose descriptor IDs
freely and would therefore have to implement his own indexing
structure. 2) Validity of descriptors is limited to at most 24 hours
after which descriptors need to be republished.
The regular descriptor size in bytes is 745 + num_ipos * 837 + auth_data.
A large descriptor with 7 introduction points and 5 kilobytes of
authorization data would be 11724 bytes in size. The upper size limit of
descriptors should be set to 20 kilobytes, which limits the effect of
abuse while retaining enough flexibility in designing authorization
protocols.
1.2. Client authorization at introduction point
The next possible authorization point after downloading and decrypting
a hidden service descriptor is the introduction point. It may be important
for authorization, because it bears the last chance of hiding presence
of a hidden service from unauthorized clients. Further, performing
authorization at the introduction point might reduce traffic in the
network, because unauthorized requests would not be passed to the
hidden service. This applies to those clients who are aware of a
descriptor cookie and thereby of the hidden service descriptor, but do
not have authorization data to pass the introduction point or access the
service (such a situation might occur when authorization data for
authorization at the directory is not issued on a per-user basis, but
authorization data for authorization at the introduction point is).
It is important to note that the introduction point must be considered
untrustworthy, and therefore cannot replace authorization at the hidden
service itself. Nor should the introduction point learn any sensitive
identifiable information from either the service or the client.
In order to perform authorization at the introduction point, three
message formats need to be modified: (1) v2 hidden service descriptors,
(2) ESTABLISH_INTRO cells, and (3) INTRODUCE1 cells.
A v2 hidden service descriptor needs to contain authorization data that
is introduction-point-specific and sometimes also authorization data
that is introduction-point-independent. Therefore, v2 hidden service
descriptors as specified in section 1.2 of rend-spec already contain two
reserved fields "intro-authorization" and "service-authorization"
(originally, the names of these fields were "...-authentication")
containing an authorization type number and arbitrary authorization
data. We propose that authorization data consists of base64 encoded
objects of arbitrary length, surrounded by "-----BEGIN MESSAGE-----" and
"-----END MESSAGE-----". This will increase the size of hidden service
descriptors, but this is allowed since there is no strict upper limit.
The current ESTABLISH_INTRO cells as described in section 1.3 of
rend-spec do not contain either authorization data or version
information. Therefore, we propose a new version 1 of the ESTABLISH_INTRO
cells adding these two issues as follows:
V Format byte: set to 255 [1 octet]
V Version byte: set to 1 [1 octet]
KL Key length [2 octets]
PK Bob's public key [KL octets]
HS Hash of session info [20 octets]
AUTHT The auth type that is supported [1 octet]
AUTHL Length of auth data [2 octets]
AUTHD Auth data [variable]
SIG Signature of above information [variable]
From the format it is possible to determine the maximum allowed size for
authorization data: given the fact that cells are 512 octets long, of
which 498 octets are usable (see section 6.1 of tor-spec), and assuming
1024 bit = 128 octet long keys, there are 215 octets left for
authorization data. Hence, authorization protocols are bound to use no
more than these 215 octets, regardless of the number of clients that
shall be authenticated at the introduction point. Otherwise, one would
need to send multiple ESTABLISH_INTRO cells or split them up, which we do
not specify here.
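The 215-octet figure follows directly from the field sizes listed above;
a quick check, assuming 1024-bit (128-octet) keys and signatures:

```python
# Fixed-size fields of a v1 ESTABLISH_INTRO cell, per the layout above.
fields = {
    "V_format": 1,   # format byte (255)
    "V_version": 1,  # version byte (1)
    "KL": 2,         # key length
    "PK": 128,       # Bob's public key (1024-bit assumption)
    "HS": 20,        # hash of session info
    "AUTHT": 1,      # auth type
    "AUTHL": 2,      # auth data length
    "SIG": 128,      # signature (1024-bit assumption)
}
USABLE = 498  # usable relay cell payload octets (tor-spec section 6.1)

authd_max = USABLE - sum(fields.values())
```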
In order to understand a v1 ESTABLISH_INTRO cell, the implementation of
a relay must have a certain Tor version. Hidden services need to be able
to distinguish relays being capable of understanding the new v1 cell
formats and perform authorization. We propose to use the version number
that is contained in networkstatus documents to find capable
introduction points.
The current INTRODUCE1 cell as described in section 1.8 of rend-spec is
not designed to carry authorization data and has no version number either.
Unfortunately, unversioned INTRODUCE1 cells consist only of a fixed-size,
seemingly random PK_ID, followed by the encrypted INTRODUCE2 cell. This
makes it impossible to distinguish unversioned INTRODUCE1 cells from any
later format. In particular, it is not possible to introduce some kind of
format and version byte for newer versions of this cell. That's probably
where the comment "[XXX011 want to put intro-level auth info here, but no
version. crap. -RD]" that was part of rend-spec some time ago comes from.
We propose that new versioned INTRODUCE1 cells use the new cell type 41
RELAY_INTRODUCE1V (where V stands for versioned):
Cleartext
V Version byte: set to 1 [1 octet]
PK_ID Identifier for Bob's PK [20 octets]
AUTHT The auth type that is included [1 octet]
AUTHL Length of auth data [2 octets]
AUTHD Auth data [variable]
Encrypted to Bob's PK:
(RELAY_INTRODUCE2 cell)
The maximum length of contained authorization data depends on the length
of the contained INTRODUCE2 cell. A calculation follows below when
describing the INTRODUCE2 cell format we propose to use.
1.3. Client authorization at hidden service
The time when a hidden service receives an INTRODUCE2 cell constitutes
the last possible authorization point during the hidden service
protocol. Performing authorization here is easier than at the other two
authorization points, because there are no possibly untrusted entities
involved.
In general, a client that is successfully authorized at the introduction
point should be granted access at the hidden service, too. Otherwise, the
client would receive a positive INTRODUCE_ACK cell from the introduction
point and conclude that it may connect to the service, but the request
will be dropped without notice. This would appear as a failure to
clients. Therefore, the number of cases in which a client successfully
passes the introduction point but fails at the hidden service should be
zero. However, this does not lead to the conclusion that the
authorization data used at the introduction point and the hidden service
must be the same, but only that both authorization data should lead to
the same authorization result.
Authorization data is transmitted from client to server via an
INTRODUCE2 cell that is forwarded by the introduction point. There are
versions 0 to 2 specified in section 1.8 of rend-spec, but none of these
contain fields for carrying authorization data. We propose a slightly
modified version of v3 INTRODUCE2 cells that is specified in section
1.8.1 and which is not implemented as of December 2007. In contrast to
the specified v3 we avoid specifying (and implementing) IPv6 capabilities,
because Tor relays will be required to support IPv4 addresses for a long
time to come, so this seems unnecessary at the moment. The
proposed format of v3 INTRODUCE2 cells is as follows:
VER Version byte: set to 3. [1 octet]
AUTHT The auth type that is used [1 octet]
AUTHL Length of auth data [2 octets]
AUTHD Auth data [variable]
TS Timestamp (seconds since 1-1-1970) [4 octets]
IP Rendezvous point's address [4 octets]
PORT Rendezvous point's OR port [2 octets]
ID Rendezvous point identity ID [20 octets]
KLEN Length of onion key [2 octets]
KEY Rendezvous point onion key [KLEN octets]
RC Rendezvous cookie [20 octets]
g^x Diffie-Hellman data, part 1 [128 octets]
The maximum possible length of authorization data is related to the
enclosing INTRODUCE1V cell. A v3 INTRODUCE2 cell with a 1024-bit
(= 128-octet) public key and without any authorization data occupies
306 octets (AUTHL is only used when AUTHT has a value != 0),
plus 58 octets for hybrid public key encryption (see
section 5.1 of tor-spec on hybrid encryption of CREATE cells). The
surrounding INTRODUCE1V cell requires 24 octets. This leaves only 110
of the 498 available octets free, which must be shared between
authorization data to the introduction point _and_ to the hidden
service.
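The octet budget above can be checked with a short sketch; the constants below simply restate the figures from this section (498 usable RELAY payload octets, a 306-octet base INTRODUCE2 cell, 58 octets of hybrid-encryption overhead, and a 24-octet INTRODUCE1V cleartext part):

```python
# Octet budget for authorization data, using the figures from this section.
RELAY_PAYLOAD = 498      # usable octets in a RELAY cell
INTRODUCE2_BASE = 306    # v3 INTRODUCE2 cell without any authorization data
HYBRID_OVERHEAD = 58     # hybrid public key encryption (tor-spec sec. 5.1)
INTRODUCE1V_HEADER = 24  # cleartext part of the surrounding INTRODUCE1V cell

free_for_auth = RELAY_PAYLOAD - (INTRODUCE2_BASE + HYBRID_OVERHEAD
                                 + INTRODUCE1V_HEADER)
print(free_for_auth)  # octets shared between both authorization points
```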
When receiving a v3 INTRODUCE2 cell, Bob checks whether a client has
provided valid authorization data to him. He also requires that the
timestamp is no more than 30 minutes in the past or future and that the
first part of the Diffie-Hellman handshake has not been used in the past
60 minutes to prevent replay attacks by rogue introduction points. (The
reason for not using the rendezvous cookie to detect replays---even
though it is only sent once in the current design---is that it might be
desirable to re-use rendezvous cookies for multiple introduction requests
in the future.) If all checks pass, Bob builds a circuit to the provided
rendezvous point. Otherwise he drops the cell.
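As a rough sketch of those checks, assuming an in-memory cache keyed by the first part of the DH handshake (the function and cache names are illustrative, not Tor's actual code):

```python
import time

TS_SKEW = 30 * 60            # timestamp must be within 30 minutes
DH_REPLAY_WINDOW = 60 * 60   # g^x must not have been seen in the past hour

seen_dh = {}  # g^x (bytes) -> time first seen

def introduce2_acceptable(timestamp, g_x, now=None):
    """Timestamp and replay checks for a received v3 INTRODUCE2 cell."""
    now = time.time() if now is None else now
    # Expire old entries from the replay cache.
    for key in [k for k, t in seen_dh.items() if now - t > DH_REPLAY_WINDOW]:
        del seen_dh[key]
    if abs(now - timestamp) > TS_SKEW:
        return False  # timestamp too far in the past or future
    if g_x in seen_dh:
        return False  # DH handshake already used: replayed cell
    seen_dh[g_x] = now
    return True
```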
1.4. Summary of authorization data fields
In summary, the proposed descriptor format and cell formats provide the
following fields for carrying authorization data:
(1) The v2 hidden service descriptor contains:
- a descriptor cookie that is used for the lookup process, and
- an arbitrary encryption schema to ensure authorization to access
introduction information (currently symmetric encryption with the
descriptor cookie).
(2) For performing authorization at the introduction point we can use:
- the fields intro-authorization and service-authorization in
hidden service descriptors,
- a maximum of 215 octets in the ESTABLISH_INTRO cell, and
- one part of 110 octets in the INTRODUCE1V cell.
(3) For performing authorization at the hidden service we can use:
- the fields intro-authorization and service-authorization in
hidden service descriptors,
- the other part of 110 octets in the INTRODUCE2 cell.
It will also still be possible to access a hidden service without any
authorization or only use a part of the authorization infrastructure.
However, this requires considering all parts of the infrastructure. For
example, authorization at the introduction point relying on confidential
intro-authorization data transported in the hidden service descriptor
cannot be performed without using an encryption schema for introduction
information.
1.5. Managing authorization data at servers and clients
In order to provide authorization data at the hidden service and the
authenticated clients, we propose to use files---either the Tor
configuration file or separate files. The exact format of these special
files depends on the authorization protocol used.
Currently, rend-spec proposes encoding client-side
authorization data in the URL, as in x.y.z.onion. This was never used
and is also a bad idea, because in the case of HTTP the requested URL may be
contained in the Host and Referer fields.
1.6. Limitations for authorization protocols
There are two limitations of the current hidden service protocol for
authorization protocols that shall be identified here.
1. The three cell types ESTABLISH_INTRO, INTRODUCE1V, and INTRODUCE2
restrict the amount of data that can be used for authorization.
This forces authorization protocols that require per-user
authorization data at the introduction point to restrict the number
of authorized clients artificially. A possible solution could be to
split contents among multiple cells and reassemble them at the
introduction points.
2. The current hidden service protocol does not specify cell types to
perform interactive authorization between client and introduction
point or hidden service. If there should be an authorization
protocol that requires interaction, new cell types would have to be
defined and integrated into the hidden service protocol.
2. Specific authorization protocol instances
In the following we present two specific authorization protocols that
make use of (parts of) the new authorization infrastructure:
1. The first protocol allows a service provider to restrict access
to clients with a previously received secret key only, but does not
attempt to hide service activity from others.
2. The second protocol, albeit being feasible for a limited set of about
16 clients, performs client authorization and hides service activity
from everyone but the authorized clients.
These two protocol instances extend the existing hidden service protocol
version 2. Hidden services that perform client authorization may run in
parallel to other services running versions 0, 2, or both.
2.1. Service with large-scale client authorization
The first client authorization protocol aims at performing access control
while consuming as few additional resources as possible. A service
provider should be able to permit access to a large number of clients
while denying access for everyone else. However, the price for
scalability is that the service won't be able to hide its activity from
unauthorized or formerly authorized clients.
The main idea of this protocol is to encrypt the introduction-point part
in hidden service descriptors to authorized clients using symmetric keys.
This ensures that nobody else but authorized clients can learn which
introduction points a service currently uses, nor can someone send a
valid INTRODUCE1 message without knowing the introduction key. Therefore,
a subsequent authorization at the introduction point is not required.
A service provider generates symmetric "descriptor cookies" for his
clients and distributes them outside of Tor. The suggested key size is
128 bits, so that descriptor cookies can be encoded in 22 base64 chars
(which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the
authorization type (here: "0") and allow a client to distinguish this
authorization protocol from others like the one proposed below).
Typically, the contact information for a hidden service using this
authorization protocol looks like this:
v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz
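One way to pack a 128-bit cookie plus the 4-bit authorization type into such a 22-character string is sketched below; the exact bit layout chosen here is an assumption for illustration, not taken from the proposal:

```python
import base64

def encode_contact_cookie(descriptor_cookie, auth_type):
    """Pack a 128-bit descriptor cookie and a 4-bit auth type into
    22 base64 characters (22 * 6 = 132 bits). Assumed layout: the
    auth type occupies the 4 bits immediately after the cookie."""
    assert len(descriptor_cookie) == 16 and 0 <= auth_type < 16
    raw = descriptor_cookie + bytes([auth_type << 4])  # 17 octets = 136 bits
    return base64.b64encode(raw).decode("ascii")[:22]  # keep first 132 bits
```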
When generating a hidden service descriptor, the service encrypts the
introduction-point part with a single randomly generated symmetric
128-bit session key using AES-CTR as described for v2 hidden service
descriptors in rend-spec. Afterwards, the service encrypts the session
key to all descriptor cookies using AES. Each authorized client should
be able to efficiently find the session key that is encrypted for
him/her; therefore, 4-octet client IDs are generated from the descriptor
cookie and initialization vector. Descriptors always contain a number of
encrypted session keys that is a multiple of 16 by adding fake entries.
Encrypted session keys are ordered by client IDs in order to conceal
addition or removal of authorized clients by the service provider.
ATYPE Authorization type: set to 1. [1 octet]
ALEN Number of clients := 1 + ((clients - 1) div 16) [1 octet]
for each symmetric descriptor cookie:
ID Client ID: H(descriptor cookie | IV)[:4] [4 octets]
SKEY Session key encrypted with descriptor cookie [16 octets]
(end of client-specific part)
RND Random data [(15 - ((clients - 1) mod 16)) * 20 octets]
IV AES initialization vector [16 octets]
IPOS Intro points, encrypted with session key [remaining octets]
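The client-specific part of this format could be assembled as follows. This is a sketch only: H is SHA-1 as elsewhere in rend-spec, but the AES encryption of the session key is replaced by a placeholder, and the trailing IV and IPOS fields are omitted:

```python
import hashlib
import os

def build_client_auth_section(descriptor_cookies, iv, session_key):
    """Build ATYPE, ALEN, the sorted client entries, and the RND padding
    from the format above. NOT usable as-is: the session-key encryption
    is a placeholder."""
    n = len(descriptor_cookies)
    entries = []
    for cookie in descriptor_cookies:
        client_id = hashlib.sha1(cookie + iv).digest()[:4]
        encrypted_skey = session_key  # placeholder for AES(cookie, session_key)
        entries.append(client_id + encrypted_skey)  # 4 + 16 = 20 octets each
    entries.sort()  # order by client ID to conceal client addition/removal
    fake = 15 - ((n - 1) % 16)  # pad entry count up to a multiple of 16
    rnd = os.urandom(fake * 20)
    # ATYPE = 1; ALEN = 1 + ((clients - 1) div 16)
    return bytes([1, 1 + (n - 1) // 16]) + b"".join(entries) + rnd
```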
An authorized client needs to configure Tor to use the descriptor cookie
when accessing the hidden service. Therefore, a user adds the contact
information that she received from the service provider to her torrc
file. Upon downloading a hidden service descriptor, Tor finds the
encrypted introduction-point part and attempts to decrypt it using the
configured descriptor cookie. (In the rare event of two or more client
IDs being equal a client tries to decrypt all of them.)
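The client-side lookup can be sketched as a scan over the 20-octet entries; the helper name is illustrative:

```python
import hashlib

def find_candidate_entries(section_entries, my_cookie, iv):
    """Compute our 4-octet client ID (H(descriptor cookie | IV)[:4]) and
    return every entry whose ID matches; the caller then attempts to
    decrypt each candidate's session key (usually just one)."""
    my_id = hashlib.sha1(my_cookie + iv).digest()[:4]
    return [e for e in section_entries if e[:4] == my_id]
```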
Upon sending the introduction, the client includes her descriptor cookie
as auth type "1" in the INTRODUCE2 cell that she sends to the service.
The hidden service checks whether the included descriptor cookie is
authorized to access the service and either responds to the introduction
request, or not.
2.2. Authorization for limited number of clients
A second, more sophisticated client authorization protocol goes the extra
mile of hiding service activity from unauthorized clients. With all else
being equal to the preceding authorization protocol, the second protocol
publishes hidden service descriptors for each user separately and
encrypts the introduction-point part of each descriptor for a single
client only. This allows the service to stop publishing descriptors for
removed clients. As long as a removed client cannot link descriptors
issued for other clients to the service, it cannot derive service
activity any more. The downside of this approach is limited scalability.
Even though the distributed storage of descriptors (cf. proposal 114)
tackles the problem of limited scalability to a certain extent, this
protocol should not be used for services with more than 16 clients. (In
fact, Tor should refuse to advertise services for more than this number
of clients.)
A hidden service generates an asymmetric "client key" and a symmetric
"descriptor cookie" for each client. The client key is used as
replacement for the service's permanent key, so that the service uses a
different identity for each of his clients. The descriptor cookie is used
to store descriptors at changing directory nodes that are unpredictable
for anyone but service and client, to encrypt the introduction-point
part, and to be included in INTRODUCE2 cells. Once the service has
created the client key and descriptor cookie, he passes them to the
client outside of Tor. The contact information string looks similar to
the one
used by the preceding authorization protocol (with the only difference
that it has "1" encoded as auth-type in the remaining 4 of 132 bits
instead of "0" as before).
When creating a hidden service descriptor for an authorized client, the
hidden service uses the client key and descriptor cookie to compute
secret ID part and descriptor ID:
secret-id-part = H(time-period | descriptor-cookie | replica)
descriptor-id = H(client-key[:10] | secret-id-part)
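With H instantiated as SHA-1, the two formulas translate directly; the byte encodings assumed here (4-octet big-endian time period, 1-octet replica, a client-key digest truncated to 10 octets) are illustrative choices, not fixed by this excerpt:

```python
import hashlib

def descriptor_id(client_key_digest, descriptor_cookie, time_period, replica):
    # secret-id-part = H(time-period | descriptor-cookie | replica)
    secret_id_part = hashlib.sha1(
        time_period.to_bytes(4, "big") + descriptor_cookie + bytes([replica])
    ).digest()
    # descriptor-id = H(client-key[:10] | secret-id-part)
    return hashlib.sha1(client_key_digest[:10] + secret_id_part).digest()
```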
The hidden service also replaces permanent-key in the descriptor with
client-key and encrypts introduction-points with the descriptor cookie.
ATYPE Authorization type: set to 2. [1 octet]
IV AES initialization vector [16 octets]
IPOS Intro points, encr. with descriptor cookie [remaining octets]
When uploading descriptors, the hidden service needs to make sure that
descriptors for different clients are not uploaded at the same time (cf.
Section 1.1) which is also a limiting factor for the number of clients.
When a client is requested to establish a connection to a hidden service
it looks up whether it has any authorization data configured for that
service. If the user has configured authorization data for authorization
protocol "2", the descriptor ID is determined as described in the last
paragraph. Upon receiving a descriptor, the client decrypts the
introduction-point part using its descriptor cookie. Further, the client
includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that
it sends to the service.
2.3. Hidden service configuration
A hidden service that is meant to perform client authorization adds a
new option HiddenServiceAuthorizeClient to its hidden service
configuration. This option contains the authorization type, which is
either "1" for the protocol described in Section 2.1 or "2" for the
protocol in Section 2.2, and a comma-separated list of human-readable
client names, so that Tor can create authorization data for these clients:
HiddenServiceAuthorizeClient auth-type client-name,client-name,...
If this option is configured, HiddenServiceVersion is automatically
reconfigured to contain only version numbers of 2 or higher.
Tor stores all generated authorization data for the authorization
protocols described in Sections 2.1 and 2.2 in a new file using the
following file format:
"client-name" human-readable client identifier NL
"descriptor-cookie" 128-bit key ^= 22 base64 chars NL
If the authorization protocol of Section 2.2 is used, Tor also generates
and stores the following data:
"client-key" NL a public key in PEM format
2.4. Client configuration
Clients need to make their authorization data known to Tor using another
configuration option that contains a service name (mainly for the sake of
convenience), the service address, and the descriptor cookie that is
required to access a hidden service (the authorization protocol number is
encoded in the descriptor cookie):
HidServAuth service-name service-address descriptor-cookie
Security implications:
In the following we want to discuss possible attacks by dishonest
entities in the presented infrastructure and specific protocol. These
security implications would have to be verified once more when adding
another protocol. The dishonest entities (theoretically) include the
hidden service itself, the authenticated clients, hidden service directory
nodes, introduction points, and rendezvous points. The relays that are
part of circuits used during protocol execution, but never learn about
the exchanged descriptors or cells by design, are not considered.
Obviously, this list makes no claim to be complete. The discussed attacks
are sorted by the difficulty to perform them, in ascending order,
starting with roles that everyone could attempt to take and ending with
partially trusted entities abusing the trust put in them.
(1) A hidden service directory could attempt to conclude presence of a
service from the existence of a locally stored hidden service descriptor:
This passive attack is possible only for a single client-service
relation, because descriptors need to contain a publicly visible
signature of the service using the client key.
A possible protection would be to increase the number of hidden service
directories in the network.
(2) A hidden service directory could try to break the descriptor cookies
of locally stored descriptors: This attack can be performed offline. The
only useful countermeasure against it might be using safe passwords that
are generated by Tor.
[passwords? where did those come in? -RD]
(3) An introduction point could try to identify the pseudonym of the
hidden service on behalf of which it operates: This is impossible by
design, because the service uses a fresh public key for every
establishment of an introduction point (see proposal 114) and the
introduction point receives a fresh introduction cookie, so that there is
no identifiable information about the service that the introduction point
could learn. The introduction point cannot even tell if client accesses
belong to the same client or not, nor can it know the total number of
authorized clients. The only information might be the pattern of
anonymous client accesses, but that is hardly enough to reliably identify
a specific service.
(4) An introduction point could want to learn the identities of accessing
clients: This is also impossible by design, because all clients use the
same introduction cookie for authorization at the introduction point.
(5) An introduction point could try to replay a correct INTRODUCE1 cell
to other introduction points of the same service, e.g. in order to force
the service to create a huge number of useless circuits: This attack is
not possible by design, because INTRODUCE1 cells are encrypted using a
freshly created introduction key that is only known to authorized
clients.
(6) An introduction point could attempt to replay a correct INTRODUCE2
cell to the hidden service, e.g. for the same reason as in the last
attack: This attack is stopped by the fact that a service will drop
INTRODUCE2 cells containing a DH handshake it has seen recently.
(7) An introduction point could block client requests by sending either
positive or negative INTRODUCE_ACK cells back to the client, but without
forwarding INTRODUCE2 cells to the server: This attack is an annoyance
for clients, because they might wait for a timeout to elapse until trying
another introduction point. However, this attack is not introduced by
performing authorization and it cannot be targeted towards a specific
client. A countermeasure might be for the server to periodically perform
introduction requests to his own service to see if introduction points
are working correctly.
(8) The rendezvous point could attempt to identify either server or
client: This remains impossible as it was before, because the
rendezvous cookie does not contain any identifiable information.
(9) An authenticated client could swamp the server with valid INTRODUCE1
and INTRODUCE2 cells, e.g. in order to force the service to create
useless circuits to rendezvous points; as opposed to an introduction
point replaying the same INTRODUCE2 cell, a client could include a new
rendezvous cookie for every request: The countermeasure for this attack
is the restriction to 10 connection establishments per client per hour.
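That 10-per-hour restriction could be tracked per descriptor cookie, e.g. with a sliding window; this is an illustrative sketch, not Tor's implementation:

```python
import time

INTRO_LIMIT = 10    # introduction requests allowed per client
WINDOW = 60 * 60    # per hour

access_times = {}   # descriptor cookie -> list of recent request times

def allow_introduction(cookie, now=None):
    """Return True if this client is still under its hourly quota."""
    now = time.time() if now is None else now
    recent = [t for t in access_times.get(cookie, []) if now - t < WINDOW]
    if len(recent) >= INTRO_LIMIT:
        access_times[cookie] = recent
        return False
    recent.append(now)
    access_times[cookie] = recent
    return True
```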
Compatibility:
An implementation of this proposal would require changes to hidden
services and clients to process authorization data and encode and
understand the new formats. However, both services and clients would
remain compatible to regular hidden services without authorization.
Implementation:
The implementation of this proposal can be divided into a number of
changes to hidden service and client side. There are no
changes necessary on directory, introduction, or rendezvous nodes. All
changes are marked with either [service] or [client] to denote on which
side they need to be made.
/1/ Configure client authorization [service]
- Parse configuration option HiddenServiceAuthorizeClient containing
authorized client names.
- Load previously created client keys and descriptor cookies.
- Generate missing client keys and descriptor cookies, add them to
client_keys file.
- Rewrite the hostname file.
- Keep client keys and descriptor cookies of authorized clients in
memory.
[- In case of reconfiguration, mark which client authorizations were
added and whether any were removed. This can be used later when
deciding whether to rebuild introduction points and publish new
hidden service descriptors. Not implemented yet.]
/2/ Publish hidden service descriptors [service]
- Create and upload hidden service descriptors for all authorized
clients.
[- See /1/ for the case of reconfiguration.]
/3/ Configure permission for hidden services [client]
- Parse configuration option HidServAuth containing service
authorization, store authorization data in memory.
/5/ Fetch hidden service descriptors [client]
- Look up client authorization upon receiving a hidden service request.
- Request hidden service descriptor ID including client key and
descriptor cookie. Only request v2 descriptors, no v0.
/6/ Process hidden service descriptor [client]
- Decrypt introduction points with descriptor cookie.
/7/ Create introduction request [client]
- Include descriptor cookie in INTRODUCE2 cell to introduction point.
- Pass descriptor cookie around between involved connections and
circuits.
/8/ Process introduction request [service]
- Read descriptor cookie from INTRODUCE2 cell.
- Check whether descriptor cookie is authorized for access, including
checking access counters.
- Log access for accountability.

Filename: 122-unnamed-flag.txt
Title: Network status entries need a new Unnamed flag
Author: Roger Dingledine
Created: 04-Oct-2007
Status: Closed
Implemented-In: 0.2.0.x
1. Overview:
Tor's directory authorities can give certain servers a "Named" flag
in the network-status entry, when they want to bind that nickname to
that identity key. This allows clients to specify a nickname rather
than an identity fingerprint and still be certain they're getting the
"right" server. As dir-spec.txt describes it,
Name X is bound to identity Y if at least one binding directory lists
it, and no directory binds X to some other Y'.
In practice, clients can refer to servers by nickname whether they are
Named or not; if they refer to nicknames that aren't Named, a complaint
shows up in the log asking them to use the identity key in the future
--- but it still works.
The problem? Imagine a Tor server with nickname Bob. Bob and his
identity fingerprint are registered in tor26's approved-routers
file, but none of the other authorities registered him. Imagine
there are several other unregistered servers also with nickname Bob
("the imposters").
While Bob is online, all is well: a) tor26 gives a Named flag to
the real one, and refuses to list the other ones; and b) the other
authorities list the imposters but don't give them a Named flag. Clients
who have all the network-statuses can compute which one is the real Bob.
But when the real Bob disappears and his descriptor expires? tor26
continues to refuse to list any of the imposters, and the other
authorities continue to list the imposters. Clients don't have any
idea that there exists a Named Bob, so they can ask for server Bob and
get one of the imposters. (A warning will also appear in their log,
but so what.)
2. The stopgap solution:
tor26 should start accepting and listing the imposters, but it should
assign them a new flag: "Unnamed".
This would produce three cases in terms of assigning flags in the consensus
networkstatus:
i) a router gets the Named flag in the v3 networkstatus if
a) it's the only router with that nickname that has the Named flag
out of all the votes, and
b) no vote lists it as Unnamed
else,
ii) a router gets the Unnamed flag if
a) some vote lists a different router with that nickname as Named, or
b) at least one vote lists it as Unnamed, or
c) there are other routers with the same nickname that are Unnamed
else,
iii) the router neither gets a Named nor an Unnamed flag.
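Rules (i)-(iii) can be expressed as a small vote-combining function. The vote representation (a dict mapping identity to nickname and flags) is an assumption for illustration, and rule (ii)(c) is applied only one level deep, using the Unnamed flag as voted rather than as computed:

```python
def consensus_name_flag(identity, nickname, votes):
    """Each vote: dict mapping identity -> (nickname, set of flags)."""
    named_ids = {rid for vote in votes
                 for rid, (nick, flags) in vote.items()
                 if nick == nickname and "Named" in flags}
    unnamed_here = any("Unnamed" in vote.get(identity, (None, set()))[1]
                       for vote in votes)
    # Rule (i): sole Named holder of this nickname, and no Unnamed vote.
    if named_ids == {identity} and not unnamed_here:
        return "Named"
    # Rule (ii): someone else is Named, or any vote says Unnamed, or
    # other routers with this nickname are (voted) Unnamed.
    others_unnamed = any("Unnamed" in flags
                         for vote in votes
                         for rid, (nick, flags) in vote.items()
                         if rid != identity and nick == nickname)
    if named_ids - {identity} or unnamed_here or others_unnamed:
        return "Unnamed"
    return None  # rule (iii): neither flag
```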
(This whole proposal is meant only for v3 dir flags; we shouldn't try
to backport it to the v2 dir world.)
Then client behavior is:
a) If there's a Bob with a Named flag, pick that one.
else b) If the Bobs don't have the Unnamed flag (notice that they should
either all have it, or none), pick one of them and warn.
else c) They all have the Unnamed flag -- no router found.
3. Problems not solved by this stopgap:
3.1. Naming authorities can go offline.
If tor26 is the only authority that provides a binding for Bob, when
tor26 goes offline we're back in our previous situation -- the imposters
can be referenced with a mere ignorable warning in the client's log.
If some other authority Names a different Bob, and tor26 goes offline,
then that other Bob becomes the unique Named Bob.
So be it. We should try to solve these one day, but there's no clear way
to do it that doesn't destroy usability in other ways, and if we want
to get the Unnamed flag into v3 network statuses we should add it soon.
3.2. V3 dir spec magnifies brief discrepancies.
Another point to notice is if tor26 names Bob(1), doesn't know about
Bob(2), but moria lists Bob(2). Then Bob(2) doesn't get an Unnamed flag
even if it should (and Bob(1) is not around).
Right now, in v2 dirs, the case where an authority doesn't know about
a server but the other authorities do know is rare. That's because
authorities periodically ask for other networkstatuses and then fetch
descriptors that are missing.
With v3, if that window occurs at the wrong time, it is extended for the
entire period. We could solve this by making the voting more complex,
but that doesn't seem worth it.
[3.3. Tor26 is only one tor26.
We need more naming authorities, possibly with some kind of auto-naming
feature. This is out-of-scope for this proposal -NM]
4. Changes to the v2 directory
Previously, v2 authorities that had a binding for a server named Bob did
not list any other server named Bob. This will change too:
Version 2 authorities will start listing all routers they know about,
whether they conflict with a name-binding or not: Servers for which
this authority has a binding will continue to be marked Named,
additionally all other servers of that nickname will be listed without the
Named flag (i.e. there will be no Unnamed flag in v2 status documents).
Clients already should handle having a named Bob alongside unnamed
Bobs correctly, and having the unnamed Bobs in the status file even
without the named server is no worse than the current status quo where
clients learn about those servers from other authorities.
The benefit of this is that an authority's opinion on a server like
Guard, Stable, Fast etc. can now be learned by clients even if that
specific authority has reserved that server's name for somebody else.
5. Other benefits:
This new flag will allow people to operate servers that happen to have
the same nickname as somebody who registered their server two years ago
and left soon after. Right now there are dozens of nicknames that are
registered on all three binding directory authorities, yet haven't been
running for years. While it's bad that these nicknames are effectively
blacklisted from the network, the really bad part is that this logic
is really unintuitive to prospective new server operators.

Filename: 123-autonaming.txt
Title: Naming authorities automatically create bindings
Author: Peter Palfrader
Created: 2007-10-11
Status: Closed
Implemented-In: 0.2.0.x
Overview:
Tor's directory authorities can give certain servers a "Named" flag
in the network-status entry, when they want to bind that nickname to
that identity key. This allows clients to specify a nickname rather
than an identity fingerprint and still be certain they're getting the
"right" server.
Authority operators name a server by adding their nickname and
identity fingerprint to the 'approved-routers' file. Historically
being listed in the file was required for a router, at first for being
listed in the directory at all, and later in order to be used by
clients as a first or last hop of a circuit.
Adding identities to the list of named routers so far has been a
manual, time consuming, and boring job. Given that and the fact that
the Tor network works just fine without named routers the last
authority to keep a current binding list stopped updating it well over
half a year ago.
Naming, if it were done, would serve a useful purpose however in that
users can have a reasonable expectation that the exit server Bob they
are using in their http://www.google.com.bob.exit/ URL is the same
Bob every time.
Proposal:
I propose that identity<->name binding be completely automated:
New bindings should be added after the router has been around for a
bit and their name has not been used by other routers, similarly names
that have not appeared on the network for a long time should be freed
in case a new router wants to use it.
The following rules are suggested:
i) If a named router has not been online for half a year, the
identity<->name binding for that name is removed. The nickname
is free to be taken by other routers now.
ii) If a router claims a certain nickname and
a) has been on the network for at least two weeks, and
b) that nickname is not yet linked to a different router, and
c) no other router has wanted that nickname in the last month,
a new binding should be created for this router and its desired
nickname.
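The two rules might be sketched as a periodic update pass; all data structures and field names here are illustrative (in particular `last_contention`, the last time some other router wanted the nickname):

```python
HALF_YEAR = 182 * 24 * 3600
TWO_WEEKS = 14 * 24 * 3600
ONE_MONTH = 30 * 24 * 3600

def update_bindings(bindings, routers, now):
    """`bindings` maps nickname -> (identity, last_seen); `routers` is a
    list of dicts with the fields used below."""
    # Rule (i): free names whose holder has been gone for half a year.
    for nick in [n for n, (_, last) in bindings.items()
                 if now - last > HALF_YEAR]:
        del bindings[nick]
    # Rule (ii): bind a nickname to a router that has claimed it for at
    # least two weeks, uncontested for a month, with the name unbound.
    for r in routers:
        nick = r['nickname']
        if (nick not in bindings
                and now - r['first_seen'] >= TWO_WEEKS
                and now - r['last_contention'] >= ONE_MONTH):
            bindings[nick] = (r['identity'], now)
    return bindings
```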
This automaton does not necessarily need to live in the Tor code, it
can do its job just as well when it's an external tool.

Filename: 124-tls-certificates.txt
Title: Blocking resistant TLS certificate usage
Author: Steven J. Murdoch
Created: 2007-10-25
Status: Superseded
Overview:
To be less distinguishable from HTTPS web browsing, only Tor servers should
present TLS certificates. This should be done whilst maintaining backwards
compatibility with Tor nodes which present and expect client certificates, and
while preserving existing security properties. This specification describes
the negotiation protocol, what certificates should be presented during the TLS
negotiation, and how to move the client authentication within the encrypted
tunnel.
Motivation:
In Tor's current TLS [1] handshake, both client and server present a
two-certificate chain. Since TLS performs authentication prior to establishing
the encrypted tunnel, the contents of these certificates are visible to an
eavesdropper. In contrast, during normal HTTPS web browsing, the server
presents a single certificate, signed by a root CA and the client presents no
certificate. Hence it is possible to distinguish Tor from HTTP by identifying
this pattern.
To resist blocking based on traffic identification, Tor should behave as close
to HTTPS as possible, i.e. servers should offer a single certificate and not
request a client certificate; clients should present no certificate. This
presents two difficulties: clients are no longer authenticated and servers are
authenticated by the connection key, rather than identity key. The link
protocol must thus be modified to preserve the old security semantics.
Finally, in order to maintain backwards compatibility, servers must correctly
identify whether the client supports the modified certificate handling. This
is achieved by modifying the cipher suites that clients advertise support
for. These cipher suites are selected to be similar to those chosen by web
browsers, in order to resist blocking based on client hello.
Terminology:
Initiator: OP or OR which initiates a TLS connection ("client" in TLS
terminology)
Responder: OR which receives an incoming TLS connection ("server" in TLS
terminology)
Version negotiation and cipher suite selection:
In the modified TLS handshake, the responder does not request a certificate
from the initiator. This request would normally occur immediately after the
responder receives the client hello (the first message in a TLS handshake) and
so the responder must decide whether to request a certificate based only on
the information in the client hello. This is achieved by examining the cipher
suites in the client hello.
List 1: cipher suite lists offered by version 0/1 Tor
From src/common/tortls.c, revision 12086:
TLS1_TXT_DHE_RSA_WITH_AES_128_SHA
TLS1_TXT_DHE_RSA_WITH_AES_128_SHA:SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
Client hello sent by initiator:
Initiators supporting version 2 of the Tor connection protocol MUST
offer a different cipher suite list from those sent by pre-version 2
Tors, contained in List 1. To maintain compatibility with older Tor
versions and common browsers, the cipher suite list MUST include
support for:
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
Client hello received by responder/server hello sent by responder:
Responders supporting version 2 of the Tor connection protocol should compare
the cipher suite list in the client hello with those in List 1. If it matches
any in the list then the responder should assume that the initiator supports
version 1, and thus should maintain the version 1 behavior, i.e. send a
two-certificate chain, request a client certificate and do not send or expect
a VERSIONS cell [2].
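The responder's decision rule above can be sketched as follows. This is an illustrative sketch, not Tor's implementation; the OpenSSL text names for the two List 1 suites ("DHE-RSA-AES128-SHA" and "EDH-RSA-DES-CBC3-SHA") are assumed here.

```python
# Hypothetical sketch of the responder's version-detection rule: if the
# client hello's cipher list matches the legacy v1 list (List 1), fall
# back to v1 behavior; otherwise assume v2. Names are illustrative.

# The v1 cipher list from List 1, by OpenSSL text name (assumed mapping).
V1_CIPHER_LIST = (
    "DHE-RSA-AES128-SHA",      # TLS1_TXT_DHE_RSA_WITH_AES_128_SHA
    "EDH-RSA-DES-CBC3-SHA",    # SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
)

def responder_protocol_version(client_hello_ciphers):
    """Return 1 if the client hello matches the legacy v1 list, else 2."""
    if tuple(client_hello_ciphers) == V1_CIPHER_LIST:
        return 1
    return 2
```

A browser-like cipher list (as in Appendix A) would not match List 1, so the responder would proceed with v2 behavior.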
Otherwise, the responder should assume version 2 behavior and select a cipher
suite following TLS [1] behavior, i.e. select the first entry from the client
hello cipher list which is acceptable. Responders MUST NOT select any suite
that lacks ephemeral keys, or whose symmetric keys are less than KEY_LEN bits,
or whose digests are less than HASH_LEN bits. Implementations SHOULD NOT
allow other SSLv3 ciphersuites.
Should no mutually acceptable cipher suite be found, the connection MUST be
closed.
If the responder is implementing version 2 of the connection protocol it
SHOULD send a server certificate with random contents. The organizationName
field MUST NOT be "Tor", "TOR" or "t o r".
Server certificate received by initiator:
If the server certificate has an organizationName of "Tor", "TOR" or "t o r",
the initiator should assume that the responder does not support version 2 of
the connection protocol. In that case the initiator should respond following
version 1, i.e. send a two-certificate client chain and neither send nor expect
a VERSIONS cell.
[SJM: We could also use the fact that a client certificate request was sent]
If the server hello contains a ciphersuite which does not comply with the key
length requirements above, even if it was one offered in the client hello, the
connection MUST be closed. This will only occur if the responder is not a Tor
server.
Backward compatibility:
v1 Initiator, v1 Responder: No change
v1 Initiator, v2 Responder: Responder detects v1 initiator by client hello
v2 Initiator, v1 Responder: Responder accepts v2 client hello. Initiator
detects v1 server certificate and continues with v1 protocol
v2 Initiator, v2 Responder: Responder accepts v2 client hello. Initiator
detects v2 server certificate and continues with v2 protocol.
Additional link authentication process:
Following VERSION and NETINFO negotiation, both responder and
initiator MUST send a certificate chain in a CERT cell. If one
party does not have a certificate, the CERT cell MUST still be sent,
but with a length of zero.
A CERT cell is a variable length cell, of the format
CircID [2 bytes]
Command [1 byte]
Length [2 bytes]
Payload [<length> bytes]
CircID MUST be set to 0x0000
Command is [SJM: TODO]
Length is the length of the payload
Payload contains 0 or more certificates, each is of the format:
Cert_Length [2 bytes]
Certificate [<cert_length> bytes]
Each certificate MUST sign the one preceding it. The initiator MUST
place its connection certificate first; the responder, having
already sent its connection certificate as part of the TLS handshake,
MUST place its identity certificate first.
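The CERT cell layout above can be made concrete with a short encoder. This is an illustrative sketch; CERT_COMMAND is a placeholder value, since the proposal leaves the command number as TODO.

```python
import struct

# Illustrative encoding of a CERT cell as described above: a 2-byte
# CircID (0x0000), a 1-byte command, a 2-byte payload length, then each
# certificate prefixed with its own 2-byte length.

CERT_COMMAND = 0x81  # placeholder; not assigned by this proposal

def encode_cert_cell(certs):
    """Encode a list of DER certificate byte strings as a CERT cell."""
    payload = b"".join(struct.pack(">H", len(c)) + c for c in certs)
    return struct.pack(">HBH", 0x0000, CERT_COMMAND, len(payload)) + payload
```

A zero-length chain (the OP case in step 4 of the protocol summary) simply yields a cell with Length 0 and no payload.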
Initiators who send a CERT cell MUST follow it with a LINK_AUTH
cell to prove that they possess the corresponding private key.
A LINK_AUTH cell is fixed-length, of the format:
CircID [2 bytes]
Command [1 byte]
Length [2 bytes]
Payload (padded with 0 bytes) [PAYLOAD_LEN - 2 bytes]
CircID MUST be set to 0x0000
Command is [SJM: TODO]
Length is the valid portion of the payload
Payload is of the format:
Signature version [1 byte]
Signature [<length> - 1 bytes]
Padding [PAYLOAD_LEN - <length> - 2 bytes]
Signature version: Identifies the type of signature, currently 0x00
Signature: Digital signature under the initiator's connection key of the
following item, in PKCS #1 block type 1 [3] format:
HMAC-SHA1, using the TLS master secret as key, of the
following elements concatenated:
- The signature version (0x00)
- The NUL terminated ASCII string: "Tor initiator certificate verification"
- client_random, as sent in the Client Hello
- server_random, as sent in the Server Hello
- SHA-1 hash of the initiator connection certificate
- SHA-1 hash of the responder connection certificate
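The MAC that the initiator signs can be sketched directly from the element list above. This is a hedged sketch: the master secret, randoms, and certificates are stand-in byte strings here, and a real implementation would obtain them from the TLS layer.

```python
import hashlib
import hmac

# Sketch of the HMAC-SHA1 the initiator signs, per the element list
# above: signature version, NUL-terminated label, client and server
# randoms, then SHA-1 hashes of both connection certificates.

def link_auth_mac(master_secret, client_random, server_random,
                  initiator_cert_der, responder_cert_der):
    msg = (b"\x00"                                           # signature version
           + b"Tor initiator certificate verification\x00"   # NUL-terminated
           + client_random
           + server_random
           + hashlib.sha1(initiator_cert_der).digest()
           + hashlib.sha1(responder_cert_der).digest())
    return hmac.new(master_secret, msg, hashlib.sha1).digest()
```

The resulting 20-byte MAC would then be signed under the initiator's connection key in PKCS #1 block type 1 format, which this sketch does not implement.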
Security checks:
- Before sending a LINK_AUTH cell, a node MUST ensure that the TLS
connection is authenticated by the responder key.
- For the handshake to have succeeded, the initiator MUST confirm:
- That the TLS handshake was authenticated by the
responder connection key
- That the responder connection key was signed by the first
certificate in the CERT cell
- That each certificate in the CERT cell was signed by the
following certificate, with the exception of the last
- That the last certificate in the CERT cell is the expected
identity certificate for the node being connected to
- For the handshake to have succeeded, the responder MUST confirm
either:
A) - A zero length CERT cell was sent and no LINK_AUTH cell was
sent
In which case the responder shall treat the identity of the
initiator as unknown
or
B) - That the LINK_AUTH MAC contains a signature by the first
certificate in the CERT cell
- That the MAC signed matches the expected value
- That each certificate in the CERT cell was signed by the
following certificate, with the exception of the last
In which case the responder shall treat the identity of the
initiator as that of the last certificate in the CERT cell
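The chain-validation rule shared by both checklists above (each certificate signed by the one following it, ending at the expected identity certificate) can be sketched as a pure function. The `signed_by` callable stands in for a real X.509 signature check, which this sketch does not implement.

```python
# Hedged sketch of the chain rule above: each certificate in the CERT
# cell must be signed by the one that follows it, and the last must be
# the expected identity certificate for the node being connected to.

def chain_is_valid(certs, expected_identity_cert, signed_by):
    """Return True if `certs` forms a valid chain ending at the
    expected identity certificate. `signed_by(cert, signer)` is a
    stand-in predicate for signature verification."""
    if not certs or certs[-1] != expected_identity_cert:
        return False
    return all(signed_by(certs[i], certs[i + 1])
               for i in range(len(certs) - 1))
```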
Protocol summary:
1. I(nitiator) <-> R(esponder): TLS handshake, including responder
authentication under connection certificate R_c
2. I <-> R: VERSION and NETINFO negotiation
3. R -> I: CERT (Responder identity certificate R_i (which signs R_c))
4. I -> R: CERT (Initiator connection certificate I_c,
Initiator identity certificate I_i (which signs I_c))
5. I -> R: LINK_AUTH (Signature, under I_c of HMAC-SHA1(master_secret,
"Tor initiator certificate verification" ||
client_random || server_random ||
I_c hash || R_c hash)
Notes: I -> R doesn't need to wait for R_i before sending its own
messages (reduces round-trips).
Certificate hash is calculated like identity hash in CREATE cells.
Initiator signature is calculated in a similar way to Certificate
Verify messages in TLS 1.1 (RFC4346, Sections 7.4.8 and 4.7).
If I is an OP, a zero-length certificate chain may be sent in step 4,
in which case step 5 is not performed.
Rationale:
- Version and netinfo negotiation before authentication: The version cell needs
to come before the rest of the protocol, since we may choose to alter
the rest at some later point, e.g. switch to a different MAC/signature scheme.
It is useful to keep the NETINFO and VERSION cells close to each other, since
the time between them is used to check if there is a delay-attack. Still, a
server might want to not act on NETINFO data from an initiator until the
authentication is complete.
Appendix A: Cipher suite choices
This specification intentionally does not put any constraints on the
TLS ciphersuite lists presented by clients, other than a minimum
required for compatibility. However, to maximize blocking
resistance, ciphersuite lists should be carefully selected.
Recommended client ciphersuite list
Source: http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslproto.h
0xc00a: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
0xc014: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA
0x0038: TLS_DHE_DSS_WITH_AES_256_CBC_SHA
0xc00f: TLS_ECDH_RSA_WITH_AES_256_CBC_SHA
0xc005: TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA
0x0035: TLS_RSA_WITH_AES_256_CBC_SHA
0xc007: TLS_ECDHE_ECDSA_WITH_RC4_128_SHA
0xc009: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
0xc011: TLS_ECDHE_RSA_WITH_RC4_128_SHA
0xc013: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA
0x0032: TLS_DHE_DSS_WITH_AES_128_CBC_SHA
0xc00c: TLS_ECDH_RSA_WITH_RC4_128_SHA
0xc00e: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA
0xc002: TLS_ECDH_ECDSA_WITH_RC4_128_SHA
0xc004: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA
0x0004: SSL_RSA_WITH_RC4_128_MD5
0x0005: SSL_RSA_WITH_RC4_128_SHA
0x002f: TLS_RSA_WITH_AES_128_CBC_SHA
0xc008: TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA
0xc012: TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA
0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
0xc00d: TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA
0xc003: TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA
0xfeff: SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA (168-bit Triple DES with RSA and a SHA1 MAC)
0x000a: SSL_RSA_WITH_3DES_EDE_CBC_SHA
Order specified in:
http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslenum.c#47
Recommended options:
0x0000: Server Name Indication [4]
0x000a: Supported Elliptic Curves [5]
0x000b: Supported Point Formats [5]
Recommended compression:
0x00
Recommended server ciphersuite selection:
The responder should select the first entry in this list which is
listed in the client hello:
0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA [ Common Firefox choice ]
0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA [ Tor v1 default ]
0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA [ Tor v1 fallback ]
0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA [ Valid IE option ]
References:
[1] The Transport Layer Security (TLS) Protocol, Version 1.1, RFC4346, IETF
[2] Version negotiation for the Tor protocol, Tor proposal 105
[3] B. Kaliski, "Public-Key Cryptography Standards (PKCS) #1:
RSA Cryptography Specifications Version 1.5", RFC 2313,
March 1998.
[4] TLS Extensions, RFC 3546
[5] Elliptic Curve Cryptography (ECC) Cipher Suites for Transport Layer Security (TLS)
Filename: 125-bridges.txt
Title: Behavior for bridge users, bridge relays, and bridge authorities
Author: Roger Dingledine
Created: 11-Nov-2007
Status: Closed
Implemented-In: 0.2.0.x
0. Preface
This document describes the design decisions around support for bridge
users, bridge relays, and bridge authorities. It acts as an overview
of the bridge design and deployment for developers, and it also tries
to point out limitations in the current design and implementation.
For more details on what all of these mean, look at blocking.tex in
/doc/design-paper/
1. Bridge relays
Bridge relays are just like normal Tor relays except they don't publish
their server descriptors to the main directory authorities.
1.1. PublishServerDescriptor
To configure your relay to be a bridge relay, just add
BridgeRelay 1
PublishServerDescriptor bridge
to your torrc. This will cause your relay to publish its descriptor
to the bridge authorities rather than to the default authorities.
Alternatively, you can say
BridgeRelay 1
PublishServerDescriptor 0
which will cause your relay to not publish anywhere. This could be
useful for private bridges.
1.2. Exit policy
Bridge relays should use an exit policy of "reject *:*". This is
because they only need to relay traffic between the bridge users
and the rest of the Tor network, so there's no need to let people
exit directly from them.
1.3. RelayBandwidthRate / RelayBandwidthBurst
We invented the RelayBandwidth* options for this situation: Tor clients
who want to allow relaying too. See proposal 111 for details. Relay
operators should feel free to rate-limit their relayed traffic.
1.4. Helping the user with port forwarding, NAT, etc.
Just as for operating normal relays, our documentation and hints for
how to make your ORPort reachable are inadequate for normal users.
We need to work harder on this step, perhaps in 0.2.2.x.
1.5. Vidalia integration
Vidalia has turned its "Relay" settings page into a tri-state
"Don't relay" / "Relay for the Tor network" / "Help censored users".
If you click the third choice, it forces your exit policy to reject *:*.
If all the bridges end up on port 9001, that's not so good. On the
other hand, putting the bridges on a low-numbered port in the Unix
world requires jumping through extra hoops. The current compromise is
that Vidalia makes the ORPort default to 443 on Windows, and 9001 on
other platforms.
At the bottom of the relay config settings window, Vidalia displays
the bridge identifier to the operator (see Section 3.1) so he can pass
it on to bridge users.
1.6. What if the default ORPort is already used?
If the user already has a webserver or some other application
bound to port 443, then Tor will fail to bind it and complain to the
user, probably in a cryptic way. Rather than just working on a better
error message (though we should do this), we should consider an
"ORPort auto" option that tells Tor to try to find something that's
bindable and reachable. This would also help us tolerate ISPs that
filter incoming connections on port 80 and port 443. But this should
be a different proposal, and can wait until 0.2.2.x.
2. Bridge authorities.
Bridge authorities are like normal directory authorities, except they
don't create their own network-status documents or votes. So if you
ask an authority for a network-status document or consensus, they
behave like a directory mirror: they give you one from one of the main
authorities. But if you ask the bridge authority for the descriptor
corresponding to a particular identity fingerprint, it will happily
give you the latest descriptor for that fingerprint.
To become a bridge authority, add these lines to your torrc:
AuthoritativeDirectory 1
BridgeAuthoritativeDir 1
Right now there's one bridge authority, running on the Tonga relay.
2.1. Exporting bridge-purpose descriptors
We've added a new purpose for server descriptors: the "bridge"
purpose. With the new router-descriptors file format that includes
annotations, it's easy to look through it and find the bridge-purpose
descriptors.
Currently we export the bridge descriptors from Tonga to the
BridgeDB server, so it can give them out according to the policies
in blocking.pdf.
2.2. Reachability/uptime testing
Right now the bridge authorities do active reachability testing of
bridges, so we know which ones to recommend for users.
But in the design document, we suggested that bridges should publish
anonymously (i.e. via Tor) to the bridge authority, so somebody watching
the bridge authority can't just enumerate all the bridges. But if we're
doing active measurement, the game is up. Perhaps we should back off on
this goal, or perhaps we should do our active measurement anonymously?
Answering this issue is scheduled for 0.2.1.x.
2.3. Migrating to multiple bridge authorities
Having only one bridge authority is both a trust bottleneck (if you
break into one place you learn about every single bridge we've got)
and a robustness bottleneck (when it's down, bridge users become sad).
Right now if we put up a second bridge authority, all the bridges would
publish to it, and (assuming the code works) bridge users would query
a random bridge authority. This resolves the robustness bottleneck,
but makes the trust bottleneck even worse.
In 0.2.2.x and later we should think about better ways to have multiple
bridge authorities.
3. Bridge users.
Bridge users are like ordinary Tor users except they use encrypted
directory connections by default, and they use bridge relays as both
entry guards (their first hop) and directory guards (the source of
all their directory information).
To become a bridge user, add the following line to your torrc:
UseBridges 1
and then add at least one "Bridge" line to your torrc based on the
format below.
3.1. Format of the bridge identifier.
The canonical format for a bridge identifier contains an IP address,
an ORPort, and an identity fingerprint:
bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
However, the identity fingerprint can be left out, in which case the
bridge user will connect to that relay and use it as a bridge regardless
of what identity key it presents:
bridge 128.31.0.34:9009
This might be useful for cases where only short bridge identifiers
can be communicated to bridge users.
In a future version we may also support bridge identifiers that are
only a key fingerprint:
bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
and the bridge user can fetch the latest descriptor from the bridge
authority (see Section 3.4).
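The three bridge-identifier forms above (address:port with fingerprint, address:port alone, and the possible future fingerprint-only form) can be handled by one small parser. This is an illustrative sketch, not Tor's torrc parser.

```python
# Illustrative parser for the "Bridge" line forms above. Returns an
# (address, port, fingerprint) tuple with None for absent fields; the
# fingerprint may be written in space-separated 4-character groups.

def parse_bridge_line(line):
    parts = line.split()
    if parts and parts[0].lower() == "bridge":
        parts = parts[1:]                    # drop the "bridge" keyword
    addr = port = None
    if parts and ":" in parts[0]:
        addr, portstr = parts[0].rsplit(":", 1)
        port = int(portstr)
        parts = parts[1:]
    fingerprint = "".join(parts) or None     # rejoin fingerprint groups
    return addr, port, fingerprint
```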
3.2. Bridges as entry guards
For now, bridge users add their bridge relays to their list of "entry
guards" (see path-spec.txt for background on entry guards). They are
managed by the entry guard algorithms exactly as if they were a normal
entry guard -- their keys and timing get cached in the "state" file,
etc. This means that when the Tor user starts up with "UseBridges"
disabled, he will skip past the bridge entries since they won't be
listed as up and usable in his networkstatus consensus. But to be clear,
the "entry_guards" list doesn't currently distinguish guards by purpose.
Internally, each bridge user keeps a smartlist of "bridge_info_t"
that reflects the "bridge" lines from his torrc along with a download
schedule (see Section 3.5 below). When he starts Tor, he attempts
to fetch a descriptor for each configured bridge (see Section 3.4
below). When he succeeds at getting a descriptor for one of the bridges
in his list, he adds it directly to the entry guard list using the
normal add_an_entry_guard() interface. Once a bridge descriptor has
been added, should_delay_dir_fetches() will stop delaying further
directory fetches, and the user begins to bootstrap his directory
information from that bridge (see Section 3.3).
Currently bridge users cache their bridge descriptors to the
"cached-descriptors" file (annotated with purpose "bridge"), but
they don't make any attempt to reuse descriptors they find in this
file. The theory is that either the bridge is available now, in which
case you can get a fresh descriptor, or it's not, in which case an
old descriptor won't do you much good.
We could disable writing out the bridge lines to the state file, if
we think this is a problem.
As an exception, if we get an application request when we have one
or more bridge descriptors but we believe none of them are running,
we mark them all as running again. This is similar to the exception
already in place to help long-idle Tor clients realize they should
fetch fresh directory information rather than just refuse requests.
3.3. Bridges as directory guards
In addition to using bridges as the first hop in their circuits, bridge
users also use them to fetch directory updates. Other than initial
bootstrapping to find a working bridge descriptor (see Section 3.4
below), all further non-anonymized directory fetches will be redirected
to the bridge.
This means that bridge relays need to have cached answers for all
questions the bridge user might ask. This makes the upgrade path
tricky --- for example, if we migrate to a v4 directory design, the
bridge user would need to keep using v3 so long as his bridge relays
only knew how to answer v3 queries.
In a future design, for cases where the user has enough information
to build circuits yet the chosen bridge doesn't know how to answer a
given query, we might teach bridge users to make an anonymized request
to a more suitable directory server.
3.4. How bridge users get their bridge descriptor
Bridge users can fetch bridge descriptors in two ways: by going directly
to the bridge and asking for "/tor/server/authority", or by going to
the bridge authority and asking for "/tor/server/fp/ID". By default,
they will only try the direct queries. If the user sets
UpdateBridgesFromAuthority 1
in his config file, then he will try querying the bridge authority
first for bridges where he knows a digest (if he only knows an IP
address and ORPort, then his only option is a direct query).
If the user has at least one working bridge, then he will do further
queries to the bridge authority through a full three-hop Tor circuit.
But when bootstrapping, he will make a direct begin_dir-style connection
to the bridge authority.
As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor
from the bridge authority and it returns a 404 not found, the user
will automatically fall back to trying a direct query. Therefore it is
recommended that bridge users always set UpdateBridgesFromAuthority,
since at worst it will delay their fetches a little bit and notify
the bridge authority of the identity fingerprint (but not location)
of their intended bridges.
3.5. Bridge descriptor retry schedule
Bridge users try to fetch a descriptor for each bridge (using the
steps in Section 3.4 above) on startup. Whenever they receive a
bridge descriptor, they reschedule a new descriptor download for 1
hour from then.
If on the other hand it fails, they try again after 15 minutes for the
first attempt, after 15 minutes for the second attempt, and after 60
minutes for subsequent attempts.
In 0.2.2.x we should come up with some smarter retry schedules.
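The schedule described above (reschedule one hour after a success; retry after 15 minutes for the first two failures, then 60 minutes) reduces to a small pure function. Times in minutes; the function names are illustrative.

```python
# The bridge-descriptor retry schedule above, as a pure function:
# on success the next fetch is one hour away; on failure the first
# two retries wait 15 minutes, and later ones wait 60.

def next_download_delay(succeeded, consecutive_failures):
    """Return the delay in minutes until the next descriptor fetch."""
    if succeeded:
        return 60
    return 15 if consecutive_failures <= 2 else 60
```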
3.6. Vidalia integration
Vidalia 0.0.16 has a checkbox in its Network config window called
"My ISP blocks connections to the Tor network." Users who click that
box change their configuration to:
UseBridges 1
UpdateBridgesFromAuthority 1
and should specify at least one Bridge identifier.
3.7. Do we need a second layer of entry guards?
If the bridge user uses the bridge as its entry guard, then the
triangulation attacks from Lasse and Paul's Oakland paper work to
locate the user's bridge(s).
Worse, this is another way to enumerate bridges: if the bridge users
keep rotating through second hops, then if you run a few fast servers
(and avoid getting considered an Exit or a Guard) you'll quickly get
a list of the bridges in active use.
That's probably the strongest reason why bridge users will need to
pick second-layer guards. Would this mean bridge users should switch
to four-hop circuits?
We should figure this out in the 0.2.1.x timeframe.
Filename: 126-geoip-reporting.txt
Title: Getting GeoIP data and publishing usage summaries
Author: Roger Dingledine
Created: 2007-11-24
Status: Closed
Implemented-In: 0.2.0.x
0. Status
In 0.2.0.x, this proposal is implemented to the extent needed to
address its motivations. See notes below with the text "RESOLUTION"
for details.
1. Background and motivation
Right now we can keep a rough count of Tor users, both total and by
country, by watching connections to a single directory mirror. Being
able to get usage estimates is useful both for our funders (to
demonstrate progress) and for our own development (so we know how
quickly we're scaling and can design accordingly, and so we know which
countries and communities to focus on more). This need for information
is the only reason we haven't deployed "directory guards" (think of
them like entry guards but for directory information; in practice,
it would seem that Tor clients should simply use their entry guards
as their directory guards; see also proposal 125).
With the move toward bridges, we will no longer be able to track Tor
clients that use bridges, since they use their bridges as directory
guards. Further, we need to be able to learn which bridges stop seeing
use from certain countries (and are thus likely blocked), so we can
avoid giving them out to other users in those countries.
Right now we already do GeoIP lookups in Vidalia: Vidalia draws relays
and circuits on its 'network map', and it performs anonymized GeoIP
lookups to its central servers to know where to put the dots. Vidalia
caches answers it gets -- to reduce delay, to reduce overhead on
the network, and to reduce anonymity issues where users reveal their
knowledge about the network through which IP addresses they ask about.
But with the advent of bridges, Tor clients are asking about IP
addresses that aren't in the main directory. In particular, bridge
users inform the central Vidalia servers about each bridge as they
discover it and their Vidalia tries to map it.
Also, we wouldn't mind letting Vidalia do a GeoIP lookup on the client's
own IP address, so it can provide a more useful map.
Finally, Vidalia's central servers leave users open to partitioning
attacks, even if they can't target specific users. Further, as we
start using GeoIP results for more operational or security-relevant
goals, such as avoiding or including particular countries in circuits,
it becomes more important that users can't be singled out in terms of
their IP-to-country mapping beliefs.
2. The available GeoIP databases
There are at least two classes of GeoIP database out there: "IP to
country", which tells us the country code for the IP address but
no more details, and "IP to city", which tells us the country code,
the name of the city, and some basic latitude/longitude guesses.
A recent ip-to-country.csv is 3421362 bytes. Compressed, it is 564252
bytes. A typical line is:
"205500992","208605279","US","USA","UNITED STATES"
http://ip-to-country.webhosting.info/node/view/5
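A lookup over rows like the one above is straightforward: each row gives the start and end of an IPv4 range as 32-bit integers plus a country code, so a binary search over the sorted range starts answers a query. This is an illustrative sketch of the db's use, not Tor's geoip code.

```python
import bisect
import csv
import io

# Sketch of an IP-to-country lookup over ip-to-country.csv rows like
# the example above: (range start, range end, country code).

def load_ranges(csv_text):
    rows = [(int(r[0]), int(r[1]), r[2])
            for r in csv.reader(io.StringIO(csv_text))]
    rows.sort()
    return rows

def country_for(ranges, ip_num):
    """Binary-search the sorted ranges for the one containing ip_num."""
    i = bisect.bisect_right([r[0] for r in ranges], ip_num) - 1
    if i >= 0 and ranges[i][0] <= ip_num <= ranges[i][1]:
        return ranges[i][2]
    return None
```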
Similarly, the maxmind GeoLite Country database is also about 500KB
compressed.
http://www.maxmind.com/app/geolitecountry
The maxmind GeoLite City database gives more fine-grained detail like
geo coordinates and city name. Vidalia currently makes use of this
information. On the other hand it's 16MB compressed. A typical line is:
206.124.149.146,Bellevue,WA,US,47.6051,-122.1134
http://www.maxmind.com/app/geolitecity
There are other databases out there, like
http://www.hostip.info/faq.html
http://www.webconfs.com/ip-to-city.php
that deserve more attention, but for now let's assume that all the db's
are around this size.
3. What we'd like to solve
Goal #1a: Tor relays collect IP-to-country user stats and publish
sanitized versions.
Goal #1b: Tor bridges collect IP-to-country user stats and publish
sanitized versions.
Goal #2a: Vidalia learns IP-to-city stats for Tor relays, for better
mapping.
Goal #2b: Vidalia learns IP-to-country stats for Tor relays, so the user
can pick countries for her paths.
Goal #3: Vidalia doesn't do external lookups on bridge relay addresses.
Goal #4: Vidalia resolves the Tor client's IP-to-country or IP-to-city
for better mapping.
Goal #5: Reduce partitioning opportunities where Vidalia central
servers can give different (distinguishing) responses.
4. Solution overview
Our goal is to allow Tor relays, bridges, and clients to learn enough
GeoIP information so they can do local private queries.
4.1. The IP-to-country db
Directory authorities should publish a "geoip" file that contains
IP-to-country mappings. Directory caches will mirror it, and Tor clients
and relays (including bridge relays) will fetch it. Thus we can solve
goals 1a and 1b (publish sanitized usage info). Controllers could also
use this to solve goal 2b (choosing path by country attributes). It
also solves goal 4 (learning the Tor client's country), though for
huge countries like the US we'd still need to decide where the "middle"
should be when we're mapping that address.
The IP-to-country details are described further in Sections 5 and
6 below.
[RESOLUTION: The geoip file in 0.2.0.x is not distributed through
Tor. Instead, it is shipped with the bundle.]
4.2. The IP-to-city db
In an ideal world, the IP-to-city db would be small enough that we
could distribute it in the above manner too. But for now, it is too
large. Here's where the design choice forks.
Option A: Vidalia should continue doing its anonymized IP-to-city
queries. Thus we can achieve goals 2a and 2b. We would solve goal
3 by only doing lookups on descriptors that are purpose "general"
(see Section 4.2.1 for how). We would leave goal 5 unsolved.
Option B: Each directory authority should keep an IP-to-city db,
lookup the value for each router it lists, and include that line in
the router's network-status entry. The network-status consensus would
then use the line that appears in the majority of votes. This approach
also solves goals 2a and 2b, goal 3 (Vidalia doesn't do any lookups
at all now), and goal 5 (reduced partitioning risks).
Option B has the advantage that Vidalia can simplify its operation,
and the advantage that this consensus IP-to-city data is available to
other controllers besides just Vidalia. But it has the disadvantage
that the networkstatus consensus becomes larger, even though most of
the GeoIP information won't change from one consensus to the next. Is
there another reasonable location for it that can provide similar
consensus security properties?
[RESOLUTION: IP-to-city is not supported.]
4.2.1. Controllers can query for router annotations
Vidalia needs to stop doing queries on bridge relay IP addresses.
It could do that by only doing lookups on descriptors that are in
the networkstatus consensus, but that precludes designs like Blossom
that might want to map its relay locations. The best answer is that it
should learn the router annotations, with a new controller 'getinfo'
command:
"GETINFO desc-annotations/id/<OR identity>"
which would respond with something like
@downloaded-at 2007-11-29 08:06:38
@source "128.31.0.34"
@purpose bridge
[We could also make the answer include the digest for the router in
question, which would enable us to ask GETINFO router-annotations/all.
Is this worth it? -RD]
Then Vidalia can avoid doing lookups on descriptors with purpose
"bridge". Even better would be to add a new annotation "@private true"
so Vidalia can know how to handle new purposes that we haven't created
yet. Vidalia could special-case "bridge" for now, for compatibility
with the current 0.2.0.x-alphas.
4.3. Recommendation
My overall recommendation is that we should implement 4.1 soon
(e.g. early in 0.2.1.x), and we can go with 4.2 option A for now,
with the hope that later we discover a better way to distribute the
IP-to-city info and can switch to 4.2 option B.
Below we discuss more how to go about achieving 4.1.
5. Publishing and caching the GeoIP (IP-to-country) database
Each v3 directory authority should put a copy of the "geoip" file in
its datadirectory. Then its network-status votes should include a hash
of this file (Recommended-geoip-hash: %s), and the resulting consensus
directory should specify the consensus hash.
There should be a new URL for fetching this geoip db (by "current.z"
for testing purposes, and by hash.z for typical downloads). Authorities
should fetch and serve the one listed in the consensus, even when they
vote for their own. This would argue for storing the cached version
in a better filename than "geoip".
Directory mirrors should keep a copy of this file available via the
same URLs.
We assume that the file would change at most a few times a month. Should
Tor ship with a bootstrap geoip file? An out-of-date geoip file may
open you up to partitioning attacks, but for the most part it won't
be that different.
There should be a config option to disable updating the geoip file,
in case users want to use their own file (e.g. they have a proprietary
GeoIP file they prefer to use). In that case we leave it up to the
user to update his geoip file out-of-band.
[XXX Should consider forward/backward compatibility, e.g. if we want
to move to a new geoip file format. -RD]
[RESOLUTION: Not done over Tor.]
6. Controllers use the IP-to-country db for mapping and for path building
Down the road, Vidalia could use the IP-to-country mappings for placing
on its map:
- The location of the client
- The location of the bridges, or other relays not in the
networkstatus, on the map.
- Any relays that it doesn't yet have an IP-to-city answer for.
Other controllers can also use it to set EntryNodes, ExitNodes, etc
in a per-country way.
To support these features, we need to export the IP-to-country data
via the Tor controller protocol.
Is it sufficient just to add a new GETINFO command?
GETINFO ip-to-country/128.31.0.34
250+ip-to-country/128.31.0.34="US","USA","UNITED STATES"
[RESOLUTION: Not done now, except for the getinfo command.]
6.1. Other interfaces
Robert Hogan has also suggested a
GETINFO relays-by-country/cn
as well as torrc options for ExitCountryCodes, EntryCountryCodes,
ExcludeCountryCodes, etc.
[RESOLUTION: Not implemented in 0.2.0.x. Fodder for a future proposal.]
7. Relays and bridges use the IP-to-country db for usage summaries
Once bridges have a GeoIP database locally, they can start to publish
sanitized summaries of client usage -- how many users they see and from
what countries. This might also be a more useful way for ordinary Tor
relays to convey the level of usage they see, which would allow us to
switch to using directory guards for all users by default.
But how to safely summarize this information without opening too many
anonymity leaks?
7.1 Attacks to think about
First, note that we need to have a large enough time window that we're
not aiding correlation attacks much. I hope 24 hours is enough. So
that means no publishing stats until you've been up at least 24 hours.
And you can't publish follow-up stats more often than every 24 hours,
or people could look at the differential.
Second, note that we need to be sufficiently vague about the IP
addresses we're reporting. We are hoping that just specifying the
country will be vague enough. But a) what about active attacks where
we convince a bridge to use a GeoIP db that labels each suspect IP
address as a unique country? We have to assume that the consensus GeoIP
db won't be malicious in this way. And b) could such singling-out
attacks occur naturally, for example because of countries that have
a very small IP space? We should investigate that.
7.2. Granularity of users
Do we only want to report countries that have a sufficient anonymity set
(that is, number of users) for the day? For example, we might avoid
listing any countries that have seen less than five addresses over
the 24 hour period. This approach would be helpful in reducing the
singling-out opportunities -- in the extreme case, we could imagine a
situation where one blogger from the Sudan used Tor on a given day, and
we can discover which entry guard she used.
But I fear that especially for bridges, seeing only one hit from a
given country in a given day may be quite common.
As a compromise, we should start out with an "Other" category in
the reported stats, which is the sum of unlisted countries; if that
category is consistently interesting, we can think harder about how
to get the right data from it safely.
But note that bridge summaries will not be made public individually,
since doing so would help people enumerate bridges. Whereas summaries
from normal relays will be public. So perhaps that means we can afford
to be more specific in bridge summaries? In particular, I'm thinking the
"other" category should be used by public relays but not for bridges
(or if it is, used with a lower threshold).
Even for countries that have many Tor users, we might not want to be
too specific about how many users we've seen. For example, we might
round down the number of users we report to the nearest multiple of 5.
My instinct for now is that this won't be that useful.
7.3 Other issues
Another note: we'll likely be overreporting in the case of users with
dynamic IP addresses: if they rotate to a new address over the course
of the day, we'll count them twice. So be it.
7.4. Where to publish the summaries?
We designed extrainfo documents for information like this. So they
should just be more entries in the extrainfo doc.
But if we want to publish summaries every 24 hours (no more often,
no less often), aren't we tied to the router descriptor publishing
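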
schedule? That is, if we publish a new router descriptor at the 18
hour mark, and nothing much has changed at the 24 hour mark, won't
the new descriptor get dropped as being "cosmetically similar", and
then nobody will know to ask about the new extrainfo document?
One solution would be to make and remember the 24 hour summary at the
24 hour mark, but not actually publish it anywhere until we happen to
publish a new descriptor for other reasons. If we happen to go down
before publishing a new descriptor, then so be it, at least we tried.
7.5. What if the relay is unreachable or goes to sleep?
Even if you've been up for 24 hours, if you were hibernating for 18
of them, then we're not getting as much fuzziness as we'd like. So
I guess that means that we need a 24-hour period of being "awake"
before we're willing to publish a summary. A similar attack works if
you've been awake but unreachable for the first 18 of the 24 hours. As
another example, a bridge that's on a laptop might be suspended for
some of each day.
This implies that some relays and bridges will never publish summary
stats, because they're not ever reliably working for 24 hours in
a row. If a significant percentage of our reporters end up being in
this boat, we should investigate whether we can accumulate 24 hours of
"usefulness", even if there are holes in the middle, and publish based
on that.
What other issues are like this? It seems that just moving to a new
IP address shouldn't be a reason to cancel stats publishing, assuming
we were usable at each address.
7.6. IP addresses that aren't in the geoip db
Some IP addresses aren't in the public geoip databases. In particular,
I've found that a lot of African countries are missing, but there
are also some common ones in the US that are missing, like parts of
Comcast. We could just lump unknown IP addresses into the "other"
category, but it might be useful to gather a general sense of how many
lookups are failing entirely, by adding a separate "Unknown" category.
We could also contribute back to the geoip db, by letting bridges set
a config option to report the actual IP addresses that failed their
lookup. Then the bridge authority operators can manually make sure
the correct answer will be in later geoip files. This config option
should be disabled by default.
7.7 Bringing it all together
So here's the plan:
24 hours after starting up (modulo Section 7.5 above), bridges and
relays should construct a daily summary of client countries they've
seen, including the above "Unknown" category (Section 7.6) as well.
Non-bridge relays lump all countries with less than K (e.g. K=5) users
into the "Other" category (see Sec 7.2 above), whereas bridge relays are
willing to list a country even when it has only one user for the day.
Whenever we have a daily summary on record, we include it in our
extrainfo document whenever we publish one. The daily summary we
remember locally gets replaced with a newer one when another 24
hours pass.
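The aggregation step of this plan can be sketched as follows; the value of K, the struct, and the "Other" marker are illustrative assumptions, not Tor code:

```c
#include <assert.h>
#include <string.h>
#include <stddef.h>

#define K_THRESHOLD 5  /* example value of K from Sec 7.2 */

typedef struct { char cc[3]; unsigned count; } cc_count_t;

/* Build the published summary: for non-bridge relays, fold any country
 * seen fewer than K times into an "Other" bucket.  "??" stands for the
 * separate Unknown category of Sec 7.6 and is always kept as-is.
 * Bridges list every country, even with a single user.  Returns the
 * number of entries written; out must have room for n+1 entries. */
static size_t
summarize(const cc_count_t *in, size_t n, int is_bridge, cc_count_t *out)
{
    size_t n_out = 0;
    unsigned other = 0;
    for (size_t i = 0; i < n; i++) {
        if (!is_bridge && in[i].count < K_THRESHOLD &&
            strcmp(in[i].cc, "??") != 0) {
            other += in[i].count;
        } else {
            out[n_out++] = in[i];
        }
    }
    if (other) {
        strcpy(out[n_out].cc, "O?");  /* hypothetical "Other" marker */
        out[n_out].count = other;
        n_out++;
    }
    return n_out;
}
```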
7.8. Some forward secrecy
How should we remember addresses locally? If we convert them into
country-codes immediately, we will count them again if we see them
again. On the other hand, we don't really want to keep a list hanging
around of all IP addresses we've seen in the past 24 hours.
Step one is that we should never write this stuff to disk. Keeping it
only in ram will make things somewhat better. Step two is to avoid
keeping any timestamps associated with it: rather than a rolling
24-hour window, which would require us to remember the various times
we've seen that address, we can instead just throw out the whole list
every 24 hours and start over.
We could hash the addresses, and then compare hashes when deciding if
we've seen a given address before. We could even do keyed hashes. Or
Bloom filters. But if our goal is to defend against an adversary
who steals a copy of our ram while we're running and then does
guess-and-check on whatever blob we're keeping, we're in bad shape.
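A minimal sketch of the keyed-hash-plus-reset idea discussed above; the hash here is a toy stand-in for a real keyed MAC such as SipHash, and as just noted it does not protect against an adversary who steals both the table and the key from RAM:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy keyed hash (FNV-1a seeded with a secret key), standing in for a
 * real MAC.  The point is only that the key is random, lives in RAM,
 * and is discarded along with the table every 24 hours. */
static uint64_t
keyed_hash(uint64_t key, uint32_t ip)
{
    uint64_t h = key ^ 14695981039346656037ULL;
    for (int i = 0; i < 4; i++) {
        h ^= (ip >> (8 * i)) & 0xff;
        h *= 1099511628211ULL;
    }
    return h;
}

#define SEEN_SLOTS 4096  /* example size; collisions slightly undercount */

typedef struct { uint64_t key; uint8_t slot[SEEN_SLOTS]; } seen_set_t;

/* Record ip; return 1 if it was not seen since the last reset. */
static int
seen_set_note(seen_set_t *s, uint32_t ip)
{
    uint64_t h = keyed_hash(s->key, ip) % SEEN_SLOTS;
    if (s->slot[h])
        return 0;
    s->slot[h] = 1;
    return 1;
}

/* Called every 24 hours: forget everything, including the key. */
static void
seen_set_reset(seen_set_t *s, uint64_t fresh_random_key)
{
    memset(s->slot, 0, sizeof(s->slot));
    s->key = fresh_random_key;
}
```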
We could drop the last octet of the IP address as soon as we see
it. That would cause us to undercount some users from cablemodem and
DSL networks that have a high density of Tor users. And it wouldn't
really help that much -- indeed, the extent to which it does help is
exactly the extent to which it makes our stats less useful.
Other ideas?
Filename: 127-dirport-mirrors-downloads.txt
Title: Relaying dirport requests to Tor download site / website
Author: Roger Dingledine
Created: 2007-12-02
Status: Draft
1. Overview
Some countries and networks block connections to the Tor website. As
time goes by, this will remain a problem and it may even become worse.
We have a big pile of mirrors (google for "Tor mirrors"), but few of
our users think to try a search like that. Also, many of these mirrors
might be automatically blocked since their pages contain words that
might cause them to get banned. And lastly, we can imagine a future
where the blockers are aware of the mirror list too.
Here we describe a new set of URLs for Tor's DirPort that will relay
connections from users to the official Tor download site. Rather than
trying to cache a bunch of new Tor packages (which is a hassle in terms
of keeping them up to date, and a hassle in terms of drive space used),
we instead just proxy the requests directly to Tor's /dist page.
Specifically, we should support
GET /tor/dist/$1
and
GET /tor/website/$1
2. Direct connections, one-hop circuits, or three-hop circuits?
We could relay the connections directly to the download site -- but
this produces recognizable outgoing traffic on the bridge or cache's
network, which will probably surprise our nice volunteers. (Is this
a good enough reason to discard the direct connection idea?)
Even if we don't do direct connections, should we do a one-hop
begindir-style connection to the mirror site (make a one-hop circuit
to it, then send a 'begindir' cell down the circuit), or should we do
a normal three-hop anonymized connection?
If these mirrors are mainly bridges, doing either a direct or a one-hop
connection creates another way to enumerate bridges. That would argue
for three-hop. On the other hand, downloading a 10+ megabyte installer
through a normal Tor circuit can't be fun. But if you're already getting
throttled a lot because you're in the "relayed traffic" bucket, you're
going to have to accept a slow transfer anyway. So three-hop it is.
Speaking of which, we would want to label this connection
as "relay" traffic for the purposes of rate limiting; see
connection_counts_as_relayed_traffic() and or_conn->client_used. This
will be a bit tricky though, because these connections will use the
bridge's guards.
3. Scanning resistance
One other goal we'd like to achieve, or at least not hinder, is making
it hard to scan large swaths of the Internet to look for responses
that indicate a bridge.
In general this is a really hard problem, so we shouldn't demand to
solve it here. But we can note that some bridges should open their
DirPort (and offer this functionality), and others shouldn't. Then
some bridges provide a download mirror while others can remain
scanning-resistant.
4. Integrity checking
If we serve this stuff in plaintext from the bridge, anybody in between
the user and the bridge can intercept and modify it. The bridge can too.
If we do an anonymized three-hop connection, the exit node can also
intercept and modify the exe it sends back.
Are we setting ourselves up for rogue exit relays, or rogue bridges,
that trojan our users?
Answer #1: Users need to do pgp signature checking. Not a very good
answer, a) because it's complex, and b) because they don't know the
right signing keys in the first place.
Answer #2: The mirrors could exit from a specific Tor relay, using the
'.exit' notation. This would make connections a bit more brittle, but
would resolve the rogue exit relay issue. We could even round-robin
among several, and the list could be dynamic -- for example, all the
relays with an Authority flag that allow exits to the Tor website.
Answer #3: The mirrors should connect to the main distribution site
via SSL. That way the exit relay can't influence anything.
Answer #4: We could suggest that users only use trusted bridges for
fetching a copy of Tor. Hopefully they heard about the bridge from a
trusted source rather than from the adversary.
Answer #5: What if the adversary is trawling for Tor downloads by
network signature -- either by looking for known bytes in the binary,
or by looking for "GET /tor/dist/"? It would be nice to encrypt the
connection from the bridge user to the bridge. And we can! The bridge
already supports TLS. Rather than initiating a TLS renegotiation after
connecting to the ORPort, the user should actually request a URL. Then
the ORPort can either pass the connection off as a linked conn to the
dirport, or renegotiate and become a Tor connection, depending on how
the client behaves.
5. Linked connections: at what level should we proxy?
Check out the connection_ap_make_link() function, as called from
directory.c. Tor clients use this to create a "fake" socks connection
back to themselves, and then they attach a directory request to it,
so they can launch directory fetches via Tor. We can piggyback on
this feature.
We need to decide if we're going to be passing the bytes back and
forth between the web browser and the main distribution site, or if
we're going to be actually acting like a proxy (parsing out the file
they want, fetching that file, and serving it back).
Advantages of proxying without looking inside:
- We don't need to build any sort of http support (including
continues, partial fetches, etc etc).
Disadvantages:
- If the browser thinks it's speaking http, are there easy ways
to pass the bytes to an https server and have everything work
correctly? At the least, it would seem that the browser would
complain about the cert. More generally, ssl wants to be negotiated
before the URL and headers are sent, yet we need to read the URL
and headers to know that this is a mirror request; so we have an
ordering problem here.
- Makes it harder to do caching later on, if we don't look at what
we're relaying. (It might be useful down the road to cache the
answers to popular requests, so we don't have to keep getting
them again.)
6. Outstanding problems
1) HTTP proxies already exist. Why waste our time cloning one
badly? When we clone existing stuff, we usually regret it.
2) It's overbroad. We only seem to need a secure get-a-tor feature,
and instead we're contemplating building a locked-down HTTP proxy.
3) It's going to add a fair bit of complexity to our code. We do
not currently implement HTTPS. We'd need to refactor lots of the
low-level connection stuff so that "SSL" and "Cell-based" were no
longer synonymous.
4) It's still unclear how effective this proposal would be in
practice. You need to know that this feature exists, which means
somebody needs to tell you about a bridge (mirror) address and tell
you how to use it. And if they're doing that, they could (e.g.) tell
you about a gmail autoresponder address just as easily, and then you'd
get better authentication of the Tor program to boot.
Filename: 128-bridge-families.txt
Title: Families of private bridges
Author: Roger Dingledine
Created: 2007-12-xx
Status: Dead
1. Overview
Proposal 125 introduced the basic notion of how bridge authorities,
bridge relays, and bridge users should behave. But it doesn't get into
the various mechanisms of how to distribute bridge relay addresses to
bridge users.
One of the mechanisms we have in mind is called 'families of bridges'.
If a bridge user knows about only one private bridge, and that bridge
shuts off for the night or gets a new dynamic IP address, the bridge
user is out of luck and needs to re-bootstrap manually or wait and
hope it comes back. On the other hand, if the bridge user knows about
a family of bridges, then as long as one of those bridges is still
reachable his Tor client can automatically learn about where the
other bridges have gone.
So in this design, a single volunteer could run multiple coordinated
bridges, or a group of volunteers could each run a bridge. We abstract
out the details of how these volunteers find each other and decide to
set up a family.
2. Other notes.
somebody needs to run a bridge authority
it needs to have a torrc option to publish networkstatuses of its bridges
it should also do reachability testing just of those bridges
people ask for the bridge networkstatus by asking for a url that
contains a password. (it's safe to do this because of begin_dir.)
so the bridge users need to know a) a password, and b) a bridge
authority line.
the bridge users need to know the bridge authority line.
the bridge authority needs to know the password.
3. Current state
I implemented a BridgePassword config option. Bridge authorities
should set it, and users who want to use those bridge authorities
should set it.
Now there is a new directory URL "/tor/networkstatus-bridges" that
directory mirrors serve if BridgeAuthoritativeDir is set and it's a
begin_dir connection. It looks for the header
Authorization: Basic %s
where %s is the base-64 bridge password.
I never got around to teaching clients how to set the header though,
so it may or may not work, and may or may not do what we ultimately want.
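For illustration, the missing client-side piece amounts to Base64-encoding the password into the header above. This is standard HTTP Basic encoding, not code from Tor:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <stddef.h>

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Standard Base64 with '=' padding; out must hold 4*ceil(len/3)+1 bytes. */
static void
base64_encode(const unsigned char *in, size_t len, char *out)
{
    size_t i = 0, o = 0;
    while (i + 2 < len) {
        uint32_t v = (uint32_t)in[i] << 16 | in[i+1] << 8 | in[i+2];
        out[o++] = b64_alphabet[(v >> 18) & 63];
        out[o++] = b64_alphabet[(v >> 12) & 63];
        out[o++] = b64_alphabet[(v >> 6) & 63];
        out[o++] = b64_alphabet[v & 63];
        i += 3;
    }
    if (len - i == 1) {
        uint32_t v = (uint32_t)in[i] << 16;
        out[o++] = b64_alphabet[(v >> 18) & 63];
        out[o++] = b64_alphabet[(v >> 12) & 63];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {
        uint32_t v = (uint32_t)in[i] << 16 | in[i+1] << 8;
        out[o++] = b64_alphabet[(v >> 18) & 63];
        out[o++] = b64_alphabet[(v >> 12) & 63];
        out[o++] = b64_alphabet[(v >> 6) & 63];
        out[o++] = '=';
    }
    out[o] = '\0';
}

/* Format the header the directory code looks for. */
static void
make_auth_header(const char *password, char *out, size_t outlen)
{
    char b64[256];  /* assumes a reasonably short password */
    base64_encode((const unsigned char *)password, strlen(password), b64);
    snprintf(out, outlen, "Authorization: Basic %s", b64);
}
```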
I've marked this proposal dead; it really never should have left the
ideas/ directory. Somebody should pick it up sometime and finish the
design and implementation.
Filename: 129-reject-plaintext-ports.txt
Title: Block Insecure Protocols by Default
Author: Kevin Bauer & Damon McCoy
Created: 2008-01-15
Status: Closed
Implemented-In: 0.2.0.x
Overview:
Below is a proposal to mitigate insecure protocol use over Tor.
This document 1) demonstrates the extent to which insecure protocols are
currently used within the Tor network, and 2) proposes a simple solution
to prevent users from unknowingly using these insecure protocols. By
insecure, we consider protocols that explicitly leak sensitive user names
and/or passwords, such as POP, IMAP, Telnet, and FTP.
Motivation:
As part of a general study of Tor use in 2006/2007 [1], we attempted to
understand what types of protocols are used over Tor. While we observed an
enormous volume of Web and Peer-to-peer traffic, we were surprised by the
number of insecure protocols that were used over Tor. For example, over an
8 day observation period, we observed the following number of connections
over insecure protocols:
POP and IMAP:10,326 connections
Telnet: 8,401 connections
FTP: 3,788 connections
Each of the above listed protocols exchange user name and password
information in plain-text. As an upper bound, we could have observed
22,515 user names and passwords. This observation echoes the reports of
a Tor router logging and posting e-mail passwords in August 2007 [2]. The
response from the Tor community has been to further educate users
about the dangers of using insecure protocols over Tor. However, we
recently repeated our Tor usage study from last year and noticed that the
trend in insecure protocol use has not declined. Therefore, we propose that
additional steps be taken to protect naive Tor users from inadvertently
exposing their identities (and even passwords) over Tor.
Security Implications:
This proposal is intended to improve Tor's security by limiting the
use of insecure protocols.
Roger added: By adding these warnings for only some of the risky
behavior, users may do other risky behavior, not get a warning, and
believe that it is therefore safe. But overall, I think it's better
to warn for some of it than to warn for none of it.
Specification:
As an initial step towards mitigating the use of the above-mentioned
insecure protocols, we propose that the default ports for each respective
insecure service be blocked at the Tor client's socks proxy. These default
ports include:
23 - Telnet
109 - POP2
110 - POP3
143 - IMAP
Notice that FTP is not included in the proposed list of ports to block. This
is because FTP is often used anonymously, i.e., without any identifying
user name or password.
This blocking scheme can be implemented as a set of flags in the client's
torrc configuration file:
BlockInsecureProtocols 0|1
WarnInsecureProtocols 0|1
When the warning flag is activated, a message should be displayed to
the user similar to the message given when Tor's socks proxy is given an IP
address rather than resolving a host name.
We recommend that the default torrc configuration file block insecure
protocols and provide a warning to the user to explain the behavior.
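The check these two options imply might look like the following sketch; the constants and function name are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Default ports of plaintext login protocols from the specification
 * above: Telnet, POP2, POP3, IMAP. */
static const int plaintext_ports[] = { 23, 109, 110, 143 };

enum { PORT_OK = 0, PORT_WARN = 1, PORT_REJECT = 2 };

/* Decide what the client's socks proxy should do with a request to
 * `port`, given the two torrc flags. */
static int
check_plaintext_port(int port, int block_insecure, int warn_insecure)
{
    for (size_t i = 0;
         i < sizeof(plaintext_ports) / sizeof(plaintext_ports[0]); i++) {
        if (plaintext_ports[i] == port) {
            if (block_insecure)
                return PORT_REJECT;
            if (warn_insecure)
                return PORT_WARN;
            break;
        }
    }
    return PORT_OK;
}
```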
Finally, there are many popular web pages that do not offer secure
login features, such as MySpace, and it would be prudent to provide
additional rules to Privoxy to attempt to protect users from unknowingly
submitting their login credentials in plain-text.
Compatibility:
None, as the proposed changes are to be implemented in the client.
References:
[1] Shining Light in Dark Places: A Study of Anonymous Network Usage.
University of Colorado Technical Report CU-CS-1032-07. August 2007.
[2] Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise.
http://www.wired.com/politics/security/news/2007/09/embassy_hacks.
Wired. September 10, 2007.
Implementation:
Roger added this feature in
http://archives.seul.org/or/cvs/Jan-2008/msg00182.html
He also added a status event for Vidalia to recognize attempts to use
vulnerable-plaintext ports, so it can help the user understand what's
going on and how to fix it.
Next steps:
a) Vidalia should learn to recognize this controller status event,
so we don't leave users out in the cold when we enable this feature.
b) We should decide which ports to reject by default. The current
consensus is 23,109,110,143 -- the same set that we warn for now.
Filename: 130-v2-conn-protocol.txt
Title: Version 2 Tor connection protocol
Author: Nick Mathewson
Created: 2007-10-25
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This proposal describes the significant changes to be made in the v2
Tor connection protocol.
This proposal relates to other proposals as follows:
It refers to and supersedes:
Proposal 124: Blocking resistant TLS certificate usage
It refers to aspects of:
Proposal 105: Version negotiation for the Tor protocol
In summary, The Tor connection protocol has been in need of a redesign
for a while. This proposal describes how we can add to the Tor
protocol:
- A new TLS handshake (to achieve blocking resistance without
breaking backward compatibility)
- Version negotiation (so that future connection protocol changes
can happen without breaking compatibility)
- The actual changes in the v2 Tor connection protocol.
Motivation:
For motivation, see proposal 124.
Proposal:
0. Terminology
The version of the Tor connection protocol implemented up to now is
"version 1". This proposal describes "version 2".
"Old" or "Older" versions of Tor are ones not aware that version 2
of this protocol exists;
"New" or "Newer" versions are ones that are.
The connection initiator is referred to below as the Client; the
connection responder is referred to below as the Server.
1. The revised TLS handshake.
For motivation, see proposal 124. This is a simplified version of the
handshake that uses TLS's renegotiation capability in order to avoid
some of the extraneous steps in proposal 124.
The Client connects to the Server and, as in ordinary TLS, sends a
list of ciphers. Older versions of Tor will send only ciphers from
the list:
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
Clients that support the revised handshake will send the recommended
list of ciphers from proposal 124, in order to emulate the behavior of
a web browser.
If the server notices that the list of ciphers contains only ciphers
from this list, it proceeds with Tor's version 1 TLS handshake as
documented in tor-spec.txt.
(The server may also notice cipher lists used by other implementations
of the Tor protocol (in particular, the BouncyCastle default cipher
list as used by some Java-based implementations), and whitelist them.)
On the other hand, if the server sees a list of ciphers that could not
have been sent from an older implementation (because it includes other
ciphers, and does not match any known-old list), the server sends a
reply containing a single connection certificate, constructed as for
the link certificate in the v1 Tor protocol. The subject names in
this certificate SHOULD NOT have any strings to identify them as
coming from a Tor server. The server does not ask the client for
certificates.
Old Servers will (mostly) ignore the cipher list and respond as in the v1
protocol, sending back a two-certificate chain.
After the Client gets a response from the server, it checks for the
number of certificates it received. If there are two certificates,
the client assumes a V1 connection and proceeds as in tor-spec.txt.
But if there is only one certificate, the client assumes a V2 or later
protocol and continues.
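The server-side decision described above -- fall back to the v1 handshake if and only if every offered cipher could have come from an old client -- can be sketched with ciphersuites as strings. This is illustrative only; a real implementation compares ciphersuite IDs inside the TLS library:

```c
#include <assert.h>
#include <string.h>
#include <stddef.h>

/* The only ciphers an old (v1) Tor client ever sends. */
static const char *v1_ciphers[] = {
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
};

/* Return 1 if every offered cipher appears in the v1 list, meaning the
 * server must proceed with the version 1 handshake. */
static int
client_list_is_v1(const char **offered, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int known = 0;
        for (size_t j = 0; j < 4; j++) {
            if (!strcmp(offered[i], v1_ciphers[j])) {
                known = 1;
                break;
            }
        }
        if (!known)
            return 0;  /* a cipher no old client sends: v2 handshake */
    }
    return 1;
}
```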
At this point, the client has established a TLS connection with the
server, but the parties have not been authenticated: the server hasn't
sent its identity certificate, and the client hasn't sent any
certificates at all. To fix this, the client begins a TLS session
renegotiation. This time, the server continues with two certificates
as usual, and asks for certificates so that the client will send
certificates of its own. Because the TLS connection has been
established, all of this is encrypted. (The certificate sent by the
server in the renegotiated connection need not be the same as that
sent in the original connection.)
The server MUST NOT write any data until the client has renegotiated.
Once the renegotiation is finished, the server and client check one
another's certificates as in V1. Now they are mutually authenticated.
1.1. Revised TLS handshake: implementation notes.
It isn't so easy to adjust server behavior based on the client's
ciphersuite list. Here's how we can do it using OpenSSL. This is a
bit of an abuse of the OpenSSL APIs, but it's the best we can do, and
we won't have to do it forever.
We can use OpenSSL's SSL_set_info_callback() to register a function to
be called when the state changes. The type/state tuple of
SSL_CB_ACCEPT_LOOP/SSL3_ST_SW_SRVR_HELLO_A
happens when we have completely parsed the client hello, and are about
to send a response. From this callback, we can check the cipherlist
and act accordingly:
* If the ciphersuite list indicates a v1 protocol, we set the
verify mode to SSL_VERIFY_NONE with a callback (so we get
certificates).
* If the ciphersuite list indicates a v2 protocol, we set the
verify mode to SSL_VERIFY_NONE with no callback (so we get
no certificates) and set the SSL_MODE_NO_AUTO_CHAIN flag (so that
we send only 1 certificate in the response).
Once the handshake is done, the server clears the
SSL_MODE_NO_AUTO_CHAIN flag and sets the callback as for the V1
protocol. It then starts reading.
The other problem to take care of is missing ciphers and OpenSSL's
cipher sorting algorithms. The two main issues are a) OpenSSL doesn't
support some of the default ciphers that Firefox advertises, and b)
OpenSSL sorts the list of ciphers it offers in a different way than
Firefox sorts them, so unless we fix that Tor will still look different
than Firefox.
[XXXX more on this.]
1.2. Compatibility for clients using libraries less hackable than OpenSSL.
As discussed in proposal 105, servers advertise which protocol
versions they support in their router descriptors. Clients can simply
behave as v1 clients when connecting to servers that do not support
link version 2 or higher, and as v2 clients when connecting to servers
that do support link version 2 or higher.
(Servers can't use this strategy because we do not assume that servers
know one another's capabilities when connecting.)
2. Version negotiation.
Version negotiation proceeds as described in proposal 105, except as
follows:
* Version negotiation only happens if the TLS handshake as described
above completes.
* The TLS renegotiation must be finished before the client sends a
VERSIONS cell; the server sends its VERSIONS cell in response.
* The VERSIONS cell uses the following variable-width format:
Circuit [2 octets; set to 0]
Command [1 octet; set to 7 for VERSIONS]
Length [2 octets; big-endian]
Data [Length bytes]
The Data in the cell is a series of big-endian two-byte integers.
* It is not allowed to negotiate V1 connections once the v2 protocol
has been used. If this happens, Tor instances should close the
connection.
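An illustrative encoder for this variable-width cell format (names assumed, not Tor's actual cell-packing code):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Encode a VERSIONS cell as specified above: CircID=0 (2 octets),
 * Command=7, big-endian Length, then each supported link version as a
 * big-endian 16-bit integer.  Returns the number of bytes written;
 * buf must hold 5 + 2*n bytes. */
static size_t
encode_versions_cell(const uint16_t *versions, size_t n, uint8_t *buf)
{
    size_t len = 2 * n;
    buf[0] = 0; buf[1] = 0;           /* circuit ID 0 */
    buf[2] = 7;                       /* VERSIONS command */
    buf[3] = (uint8_t)(len >> 8);     /* length, big-endian */
    buf[4] = (uint8_t)(len & 0xff);
    for (size_t i = 0; i < n; i++) {
        buf[5 + 2*i] = (uint8_t)(versions[i] >> 8);
        buf[6 + 2*i] = (uint8_t)(versions[i] & 0xff);
    }
    return 5 + len;
}
```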
3. The rest of the "v2" protocol
Once a v2 protocol has been negotiated, NETINFO cells are exchanged
as in proposal 105, and communications begin as per tor-spec.txt.
Until NETINFO cells have been exchanged, the connection is not open.
Filename: 131-verify-tor-usage.txt
Title: Help users to verify they are using Tor
Author: Steven J. Murdoch
Created: 2008-01-25
Status: Needs-Revision
Overview:
Websites for checking whether a user is accessing them via Tor are a
very helpful aid to configuring web browsers correctly. Existing
solutions have both false positives and false negatives when
checking if Tor is being used. This proposal will discuss how to
modify Tor so as to make testing more reliable.
Motivation:
Currently deployed websites for detecting Tor use work by comparing
the client IP address for a request with a list of known Tor nodes.
This approach is generally effective, but suffers from both false
positives and false negatives.
If a user has a Tor exit node installed, or just happens to have
been allocated an IP address previously used by a Tor exit node, any
web requests will be incorrectly flagged as coming from Tor. If any
customer of an ISP which implements a transparent proxy runs an exit
node, all other users of the ISP will be flagged as Tor users.
Conversely, if the exit node chosen by a Tor user has not yet been
recorded by the Tor checking website, requests will be incorrectly
flagged as not coming via Tor.
The only reliable way to tell whether Tor is being used or not is for
the Tor client to flag this to the browser.
Proposal:
A DNS name should be registered and point to an IP address
controlled by the Tor project and likely to remain so for the
useful lifetime of a Tor client. A web server should be placed
at this IP address.
Tor should be modified to treat requests to port 80, at the
specified DNS name or IP address specially. Instead of opening a
circuit, it should respond to a HTTP request with a helpful web
page:
- If the request to open a connection was to the domain name, the web
page should state that Tor is working properly.
- If the request was to the IP address, the web page should state
that there is a DNS-leakage vulnerability.
If the request goes through to the real web server, the page
should state that Tor has not been set up properly.
Extensions:
Identifying proxy server:
If needed, other applications between the web browser and Tor (e.g.
Polipo and Privoxy) could piggyback on the same mechanism to flag
whether they are in use. All three possible web pages should include
a machine-readable placeholder, into which another program could
insert their own message.
For example, the webpage returned by Tor to indicate a successful
configuration could include the following HTML:
<h2>Connection chain</h2>
<ul>
<li>Tor 0.1.2.14-alpha</li>
<!-- Tor Connectivity Check: success -->
</ul>
When the proxy server observes this string, in response to a request
for the Tor connectivity check web page, it would prepend its own
message, resulting in the following being returned to the web
browser:
<h2>Connection chain</h2>
<ul>
<li>Tor 0.1.2.14-alpha</li>
<li>Polipo version 1.0.4</li>
<!-- Tor Connectivity Check: success -->
</ul>
Checking external connectivity:
If Tor intercepts a request, and returns a response itself, the user
will not actually confirm whether Tor is able to build a successful
circuit. It may then be advantageous to include an image in the web
page which is loaded from a different domain. If this is able to be
loaded then the user will know that external connectivity through
Tor works.
Automatic Firefox Notification:
All forms of the website should return valid XHTML and have a
hidden link with an id attribute "TorCheckResult" and a target
property that can be queried to determine the result. For example,
a hidden link would convey success like this:
<a id="TorCheckResult" target="success" href="/"></a>
failure like this:
<a id="TorCheckResult" target="failure" href="/"></a>
and DNS leaks like this:
<a id="TorCheckResult" target="dnsleak" href="/"></a>
Firefox extensions such as Torbutton would then be able to
issue an XMLHttpRequest for the page and query the result
with resultXML.getElementById("TorCheckResult").target
to automatically report the Tor status to the user when
they first attempt to enable Tor activity, or whenever
they request a check from the extension preferences window.
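The same lookup an extension would do via getElementById can be sketched with any XHTML parser; here is a minimal version using Python's standard library (the function name is ours):

```python
# Sketch: extract the check result from the hidden TorCheckResult
# link. A Firefox extension would use an XMLHttpRequest plus
# getElementById("TorCheckResult").target; this mirrors that lookup.
from html.parser import HTMLParser

class TorCheckParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.result = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and a.get("id") == "TorCheckResult":
            self.result = a.get("target")

def tor_check_result(page_html):
    parser = TorCheckParser()
    parser.feed(page_html)
    return parser.result  # "success", "failure", "dnsleak", or None

print(tor_check_result(
    '<a id="TorCheckResult" target="success" href="/"></a>'))  # success
```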
If the check website is to be themed with heavy graphics and/or
extensive documentation, the check result itself should be
contained in a separate lightweight iframe that extensions can
request via an alternate URL.
Security and resiliency implications:
What attacks are possible?
If the IP address used for this feature moves, there will be two
consequences:
- A new website at this IP address will remain inaccessible over
Tor
- Tor users who are leaking DNS will be informed that Tor is not
working, rather than that it is active but leaking DNS
We should thus attempt to find an IP address which we reasonably
believe can remain static.
Open issues:
If a Tor version which does not support this extra feature is used,
the webpage returned will indicate that Tor is not being used. Can
this be safely fixed?
Related work:
The proposed mechanism is very similar to config.privoxy.org. The
most significant difference is that if the web browser is
misconfigured, Tor will only get an IP address. Even in this case,
Tor should be able to respond with a webpage to notify the user of how
to fix the problem. This also implies that Tor must be told of the
special IP address, which must therefore be effectively permanent.

Filename: 132-browser-check-tor-service.txt
Title: A Tor Web Service For Verifying Correct Browser Configuration
Author: Robert Hogan
Created: 2008-03-08
Status: Draft
Overview:
Tor should operate a primitive web service on the loopback network device
that tests the operation of the user's browser, privacy proxy and Tor client.
The tests are performed by serving unique, randomly generated elements in
image URLs embedded in static HTML. The images are only displayed if the DNS
and HTTP requests for them are routed through Tor, otherwise the 'alt' text
may be displayed. The proposal assumes that 'alt' text is not displayed on
all browsers, so suggests that text and links should accompany each image
advising the user on next steps in case the test fails.
The service is primarily for the use of controllers, since presumably users
aren't going to want to edit text files and then type something exotic like
127.0.0.1:9999 into their address bar. In the main use case the controller
will have configured the actual port for the webservice so will know where
to direct the request. It would also be the responsibility of the controller
to ensure the webservice is available, and Tor is running, before allowing
the user to access the page through their browser.
Motivation:
This is a complementary approach to proposal 131. It overcomes some of the
limitations of the approach described in proposal 131: reliance
on a permanent, real IP address and compatibility with older versions of
Tor. Unlike 131, it is not as useful to Tor users who are not running a
controller.
Objective:
Provide a reliable means of helping users to determine if their Tor
installation, privacy proxy and browser are properly configured for
anonymous browsing.
Proposal:
When configured to do so, Tor should run a basic web service available
on a configured port on 127.0.0.1. The purpose of this web service is to
serve a number of basic test images that will allow the user to determine
if their browser is properly configured and that Tor is working normally.
The service can consist of a single web page with two columns. The left
column contains images, the right column contains advice on what the
display/non-display of the column means.
The rest of this proposal assumes that the service is running on port
9999. The port should be configurable, and configuring the port enables the
service. The service must run on 127.0.0.1.
In all the examples below [uniquesessionid] refers to a random, base64
encoded string that is unique to the URL it is contained in. Tor only ever
stores the most recently generated [uniquesessionid] for each URL, storing 3
in total. Tor should generate a [uniquesessionid] for each of the test URLs
below every time a HTTP GET is received at 127.0.0.1:9999 for index.htm.
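The id-generation step might look like the following sketch (the URL labels and id length are our assumptions; the proposal only requires a random, base64-encoded string per URL):

```python
# Sketch: regenerate one random, base64-encoded id per test URL on
# every GET for index.htm, keeping only the most recent id for each
# (so 3 ids are stored in total). URL-safe base64 is assumed here so
# the id is also valid inside a hostname.
import base64
import os

TEST_URLS = ("dns-test", "proxy-test", "connectivity-test")  # labels are ours
current_ids = {}

def regenerate_session_ids():
    for url in TEST_URLS:
        # 9 random bytes -> 12 base64 characters, no '=' padding
        current_ids[url] = base64.urlsafe_b64encode(os.urandom(9)).decode("ascii")
    return dict(current_ids)
```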
The most suitable image for each test case is an implementation decision.
Tor will need to store and serve images for the first and second test
images, and possibly the third (see 'Open Issues').
1. DNS Request Test Image
This is a HTML element embedded in the page served by Tor at
http://127.0.0.1:9999:
<IMG src="http://[uniquesessionid]:9999/torlogo.jpg" alt="If you can see
this text, your browser's DNS requests are not being routed through Tor."
width="200" height="200" align="middle" border="2">
If the browser's DNS request for [uniquesessionid] is routed through Tor,
Tor will intercept the request and return 127.0.0.1 as the resolved IP
address. This will shortly be followed by a HTTP request from the browser
for http://127.0.0.1:9999/torlogo.jpg. This request should be served with
the appropriate image.
If the browser's DNS request for [uniquesessionid] is not routed through Tor
the browser may display the 'alt' text specified in the html element. The
HTML served by Tor should also contain text accompanying the image to advise
users what it means if they do not see an image. It should also provide a
link to click that provides information on how to remedy the problem. This
behaviour also applies to the images described in 2. and 3. below, so should
be assumed there as well.
2. Proxy Configuration Test Image
This is a HTML element embedded in the page served by Tor at
http://127.0.0.1:9999:
<IMG src="http://torproject.org/[uniquesessionid].jpg" alt="If you can see
this text, your browser is not configured to work with Tor." width="200"
height="200" align="middle" border="2">
If the HTTP request for the resource [uniquesessionid].jpg is received by
Tor it will serve the appropriate image in response. It should serve this
image itself, without attempting to retrieve anything from the Internet.
If Tor can identify the name of the proxy application requesting the
resource then it could store and serve an image identifying the proxy to the
user.
3. Tor Connectivity Test Image
This is a HTML element embedded in the page served by Tor at
http://127.0.0.1:9999:
<IMG src="http://torproject.org/[uniquesessionid]-torlogo.jpg" alt="If you
can see this text, your Tor installation cannot connect to the Internet."
width="200" height="200" align="middle" border="2">
The referenced image should actually exist on the Tor project website. If
Tor receives the request for the above resource it should remove the random
base64 encoded digest from the request (i.e. [uniquesessionid]-) and attempt
to retrieve the real image.
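The rewrite Tor performs could be sketched like this (the function name is ours; only the '[uniquesessionid]-' prefix removal comes from the proposal):

```python
# Sketch: strip the session-id prefix from the requested file name to
# recover the real image name hosted on the Tor project website.
def strip_session_id(request_path, session_id):
    name = request_path.rsplit("/", 1)[-1]
    prefix = session_id + "-"
    if name.startswith(prefix):
        return name[len(prefix):]
    return None  # not a connectivity-test request

print(strip_session_id("/c2Vzc2lvbg-torlogo.jpg", "c2Vzc2lvbg"))  # torlogo.jpg
```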
Even on a fully operational Tor client this test may not always succeed. The
user should be advised that one or more attempts to retrieve this image may
be necessary to confirm a genuine problem.
Open Issues:
The final connectivity test relies on an externally maintained resource; if
this resource becomes unavailable the connectivity test will always fail.
Either the text accompanying the test should advise of this possibility or
Tor clients should be advised of the location of the test resource in the
main network directory listings.
Any number of misconfigurations may make the web service unreachable; it is
the responsibility of the user's controller to recognize these and assist
the user in eliminating them. Tor can mitigate the specific
misconfiguration of routing HTTP traffic for 127.0.0.1 to Tor itself by
serving such requests through the SOCKS port as well as the configured web
service port.
Now Tor is inspecting the URLs requested on its SOCKS port and 'dropping'
them. It already inspects for raw IP addresses (to warn of DNS leaks) but
maybe the behaviour proposed here is qualitatively different. Maybe this is
an unwelcome precedent that can be used to beat the project over the head in
future. Or maybe it's not such a bad thing, Tor is merely attempting to make
normally invalid resource requests valid for a given purpose.

Filename: 133-unreachable-ors.txt
Title: Incorporate Unreachable ORs into the Tor Network
Author: Robert Hogan
Created: 2008-03-08
Status: Draft
Overview:
Propose a scheme for harnessing the bandwidth of ORs who cannot currently
participate in the Tor network because they can only make outbound
TCP connections.
Motivation:
Restrictive local and remote firewalls are preventing many willing
candidates from becoming ORs on the Tor network. These
ORs have a casual interest in joining the network but their operator is not
sufficiently motivated or adept to complete the necessary router or firewall
configuration. The Tor network is losing out on their bandwidth. At the
moment we don't even know how many such 'candidate' ORs there are.
Objective:
1. Establish how many ORs are unable to qualify for publication because
they cannot establish that their ORPort is reachable.
2. Devise a method for making such ORs available to clients for circuit
building without prejudicing their anonymity.
Proposal:
ORs whose ORPort reachability testing fails a specified number of
consecutive times should:
1. Enlist themselves with the authorities setting a 'Fallback' flag. This
flag indicates that the OR is up and running but cannot connect to
itself.
2. Open an orconn with all ORs whose fingerprint begins with the same
byte as their own. The management of this orconn will be transferred
entirely to the OR at the other end.
3. The fallback OR should update its router status to contain the
'Running' flag if it has managed to open an orconn with 3/4 of the ORs
with an FP beginning with the same byte as its own.
Tor ORs who are contacted by fallback ORs requesting an orconn should:
1. Accept the orconn until they have reached a defined limit of orconn
connections with fallback ORs.
2. Should only accept such orconn requests from listed fallback ORs who
have an FP beginning with the same byte as its own.
Tor clients can include fallback ORs in the network by doing the
following:
1. When building a circuit, observe the fingerprint of each node they
wish to connect to.
2. When randomly selecting a node from the set of all eligible nodes,
add all published, running fallback nodes to the set where the first
byte of the fingerprint matches the previous node in the circuit.
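A sketch of that client-side selection rule, with hex fingerprint strings (the data structures are illustrative):

```python
# Sketch: when choosing the next hop after the node with fingerprint
# `previous_fp`, extend the eligible set with published, running
# fallback ORs whose fingerprint starts with the same byte (i.e. the
# same first two hex digits).
def eligible_next_hops(previous_fp, normal_ors, fallback_ors):
    candidates = list(normal_ors)
    candidates += [fp for fp in fallback_ors
                   if fp[:2].lower() == previous_fp[:2].lower()]
    return candidates
```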
Anonymity Implications:
At least some, and possibly all, nodes on the network will have a set
of nodes that only they and a few others can build circuits on.
1. This means that fallback ORs might be unsuitable for use as middlemen
nodes, because if the exit node is the attacker it knows that the
number of nodes that could be the entry guard in the circuit is
reduced to roughly 1/256th of the network, or worse 1/256th of all
nodes listed as Guards. For the same reason, fallback nodes would
appear to be unsuitable for two-hop circuits.
2. This is not a problem if fallback ORs are always exit nodes. If
the fallback OR is an attacker it will not be able to reduce the
set of possible nodes for the entry guard any further than a normal,
published OR.
Possible Attacks/Open Issues:
1. Gaming Node Selection
Does running a fallback OR customized for a specific set of published ORs
improve an attacker's chances of seeing traffic from that set of published
ORs? Would such a strategy be any more effective than running published
ORs with other 'attractive' properties?
2. DOS Attack
An attacker could prevent all other legitimate fallback ORs with a
given byte-1 in their FP from functioning by running 20 or 30 fallback ORs
and monopolizing all available fallback slots on the published ORs.
This same attacker would then be in a position to monopolize all the
traffic of the fallback ORs on that byte-1 network segment. I'm not sure
what this would allow such an attacker to do.
3. Circuit-Sniffing
An observer watching exit traffic from a fallback server will know that the
previous node in the circuit is one of a very small, identifiable
subset of the total ORs in the network. To establish the full path of the
circuit they would only have to watch the exit traffic from the fallback
OR and all the traffic from the 20 or 30 ORs it is likely to be connected
to. This means it is substantially easier to establish all members of a
circuit which has a fallback OR as an exit (sniff and analyse 10-50 (i.e.
1/256 varying) + 1 ORs) rather than a normal published OR (sniff all 2560
or so ORs on the network). The same mechanism that allows the client to
expect a specific fallback OR to be available from a specific published OR
allows an attacker to prepare his ground.
Mitigant:
In terms of the resources and access required to monitor 2000 to 3000
nodes, the effort of the adversary is not significantly diminished when he
is only interested in 20 or 30. It is hard to see how an adversary who can
obtain access to a randomly selected portion of the Tor network would face
any new or qualitatively different obstacles in attempting to access much
of the rest of it.
Implementation Issues:
The number of ORs this proposal would add to the Tor network is not known.
This is because there is no mechanism at present for recording unsuccessful
attempts to become an OR. If the proposal is considered promising it may be
worthwhile to issue an alpha series release where candidate ORs post a
primitive fallback descriptor to the authority directories. This fallback
descriptor would not contain any other flag that would make it eligible for
selection by clients. It would act solely as a means of sizing the number of
Tor instances that try and fail to become ORs.
The upper limit on the number of orconns from fallback ORs a normal,
published OR should be willing to accept is an open question. Is one
hundred, mostly idle, such orconns too onerous?

Filename: 134-robust-voting.txt
Title: More robust consensus voting with diverse authority sets
Author: Peter Palfrader
Created: 2008-04-01
Status: Rejected
History:
2009 May 27: Added note on rejecting this proposal -- Nick
Overview:
A means to arrive at a valid directory consensus even when voters
disagree on who is an authority.
Motivation:
Right now there are about five authoritative directory servers in the
Tor network, though this number is expected to rise to about 15 eventually.
Adding a new authority requires synchronized action from all operators of
directory authorities so that at any time during the update at least half of
all authorities are running and agree on who is an authority. The latter
requirement is there so that the authorities can arrive at a common
consensus: Each authority builds the consensus based on the votes from
all authorities it recognizes, and so a different set of recognized
authorities will lead to a different consensus document.
Objective:
The modified voting procedure outlined in this proposal obsoletes the
requirement for most authorities to exactly agree on the list of
authorities.
Proposal:
The vote document each authority generates contains a list of
authorities recognized by the generating authority. This will be
a list of authority identity fingerprints.
Authorities will accept votes from and serve/mirror votes also for
authorities they do not recognize. (Votes contain the signing key,
the authority key, and the certificate linking them, so they can be
verified even without knowing the authority beforehand.)
Before building the consensus we will check which votes to use for
building:
1) We build a directed graph of which authority/vote recognizes
whom.
2) (Parts of the graph that aren't reachable, directly or
indirectly, from any authorities we recognize can be discarded
immediately.)
3) We find the largest fully connected subgraph.
(Should there be more than one subgraph of the same size there
needs to be some arbitrary ordering so we always pick the same.
E.g. pick the one who has the smaller (XOR of all votes' digests)
or something.)
4) If we are part of that subgraph, great. This is the list of
votes we build our consensus with.
5) If we are not part of that subgraph, remove all the nodes that
are part of it and go to 3.
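Steps 1-5 above can be sketched with a brute-force clique search, which is workable at the expected scale of about 15 authorities (the tie-break between equal-sized subgraphs is simplified here to "first found"; the proposal suggests something deterministic like the XOR of vote digests):

```python
# Sketch: pick the set of votes to build the consensus from.
# `recognizes` maps each authority to the set of authorities it
# recognizes (assumed to include itself); "fully connected" means
# every pair mutually recognizes each other.
from itertools import combinations

def connected(recognizes, a, b):
    return b in recognizes[a] and a in recognizes[b]

def largest_clique(recognizes, nodes):
    for size in range(len(nodes), 0, -1):
        for cand in combinations(sorted(nodes), size):
            if all(connected(recognizes, a, b)
                   for a, b in combinations(cand, 2)):
                return set(cand)   # tie-break simplified: first found
    return set()

def pick_vote_set(me, recognizes):
    nodes = set(recognizes)
    while nodes:
        best = largest_clique(recognizes, nodes)
        if me in best:             # step 4: we vote with this set
            return best
        nodes -= best              # step 5: drop it and repeat
    return set()
```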
Using this procedure authorities that are updated to recognize a
new authority will continue voting with the old group until a
sufficient number has been updated to arrive at a consensus with
the recently added authority.
In fact, the old set of authorities will probably be voting among
themselves until all but one has been updated to recognize the
new authority. Then which set of votes is used for consensus
building depends on which of the two equally large sets gets
ordered before the other in step (3) above.
It is necessary to continue with the process in (5) even if we
are not in the largest subgraph. Otherwise one rogue authority
could create a number of extra votes (by new authorities) so that
everybody stops at 5 and no consensus is built, even though it would
be trusted by all clients.
Anonymity Implications:
The author does not believe this proposal to have anonymity
implications.
Possible Attacks/Open Issues/Some thinking required:
Q: Can a number (less or exactly half) of the authorities cause an honest
authority to vote for "their" consensus rather than the one that would
result were all authorities taken into account?
Q: Can a set of votes from external authorities, i.e of whom we trust either
none or at least not all, cause us to change the set of consensus makers we
pick?
A: Yes, if other authorities decide they would rather build a consensus with
them, then they'll be thrown out in step 3. But that's ok since those other
authorities will never vote with us anyway.
If we trust none of them then we throw them out even sooner, so no harm done.
Q: Can this ever force us to build a consensus with authorities we do not
recognize?
A: No, we can never build a fully connected set with them in step 3.
------------------------------
I'm rejecting this proposal as insecure.
Suppose that we have a clique of size N, and M hostile members in the
clique. If these hostile members stop declaring trust for up to M-1
good members of the clique, the clique with the hostile members in it
will be larger than the one without them.
The M hostile members will constitute a majority of this new clique
when M > (N-(M-1)) / 2, or when M > (N + 1) / 3. This breaks our
requirement that an adversary must compromise a majority of authorities
in order to control the consensus.
-- Nick
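The arithmetic above checks out on a small example (the concrete numbers are ours):

```python
# Quick numeric check of the bound M > (N + 1) / 3, with N = 10
# clique members of which M = 4 are hostile.
N, M = 10, 4
# Hostile clique: the M hostile members plus the good members they
# still declare trust for: N - (M - 1) members in total.
hostile_clique = N - (M - 1)
# Honest clique: all N - M good members, who still trust each other.
honest_clique = N - M
assert hostile_clique > honest_clique   # the hostile clique wins
assert M > hostile_clique / 2           # hostile majority inside it
assert M > (N + 1) / 3                  # matches the stated bound
```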

Filename: 135-private-tor-networks.txt
Title: Simplify Configuration of Private Tor Networks
Author: Karsten Loesing
Created: 29-Apr-2008
Status: Closed
Target: 0.2.1.x
Implemented-In: 0.2.1.2-alpha
Change history:
29-Apr-2008 Initial proposal for or-dev
19-May-2008 Included changes based on comments by Nick to or-dev and
added a section for test cases.
18-Jun-2008 Changed testing-network-only configuration option names.
Overview:
Configuring a private Tor network has become a time-consuming and
error-prone task with the introduction of the v3 directory protocol. In
addition to that, operators of private Tor networks need to set an
increasing number of non-trivial configuration options, and it is hard
to keep FAQ entries describing this task up-to-date. In this proposal we
(1) suggest to (optionally) accelerate timing of the v3 directory voting
process and (2) introduce an umbrella config option specifically aimed at
creating private Tor networks.
Design:
1. Accelerate Timing of v3 Directory Voting Process
Tor has reasonable defaults for setting up a large, Internet-scale
network with comparably high latencies and possibly wrong server clocks.
However, those defaults are bad when it comes to quickly setting up a
private Tor network for testing, either on a single node or LAN (things
might be different when creating a test network on PlanetLab or
something). Some time constraints should be made configurable for private
networks. The general idea is to accelerate everything that has to do
with propagation of directory information, but nothing else, so that a
private network is available as soon as possible. (As a possible
safeguard, changing these configuration values could be made dependent on
the umbrella configuration option introduced in 2.)
1.1. Initial Voting Schedule
When a v3 directory does not know any consensus, it assumes an initial,
hard-coded VotingInterval of 30 minutes, VoteDelay of 5 minutes, and
DistDelay of 5 minutes. This is important for multiple, simultaneously
restarted directory authorities to meet at a common time and create an
initial consensus. Unfortunately, this means that it may take up to half
an hour (or even more) for a private Tor network to bootstrap.
We propose to make these three time constants configurable (note that
V3AuthVotingInterval, V3AuthVoteDelay, and V3AuthDistDelay do not have an
effect on the _initial_ voting schedule, but only on the schedule that a
directory authority votes for). This can be achieved by introducing three
new configuration options: TestingV3AuthInitialVotingInterval,
TestingV3AuthInitialVoteDelay, and TestingV3AuthInitialDistDelay.
As first safeguards, Tor should only accept configuration values for
TestingV3AuthInitialVotingInterval that divide evenly into the default
value of 30 minutes. The effect is that even if people misconfigured
their directory authorities, they would meet at the default values at the
latest. The second safeguard is to allow configuration only when the
umbrella configuration option TestingTorNetwork is set.
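The divisibility safeguard might be sketched as follows (option handling is simplified to a single predicate; values are in seconds):

```python
# Sketch: accept a TestingV3AuthInitialVotingInterval only if it
# divides evenly into the default 30 minutes, so that even
# misconfigured authorities still meet at the default boundaries.
DEFAULT_INITIAL_VOTING_INTERVAL = 30 * 60   # seconds

def initial_voting_interval_ok(seconds):
    return (0 < seconds <= DEFAULT_INITIAL_VOTING_INTERVAL
            and DEFAULT_INITIAL_VOTING_INTERVAL % seconds == 0)
```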
1.2. Immediately Provide Reachability Information (Running flag)
The default behavior of a directory authority is to provide the Running
flag only after the authority is available for at least 30 minutes. The
rationale is that before that time, an authority simply cannot deliver
useful information about other running nodes. But for private Tor
networks this may be different. This is currently implemented in the code
as:
/** If we've been around for less than this amount of time, our
* reachability information is not accurate. */
#define DIRSERV_TIME_TO_GET_REACHABILITY_INFO (30*60)
There should be another configuration option
TestingAuthDirTimeToLearnReachability with a default value of 30 minutes
that can be changed when running testing Tor networks, e.g. to 0 minutes.
The configuration value would simply replace the quoted constant. Again,
changing this option could be safeguarded by requiring the umbrella
configuration option TestingTorNetwork to be set.
1.3. Reduce Estimated Descriptor Propagation Time
Tor currently assumes that it takes up to 10 minutes until router
descriptors are propagated from the authorities to directory caches.
This is not very useful for private Tor networks, and we want to be able
to reduce this time, so that clients can download router descriptors in a
timely manner.
/** Clients don't download any descriptor this recent, since it will
* probably not have propagated to enough caches. */
#define ESTIMATED_PROPAGATION_TIME (10*60)
We suggest to introduce a new config option
TestingEstimatedDescriptorPropagationTime which defaults to 10 minutes,
but that can be set to any lower non-negative value, e.g. 0 minutes. The
same safeguards as in 1.2 could be used here, too.
2. Umbrella Option for Setting Up Private Tor Networks
Setting up a private Tor network requires a number of specific settings
that are not required or useful when running Tor in the public Tor
network. Instead of writing down these options in a FAQ entry, there
should be a single configuration option, e.g. TestingTorNetwork, that
changes all required settings at once. Newer Tor versions would keep the
set of configuration options up-to-date. It should still remain possible
to manually overwrite the settings that the umbrella configuration option
affects.
The following configuration options are set by TestingTorNetwork:
- ServerDNSAllowBrokenResolvConf 1
Ignore the situation that private relays are not aware of any name
servers.
- DirAllowPrivateAddresses 1
Allow router descriptors containing private IP addresses.
- EnforceDistinctSubnets 0
Permit building circuits with relays in the same subnet.
- AssumeReachable 1
Omit self-testing for reachability.
- AuthDirMaxServersPerAddr 0
- AuthDirMaxServersPerAuthAddr 0
Permit an unlimited number of nodes on the same IP address.
- ClientDNSRejectInternalAddresses 0
Believe in DNS responses resolving to private IP addresses.
- ExitPolicyRejectPrivate 0
Allow exiting to private IP addresses. (This one is a matter of
taste---it might be dangerous to make this a default in a private
network, although people setting up private Tor networks should know
what they are doing.)
- V3AuthVotingInterval 5 minutes
- V3AuthVoteDelay 20 seconds
- V3AuthDistDelay 20 seconds
Accelerate voting schedule after first consensus has been reached.
- TestingV3AuthInitialVotingInterval 5 minutes
- TestingV3AuthInitialVoteDelay 20 seconds
- TestingV3AuthInitialDistDelay 20 seconds
Accelerate initial voting schedule until first consensus is reached.
- TestingAuthDirTimeToLearnReachability 0 minutes
Consider routers as Running from the start of running an authority.
- TestingEstimatedDescriptorPropagationTime 0 minutes
Clients try downloading router descriptors from directory caches,
even when they are not 10 minutes old.
In addition to changing the defaults for these configuration options,
TestingTorNetwork can only be set when a user has manually configured
DirServer lines.
Test:
The implementation of this proposal must pass the following tests:
1. Set TestingTorNetwork and see if dependent configuration options are
correctly changed.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
250-TestingTorNetwork=1
250 TestingAuthDirTimeToLearnReachability=0
QUIT
2. Set TestingTorNetwork and a dependent configuration value to see if
the provided value is used for the dependent option.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
TestingAuthDirTimeToLearnReachability 5
telnet 127.0.0.1 9051
AUTHENTICATE
GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
250-TestingTorNetwork=1
250 TestingAuthDirTimeToLearnReachability=5
QUIT
3. Start with TestingTorNetwork set and change a dependent configuration
option later on.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
SETCONF TestingAuthDirTimeToLearnReachability=5
GETCONF TestingAuthDirTimeToLearnReachability
250 TestingAuthDirTimeToLearnReachability=5
QUIT
4. Start with TestingTorNetwork set and a dependent configuration value,
and reset that dependent configuration value. The result should be
the testing-network specific default value.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
TestingAuthDirTimeToLearnReachability 5
telnet 127.0.0.1 9051
AUTHENTICATE
GETCONF TestingAuthDirTimeToLearnReachability
250 TestingAuthDirTimeToLearnReachability=5
RESETCONF TestingAuthDirTimeToLearnReachability
GETCONF TestingAuthDirTimeToLearnReachability
250 TestingAuthDirTimeToLearnReachability=0
QUIT
5. Leave TestingTorNetwork unset and check if dependent configuration
options are left unchanged.
tor DataDirectory . ControlPort 9051 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
250-TestingTorNetwork=0
250 TestingAuthDirTimeToLearnReachability=1800
QUIT
6. Leave TestingTorNetwork unset, but set dependent configuration option
which should fail.
tor DataDirectory . ControlPort 9051 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
TestingAuthDirTimeToLearnReachability 0
[warn] Failed to parse/validate config:
TestingAuthDirTimeToLearnReachability may only be changed in testing
Tor networks!
7. Start with TestingTorNetwork unset and change dependent configuration
option later on which should fail.
tor DataDirectory . ControlPort 9051 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
SETCONF TestingAuthDirTimeToLearnReachability=0
513 Unacceptable option value: TestingAuthDirTimeToLearnReachability
may only be changed in testing Tor networks!
8. Start with TestingTorNetwork unset and set it later on which should
fail.
tor DataDirectory . ControlPort 9051 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
SETCONF TestingTorNetwork=1
553 Transition not allowed: While Tor is running, changing
TestingTorNetwork is not allowed.
9. Start with TestingTorNetwork set and unset it later on which should
fail.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
"mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
telnet 127.0.0.1 9051
AUTHENTICATE
RESETCONF TestingTorNetwork
513 Unacceptable option value: TestingV3AuthInitialVotingInterval may
only be changed in testing Tor networks!
10. Set TestingTorNetwork, but do not provide an alternate DirServer
which should fail.
tor DataDirectory . ControlPort 9051 TestingTorNetwork 1
[warn] Failed to parse/validate config: TestingTorNetwork may only be
configured in combination with a non-default set of DirServers.

Filename: 136-legacy-keys.txt
Title: Mass authority migration with legacy keys
Author: Nick Mathewson
Created: 13-May-2008
Status: Closed
Implemented-In: 0.2.0.x
Overview:
This document describes a mechanism to change the keys of more than
half of the directory servers at once without breaking old clients
and caches immediately.
Motivation:
If a single authority's identity key is believed to be compromised,
the solution is obvious: remove that authority from the list,
generate a new certificate, and treat the new cert as belonging to a
new authority. This approach works fine so long as less than 1/2 of
the authority identity keys are bad.
Unfortunately, the mass-compromise case is possible if there is a
sufficiently bad bug in Tor or in any OS used by a majority of v3
authorities. Let's be prepared for it!
We could simply stop using the old keys and start using new ones,
and tell all clients running insecure versions to upgrade.
Unfortunately, this breaks our caching system pretty badly, since
caches won't cache a consensus that they don't believe in. It would
be nice to have everybody become secure the moment they upgrade to a
version listing the new authority keys, _without_ breaking upgraded
clients until the caches upgrade.
So, let's come up with a way to provide a time window where the
consensuses are signed with the new keys and with the old.
Design:
We allow directory authorities to list a single "legacy key"
fingerprint in their votes. Each authority may add a single legacy
key. The format for this line is:
legacy-dir-key FINGERPRINT
We describe a new consensus method for generating directory
consensuses. This method is consensus method "3".
When the authorities decide to use method "3" (as described in 3.4.1
of dir-spec.txt), for every included vote with a legacy-dir-key line,
the consensus includes an extra dir-source line. The fingerprint in
this extra line is as in the legacy-dir-key line. The ports and
addresses are in the dir-source line. The nickname is as in the
dir-source line, with the string "-legacy" appended.
[We need to include this new dir-source line because the code
won't accept or preserve signatures from authorities not listed
as contributing to the consensus.]
Authorities using legacy dir keys include two signatures on their
consensuses: one generated with a signing key certified by their real
identity key, and another generated with a signing key certified by
the legacy identity key. These signing keys MUST be
different. Authorities MUST serve both
certificates if asked.
Process:
In the event of a mass key failure, we'll follow the following
(ugly) procedure:
- All affected authorities generate new certificates and identity
keys, and circulate their new dirserver lines. They copy their old
certificates and old broken keys, but put them in new "legacy
key files".
- At the earliest time that can be arranged, the authorities
replace their signing keys, identity keys, and certificates
with the new uncompromised versions, and update to the new list
of dirserver lines.
- They add a "V3DirAdvertiseLegacyKey 1" option to their torrc.
- Now, new consensuses will be generated using the new keys, but
the results will also be signed with the old keys.
- Clients and caches are told they need to upgrade, and given a
time window to do so.
- At the end of the time window, authorities remove the
V3DirAdvertiseLegacyKey option.
Notes:
It might be good to get caches to cache consensuses that they do not
believe in. I'm not sure of the best way to do this.
It's a superficially neat idea to have new signing keys and have
them signed by the new and by the old authority identity keys. This
breaks some code, though, and doesn't actually gain us anything,
since we'd still need to include each signature twice.
It's also a superficially neat idea, if identity keys and signing
keys are compromised, to at least replace all the signing keys.
I don't think this gains us anything either, though.

Filename: 137-bootstrap-phases.txt
Title: Keep controllers informed as Tor bootstraps
Author: Roger Dingledine
Created: 07-Jun-2008
Status: Closed
Implemented-In: 0.2.1.x
1. Overview.
Tor has many steps to bootstrapping directory information and
initial circuits, but from the controller's perspective we just have
a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with
slow connections or with connectivity problems can wait a long time
staring at the yellow onion, wondering if it will ever change color.
This proposal describes a new client status event so Tor can give
more details to the controller. Section 2 describes the changes to the
controller protocol; Section 3 describes Tor's internal bootstrapping
phases when everything is going correctly; Section 4 describes when
Tor detects a problem and issues a bootstrap warning; Section 5 covers
suggestions for how controllers should display the results.
2. Controller event syntax.
The generic status event is:
"650" SP StatusType SP StatusSeverity SP StatusAction
[SP StatusArguments] CRLF
So in this case we send
650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \
PROGRESS=num TAG=Keyword SUMMARY=String \
[WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword]
The arguments MAY appear in any order. Controllers MUST accept unrecognized
arguments.
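As an illustration, a controller-side parser for these events might look like the following sketch (the function name and the simplified quoting rules are assumptions, not part of the spec; real controllers must also cope with escapes inside quoted strings):

```python
import re

def parse_bootstrap_event(line):
    """Parse a 650 STATUS_CLIENT BOOTSTRAP event line into a dict.

    Arguments may appear in any order, and unrecognized keys MUST be
    accepted, so we simply collect every key=value pair we see.
    """
    m = re.match(r'650 STATUS_CLIENT (NOTICE|WARN) BOOTSTRAP (.*)', line)
    if m is None:
        return None
    severity, rest = m.groups()
    args = {'SEVERITY': severity}
    # Values are either bare words or "quoted strings".
    for key, quoted, bare in re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', rest):
        args[key.upper()] = quoted if quoted else bare
    return args
```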
"Progress" gives a number between 0 and 100 for how far through
the bootstrapping process we are. "Summary" is a string that can be
displayed to the user to describe the *next* task that Tor will tackle,
i.e., the task it is working on after sending the status event. "Tag"
is an optional string that controllers can use to recognize bootstrap
phases from Section 3, if they want to do something smarter than just
blindly displaying the summary string.
The severity describes whether this is a normal bootstrap phase
(severity notice) or an indication of a bootstrapping problem
(severity warn). If severity warn, it should also include a "warning"
argument string with any hints Tor has to offer about why it's having
troubles bootstrapping, a "reason" string that lists one of the reasons
allowed in the ORConn event, a "count" number that tells how many
bootstrap problems there have been so far at this phase, and a
"recommendation" keyword to indicate how the controller ought to react.
3. The bootstrap phases.
This section describes the various phases currently reported by
Tor. Controllers should not assume that the percentages and tags listed
here will continue to match up, or even that the tags will stay in
the same order. Some phases might also be skipped (not reported) if the
associated bootstrap step is already complete, or if the phase no longer
is necessary. Only "starting" and "done" are guaranteed to exist in all
future versions.
Current Tor versions enter these phases in order, monotonically;
future Tors MAY revisit earlier stages.
Phase 0:
tag=starting summary="starting"
Tor starts out in this phase.
Phase 5:
tag=conn_dir summary="Connecting to directory mirror"
Tor sends this event as soon as Tor has chosen a directory mirror ---
one of the authorities if bootstrapping for the first time or after
a long downtime, or one of the relays listed in its cached directory
information otherwise.
Tor will stay at this phase until it has successfully established
a TCP connection with some directory mirror. Problems in this phase
generally happen because Tor doesn't have a network connection, or
because the local firewall is dropping SYN packets.
Phase 10:
tag=handshake_dir summary="Finishing handshake with directory mirror"
This event occurs when Tor establishes a TCP connection with a relay used
as a directory mirror (or its https proxy if it's using one). Tor remains
in this phase until the TLS handshake with the relay is finished.
Problems in this phase generally happen because Tor's firewall is
doing more sophisticated MITM attacks on it, or doing packet-level
keyword recognition of Tor's handshake.
Phase 15:
tag=onehop_create summary="Establishing one-hop circuit for dir info"
Once TLS is finished with a relay, Tor will send a CREATE_FAST cell
to establish a one-hop circuit for retrieving directory information.
It will remain in this phase until it receives the CREATED_FAST cell
back, indicating that the circuit is ready.
Phase 20:
tag=requesting_status summary="Asking for networkstatus consensus"
Once we've finished our one-hop circuit, we will start a new stream
for fetching the networkstatus consensus. We'll stay in this phase
until we get the 'connected' relay cell back, indicating that we've
established a directory connection.
Phase 25:
tag=loading_status summary="Loading networkstatus consensus"
Once we've established a directory connection, we will start fetching
the networkstatus consensus document. This could take a while; this
phase is a good opportunity for using the "progress" keyword to indicate
partial progress.
This phase could stall if the directory mirror we picked doesn't
have a copy of the networkstatus consensus so we have to ask another,
or it does give us a copy but we don't find it valid.
Phase 40:
tag=loading_keys summary="Loading authority key certs"
Sometimes when we've finished loading the networkstatus consensus,
we find that we don't have all the authority key certificates for the
keys that signed the consensus. At that point we put the consensus we
fetched on hold and fetch the keys so we can verify the signatures.
Phase 45:
tag=requesting_descriptors summary="Asking for relay descriptors"
Once we have a valid networkstatus consensus and we've checked all
its signatures, we start asking for relay descriptors. We stay in this
phase until we have received a 'connected' relay cell in response to
a request for descriptors.
Phase 50:
tag=loading_descriptors summary="Loading relay descriptors"
We will ask for relay descriptors from several different locations,
so this step will probably make up the bulk of the bootstrapping,
especially for users with slow connections. We stay in this phase until
we have descriptors for at least 1/4 of the usable relays listed in
the networkstatus consensus. This phase is also a good opportunity to
use the "progress" keyword to indicate partial steps.
Phase 80:
tag=conn_or summary="Connecting to entry guard"
Once we have a valid consensus and enough relay descriptors, we choose
some entry guards and start trying to build some circuits. This step
is similar to the "conn_dir" phase above; the only difference is
the context.
If a Tor starts with enough recent cached directory information,
its first bootstrap status event will be for the conn_or phase.
Phase 85:
tag=handshake_or summary="Finishing handshake with entry guard"
This phase is similar to the "handshake_dir" phase, but it gets reached
if we finish a TCP connection to a Tor relay and we have already reached
the "conn_or" phase. We'll stay in this phase until we complete a TLS
handshake with a Tor relay.
Phase 90:
tag=circuit_create summary="Establishing circuits"
Once we've finished our TLS handshake with an entry guard, we will
set about trying to make some 3-hop circuits in case we need them soon.
Phase 100:
tag=done summary="Done"
A full 3-hop circuit has been established. Tor is ready to handle
application connections now.
4. Bootstrap problem events.
When an OR Conn fails, we send a "bootstrap problem" status event, which
is like the standard bootstrap status event except with severity warn.
We include the same progress, tag, and summary values as we would for
a normal bootstrap event, but we also include "warning", "reason",
"count", and "recommendation" key/value combos.
The "reason" values are long-term-stable controller-facing tags to
identify particular issues in a bootstrapping step. The warning
strings, on the other hand, are human-readable. Controllers SHOULD
NOT rely on the format of any warning string. Currently the possible
values for "recommendation" are either "ignore" or "warn" -- if ignore,
the controller can accumulate the string in a pile of problems to show
the user if the user asks; if warn, the controller should alert the
user that Tor is pretty sure there's a bootstrapping problem.
Currently Tor uses recommendation=ignore for the first nine bootstrap
problem reports for a given phase, and then uses recommendation=warn
for subsequent problems at that phase. Hopefully this is a good
balance between tolerating occasional errors and reporting serious
problems quickly.
5. Suggested controller behavior.
Controllers should start out with a yellow onion or the equivalent
("starting"), and then watch for either a bootstrap status event
(meaning the Tor they're using is sufficiently new to produce them,
and they should load up the progress bar or whatever they plan to use
to indicate progress) or a circuit_established status event (meaning
bootstrapping is finished).
In addition to a progress bar in the display, controllers should also
have some way to indicate progress even when no controller window is
open. For example, folks using Tor Browser Bundle in hostile Internet
cafes don't want a big splashy screen up. One way to let the user keep
informed of progress in a more subtle way is to change the task tray
icon and/or tooltip string as more bootstrap events come in.
Controllers should also have some mechanism to alert their user when
bootstrapping problems are reported. Perhaps we should gather a set of
help texts and the controller can send the user to the right anchor in a
"bootstrapping problems" page in the controller's help subsystem?
6. Getting up to speed when the controller connects.
There's a new "GETINFO /status/bootstrap-phase" option, which returns
the most recent bootstrap phase status event sent. Specifically,
it returns a string starting with either "NOTICE BOOTSTRAP ..." or
"WARN BOOTSTRAP ...".
Controllers should use this getinfo when they connect or attach to
Tor to learn its current state.

Filename: 138-remove-down-routers-from-consensus.txt
Title: Remove routers that are not Running from consensus documents
Author: Peter Palfrader
Created: 11-Jun-2008
Status: Closed
Implemented-In: 0.2.1.2-alpha
1. Overview.
Tor directory authorities hourly vote and agree on a consensus document
which lists all the routers on the network together with some of their
basic properties, like if a router is an exit node, whether it is
stable or whether it is a version 2 directory mirror.
One of the properties given with each router is the 'Running' flag.
Clients do not use routers that are not listed as running.
This proposal suggests that routers without the Running flag are not
listed at all.
2. Current status
At a typical bootstrap a client downloads a 140KB consensus, about
10KB of certificates to verify that consensus, and about 1.6MB of
server descriptors, about 1/4 of which it requires before it will
start building circuits.
Another proposal deals with how to get that huge 1.6MB fraction to
effectively zero (by downloading only individual descriptors, on
demand). Should that get successfully implemented that will leave the
140KB compressed consensus as a large fraction of what a client needs
to get in order to work.
About one third of the routers listed in a consensus are not running
and will therefore never be used by clients who use this consensus.
Not listing those routers will save about 30% to 40% in size.
3. Proposed change
Authority directory servers produce vote documents that include all
the servers they know about, running or not, like they currently
do. In addition these vote documents also state that the authority
supports a new consensus forming method (method number 4).
If more than two thirds of votes that an authority has received claim
they support method 4 then this new method will be used: The
consensus document is formed like before but a new last step removes
all routers from the listing that are not marked as Running.
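The two parts of the proposed change — the supermajority check and the final filtering step — can be sketched as follows (data shapes are assumptions; real voting code operates on parsed vote documents):

```python
def use_method_4(votes):
    """Method 4 is used only if more than two thirds of the received
    votes declare support for it."""
    supporting = sum(1 for v in votes if 4 in v['consensus-methods'])
    return 3 * supporting > 2 * len(votes)

def method_4_filter(routers):
    """The new last step of consensus forming: drop every router entry
    that is not marked as Running.  Each router is assumed, for this
    sketch, to be a dict carrying a set of status flags."""
    return [r for r in routers if 'Running' in r['flags']]
```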

Filename: 139-conditional-consensus-download.txt
Title: Download consensus documents only when it will be trusted
Author: Peter Palfrader
Created: 2008-04-13
Status: Closed
Implemented-In: 0.2.1.x
Overview:
Servers only provide consensus documents to clients when it is known that
the client will trust it.
Motivation:
When clients[1] want a new network status consensus they request it
from a Tor server using the URL path /tor/status-vote/current/consensus.
Then after downloading the client checks if this consensus can be
trusted. Whether the client trusts the consensus depends on the
authorities that the client trusts and how many of those
authorities signed the consensus document.
If the client cannot trust the consensus document it is disregarded
and a new download is tried at a later time. Several hundred
kilobytes of server bandwidth were wasted by this single client's
request.
With hundreds of thousands of clients this will have undesirable
consequences when the list of authorities has changed so much that a
large number of established clients no longer can trust any consensus
document formed.
Objective:
The objective of this proposal is to make clients not download
consensuses they will not trust.
Proposal:
The list of authorities that are trusted by a client are encoded in
the URL they send to the directory server when requesting a consensus
document.
The directory server then only sends back the consensus when more than
half of the authorities listed in the request have signed the
consensus. If it is known that the consensus will not be trusted
a 404 error code is sent back to the client.
This proposal does not require directory caches to keep more than one
consensus document. This proposal also does not require authorities
to verify the signature on the consensus document of authorities they
do not recognize.
The new URL scheme to download a consensus is
/tor/status-vote/current/consensus/<F> where F is a list of
fingerprints, sorted in ascending order, and concatenated using a +
sign.
Fingerprints are uppercase hexadecimal encodings of the authority
identity key's digest. Servers should also accept requests that
use lower case or mixed case hexadecimal encodings.
A .z URL for compressed versions of the consensus will be provided
similarly to existing resources and is the URL that usually should
be used by clients.
Migration:
The old location of the consensus should continue to work
indefinitely. Not only is it used by old clients, but it is a useful
resource for automated tools that do not particularly care which
authorities have signed the consensus.
Authorities that are known to the client a priori by being shipped
with the Tor code are assumed to handle this format.
When downloading a consensus document from caches that do not support this
new format they fall back to the old download location.
Caches support the new format starting with Tor version 0.2.1.1-alpha.
Anonymity Implications:
By supplying the list of authorities a client trusts to the directory
server we leak information (like likely version of Tor client) to the
directory server. In the current system we also leak that we are
very old - by re-downloading the consensus over and over again, but
only when we are so old that we no longer can trust the consensus.
Footnotes:
1. For the purpose of this proposal a client can be any Tor instance
that downloads a consensus document. This includes relays,
directory caches as well as end users.

Filename: 140-consensus-diffs.txt
Title: Provide diffs between consensuses
Author: Peter Palfrader
Created: 13-Jun-2008
Status: Accepted
Target: 0.2.2.x
0. History
22-May-2009: Restricted the ed format even more strictly for ease of
implementation. -nickm
1. Overview.
Tor clients and servers need a list of which relays are on the
network. This list, the consensus, is created by authorities
hourly and clients fetch a copy of it, with some delay, hourly.
This proposal suggests that clients download diffs of consensuses
once they have a consensus instead of hourly downloading a full
consensus.
2. Numbers
After implementing proposal 138 which removes nodes that are not
running from the list a consensus document is about 92 kilobytes
in size after compression.
The diff between two consecutive consensus, in ed format, is on
average 13 kilobytes compressed.
3. Proposal
3.1 Clients
If a client has a consensus that is recent enough it SHOULD
try to download a diff to get the latest consensus rather than
fetching a full one.
[XXX: what is recent enough?
time delta in hours / size of compressed diff
0 20
1 9650
2 17011
3 23150
4 29813
5 36079
6 39455
7 43903
8 48907
9 54549
10 60057
11 67810
12 71171
13 73863
14 76048
15 80031
16 84686
17 89862
18 94760
19 94868
20 94223
21 93921
22 92144
23 90228
[ size of gzip compressed "diff -e" between the consensus on
2008-06-01-00:00:00 and the following consensuses that day.
Consensuses have been modified to exclude down routers per
proposal 138. ]
Data suggests that for the first few hours diffs are very useful,
saving about 60% for the first three hours, 30% for the first 10,
and almost nothing once we are past 16 hours.
]
3.2 Servers
Directory authorities and servers need to keep up to X [XXX: depends
on how long clients try to download diffs per above] old consensus
documents so they can build diffs. They should offer a diff to the
most recent consensus at the URL
http://tor.noreply.org/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST>
where hash is the full digest of the consensus the client currently
has, and FPRLIST is a list of (abbreviated) fingerprints of
authorities the client trusts.
Servers will only return a consensus if more than half of the requested
authorities have signed the document, otherwise a 404 error will be sent
back. The fingerprints can be shortened to a length of any multiple of
two, using only the leftmost part of the encoded fingerprint. Tor uses
3 bytes (6 hex characters) of the fingerprint. (This is just like the
conditional consensus downloads that Tor supports starting with
0.1.2.1-alpha.)
If a server cannot offer a diff from the consensus identified by the
hash but has a current consensus it MUST return the full consensus.
[XXX: what should we do when the client already has the latest
consensus? I can think of the following options:
- send back 3xx not modified
- send back 200 ok and an empty diff
- send back 404 nothing newer here.
I currently lean towards the empty diff.]
4. Diff Format
Diffs start with the token "network-status-diff-version" followed by a
space and the version number, currently "1".
If a document does not start with network-status-diff it is assumed
to be a full consensus download and would therefore currently start
with "network-status-version 3".
Following the network-status-diff header line is a diff, or patch, in
limited ed format. We choose this format because it is easy to create
and process with standard tools (patch, diff -e, ed). This will help
us in developing and testing this proposal and it should make future
debugging easier.
[ If at one point in the future we decide that the space benefits from
a custom diff format outweighs these benefits we can always
introduce a new diff format and offer it at for instance
../diff2/... ]
We support the following ed commands, each on a line by itself:
- "<n1>d" Delete line n1
- "<n1>,<n2>d" Delete lines n1 through n2, including
- "<n1>c" Replace line n1 with the following block
- "<n1>,<n2>c" Replace lines n1 through n2, including, with the
following block.
- "<n1>a" Append the following block after line n1.
- "a" Append the following block after the current line.
- "s/.//" Remove the first character in the current line.
Note that line numbers always apply to the file after all previous
commands have already been applied.
The commands MUST apply to the file from back to front, such that
lines are only ever referred to by their position in the original
file.
The "current line" is either the first line of the file, if this is
the first command, the last line of a block we added in an append or
change command, or the line immediate following a set of lines we just
deleted (or the last line of the file if there are no lines after
that).
The replace and append command take blocks. These blocks are simply
appended to the diff after the line with the command. A line with
just a period (".") ends the block (and is not part of the lines
to add). Note that it is impossible to insert a line with just
a single dot. Recommended procedure is to insert a line with
two dots, then remove the first character of that line using s/.//.
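A minimal applier for this limited ed format might look like the following sketch (not Tor's implementation; it assumes a well-formed diff, works on lists of lines, and skips error handling):

```python
import re

def apply_ed_diff(orig_lines, diff_lines):
    """Apply a diff in the limited ed format described above.

    Supports "<n>d", "<n1>,<n2>d", "<n>c", "<n1>,<n2>c", "<n>a", bare
    "a", and "s/.//".  Commands arrive back-to-front, so line numbers
    always refer to positions in the original file.
    """
    out = list(orig_lines)
    cur = 0                           # index of the "current line"
    i = 0
    while i < len(diff_lines):
        cmd = diff_lines[i]
        i += 1
        if cmd == 's/.//':            # strip first char of current line
            out[cur] = out[cur][1:]
            continue
        if cmd == 'a':                # append after the current line
            block, i = _read_block(diff_lines, i)
            out[cur + 1:cur + 1] = block
            cur += len(block)
            continue
        m = re.match(r'(\d+)(?:,(\d+))?([acd])$', cmd)
        n1 = int(m.group(1)); n2 = int(m.group(2) or n1); op = m.group(3)
        if op == 'd':                 # delete lines n1..n2
            del out[n1 - 1:n2]
            cur = min(n1 - 1, len(out) - 1)
        elif op == 'c':               # replace lines n1..n2 with a block
            block, i = _read_block(diff_lines, i)
            out[n1 - 1:n2] = block
            cur = n1 - 2 + len(block)
        else:                         # 'a': append block after line n1
            block, i = _read_block(diff_lines, i)
            out[n1:n1] = block
            cur = n1 - 1 + len(block)
    return out

def _read_block(lines, i):
    """Collect block lines until a lone '.' terminates the block."""
    block = []
    while lines[i] != '.':
        block.append(lines[i]); i += 1
    return block, i + 1
```

Note how the single-dot trick from the last paragraph falls out naturally: appending ".." and then issuing "s/.//" leaves a line containing only ".".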

Filename: 141-jit-sd-downloads.txt
Title: Download server descriptors on demand
Author: Peter Palfrader
Created: 15-Jun-2008
Status: Draft
1. Overview
Downloading all server descriptors is the most expensive part
of bootstrapping a Tor client. These server descriptors currently
amount to about 1.5 Megabytes of data, and this size will grow
linearly with network size.
Fetching all these server descriptors takes a long while for people
behind slow network connections. It is also a considerable load on
our network of directory mirrors.
This document describes proposed changes to the Tor network and
directory protocol so that clients will no longer need to download
all server descriptors.
These changes consist of moving load balancing information into
network status documents, implementing a means to download server
descriptors on demand in an anonymity-preserving way, and dealing
with exit node selection.
2. What is in a server descriptor
When a Tor client starts the first thing it will try to get is a
current network status document: a consensus signed by a majority
of directory authorities. This document is currently about 100
Kilobytes in size, though it will grow linearly with network size.
This document lists all servers currently running on the network.
The Tor client will then try to get a server descriptor for each
of the running servers. All server descriptors currently amount
to about 1.5 Megabytes of downloads.
A Tor client learns several things about a server from its descriptor.
Some of these it already learned from the network status document
published by the authorities, but the server descriptor contains it
again in a single statement signed by the server itself, not just by
the directory authorities.
Tor clients use the information from server descriptors for
different purposes, which are considered in the following sections.
#three ways: One, to determine if a server will be able to handle
#this client's request; two, to actually communicate or use the server;
#three, for load balancing decisions.
#
#These three points are considered in the following subsections.
2.1 Load balancing
The Tor load balancing mechanism is quite complex in its details, but
it has a simple goal: The more traffic a server can handle the more
traffic it should get. That means the more traffic a server can
handle the more likely a client will use it.
For this purpose each server descriptor has bandwidth information
which tries to convey a server's capacity to clients.
Currently we weigh servers differently for different purposes. There
is a weight for when we use a server as a guard node (our entry to the
Tor network), there is one weight we assign servers for exit duties,
and a third for when we need intermediate (middle) nodes.
2.2 Exit information
When a Tor client wants to exit to some resource on the internet it will
build a circuit to an exit node that allows access to that resource's
IP address and TCP Port.
When building that circuit the client can make sure that the circuit
ends at a server that will be able to fulfill the request because the
client already learned of all the servers' exit policies from their
descriptors.
2.3 Capability information
Server descriptors contain information about the specific version of
the Tor protocol they understand [proposal 105].
Furthermore the server descriptor also contains the exact version of
the Tor software that the server is running and some decisions are
made based on the server version number (for instance a Tor client
will only make conditional consensus requests [proposal 139] when
talking to Tor servers version 0.2.1.1-alpha or later).
2.4 Contact/key information
A server descriptor lists a server's IP address and TCP ports on which
it accepts onion and directory connections. Furthermore it contains
the onion key (a short lived RSA key to which clients encrypt CREATE
cells).
2.5 Identity information
A Tor client learns the digest of a server's key from the network
status document. Once it has a server descriptor this descriptor
contains the full RSA identity key of the server. Clients verify
that 1) the digest of the identity key matches the expected digest
it got from the consensus, and 2) that the signature on the descriptor
from that key is valid.
3. No longer require clients to have copies of all SDs
3.1 Load balancing info in consensus documents
One of the reasons why clients download all server descriptors is for
doing proper load balancing as described in 2.1. In order for
clients to not require all server descriptors this information will
have to move into the network status document.
Consensus documents will have a new line per router similar
to the "r", "s", and "v" lines that already exist. This line
will convey weight information to clients.
"w Bandwidth=193"
The bandwidth number is the lesser of observed bandwidth and bandwidth
rate limit from the server descriptor that the "r" line referenced by
digest (1st and 3rd field of the bandwidth line in the descriptor).
It is given in kilobytes per second so the byte value in the
descriptor has to be divided by 1024 (and is then truncated, i.e.
rounded down).
Authorities will cap the bandwidth number at some arbitrary value,
currently 10MB/sec. If a router claims a larger bandwidth an
authority's vote will still only show Bandwidth=10240.
The consensus value for bandwidth is the median of all bandwidth
numbers given in votes. In case of an even number of votes we use
the lower median. (Using this procedure allows us to change the
cap value more easily.)
Clients should believe the bandwidth as presented in the consensus,
not capping it again.
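The computation above can be sketched as follows (function names are illustrative; per the descriptor format, the "bandwidth" line's fields are rate, burst, and observed, in bytes per second):

```python
def descriptor_bandwidth(bandwidth_line):
    """Compute an authority's voted Bandwidth= value: the lesser of the
    rate limit (1st field) and observed bandwidth (3rd field), divided
    by 1024 and truncated, then capped at 10 MB/sec (10240 KB/sec)."""
    _, rate, _burst, observed = bandwidth_line.split()
    kb = min(int(rate), int(observed)) // 1024
    return min(kb, 10240)

def consensus_bandwidth(voted_values):
    """Median of the voted values; for an even count, the lower median."""
    s = sorted(voted_values)
    return s[(len(s) - 1) // 2]
```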
3.2 Fetching descriptors on demand
As described in 2.4 a descriptor lists IP address, OR- and Dir-Port,
and the onion key for a server.
A client already knows the IP address and the ports from the consensus
documents, but without the onion key it will not be able to send
CREATE/EXTEND cells for that server. Since the client needs the onion
key it needs the descriptor.
If a client only downloaded a few descriptors in an observable manner
then that would leak which nodes it was going to use.
This proposal suggests the following:
1) when connecting to a guard node for which the client does not
yet have a cached descriptor it requests the descriptor it
expects by hash. (The consensus document that the client holds
has a hash for the descriptor of this server. We want exactly
that descriptor, not a different one.)
It does that by sending a RELAY_REQUEST_SD cell.
A client MAY cache the descriptor of the guard node so that it does
not need to request it every single time it contacts the guard.
2) when a client wants to extend a circuit that currently ends in
server B to a new next server C, the client will send a
RELAY_REQUEST_SD cell to server B. This cell contains in its
payload the hash of a server descriptor the client would like
to obtain (C's server descriptor). The server sends back the
descriptor and the client can now form a valid EXTEND/CREATE cell
encrypted to C's onion key.
Clients MUST NOT cache such descriptors. If they did they might
leak that they already extended to that server at least once
before.
Replies to RELAY_REQUEST_SD requests need to be padded to some
constant upper limit in order to conceal a client's destination
from anybody who might be counting cells/bytes.
RELAY_REQUEST_SD cells contain the following information:
- hash of the server descriptor requested
- hash of the identity digest of the server for which we want the SD
- IP address and OR-port of the server for which we want the SD
- padding factor - the number of cells we want the answer
padded to.
[XXX this just occurred to me and it might be smart. or it might
be stupid. clients would learn the padding factor they want
to use from the consensus document. This allows us to grow
the replies later on should SDs become larger.]
[XXX: figure out a decent padding size]
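Since the proposal does not fix a wire layout for the cell, the following sketch packs the listed fields in an arbitrary, purely illustrative order (field widths and ordering are assumptions, not part of the proposal):

```python
import struct

def build_request_sd_payload(sd_digest, identity_digest, ip, or_port,
                             pad_cells):
    """Pack a hypothetical RELAY_REQUEST_SD payload: descriptor digest,
    server identity digest, IPv4 address, OR-port, and the padding
    factor (number of cells the reply should be padded to)."""
    assert len(sd_digest) == 20 and len(identity_digest) == 20  # SHA-1
    ip_bytes = bytes(int(octet) for octet in ip.split('.'))
    return struct.pack('>20s20s4sHH', sd_digest, identity_digest,
                       ip_bytes, or_port, pad_cells)
```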
3.3 Protocol versions
Server descriptors contain optional information of supported
link-level and circuit-level protocols in the form of
"opt protocols Link 1 2 Circuit 1". These are not currently needed
and will probably eventually move into the "v" (version) line in
the consensus. This proposal does not deal with them.
Similarly a server descriptor contains the version number of
a Tor node. This information is already present in the consensus
and is thus available to all clients immediately.
3.4 Exit selection
Currently finding an appropriate exit node for a user's request is
easy for a client because it has complete knowledge of all the exit
policies of all servers on the network.
The consensus document will once again be extended to contain the
information required by clients. This information will be a summary
of each node's exit policy. The exit policy summary will only contain
the list of ports to which a node exits to most destination IP
addresses.
A summary should claim a router exits to a specific TCP port if,
ignoring private IP addresses, the exit policy indicates that the
router would exit to this port for most IP addresses, i.e. all but at
most about 2^25 addresses (either two /8 netblocks, or one /8 and a
couple of /12s, or any other combination).
The exact algorithm used is this: going through all exit policy items
- ignore any accept that is not for all IP addresses ("*"),
- ignore rejects for these netblocks (exactly, no subnetting):
0.0.0.0/8, 169.254.0.0/16, 127.0.0.0/8, 192.168.0.0/16, 10.0.0.0/8,
and 172.16.0.0/12,
- for each reject count the number of IP addresses rejected against
the affected ports,
- once we hit an accept for all IP addresses ("*") add the ports in
that policy item to the list of accepted ports, if they don't have
more than 2^25 IP addresses (that's two /8 networks) counted
against them (i.e. if the router exits to a port to everywhere but
at most two /8 networks).
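As a non-normative sketch of the counting rules above (the policy
representation and function name are invented for illustration and are
not Tor's internal format):

```python
# Sketch of the port-counting algorithm; a policy is an ordered list of
# (action, address, ports) items, where action is 'accept' or 'reject',
# address is '*' or an 'a.b.c.d/n' netblock, and ports is an iterable
# of affected port numbers.
PRIVATE_NETS = {"0.0.0.0/8", "169.254.0.0/16", "127.0.0.0/8",
                "192.168.0.0/16", "10.0.0.0/8", "172.16.0.0/12"}
MAX_REJECTED = 2 ** 25  # two /8 networks

def summarize_accepted_ports(policy):
    rejected = {}   # port -> number of IP addresses rejected so far
    accepted = set()
    for action, address, ports in policy:
        if action == "accept":
            if address != "*":
                continue  # ignore accepts not for all IP addresses
            for p in ports:
                if rejected.get(p, 0) <= MAX_REJECTED:
                    accepted.add(p)
        else:
            if address in PRIVATE_NETS:
                continue  # ignore rejects for the private netblocks
            prefix = 0 if address == "*" else int(address.split("/")[1])
            n_addrs = 2 ** (32 - prefix)
            for p in ports:
                rejected[p] = rejected.get(p, 0) + n_addrs
    return accepted
```

Because policies are first-match, a port that has already been accepted
stays accepted even if later items reject more addresses for it.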
An exit policy summary will be included in votes and consensus as a
new line attached to each exit node. The line will have the format
"p" <space> "accept"|"reject" <portlist>
where portlist is a comma separated list of single port numbers or
portranges (e.g. "22,80-88,1024-6000,6667").
Whether the summary shows the list of accepted ports or the list of
rejected ports depends on which list is shorter (has a shorter string
representation). In case of ties we choose the list of accepted
ports. As an exception to this rule an allow-all policy is
represented as "accept 1-65535" instead of "reject " and a reject-all
policy is similarly given as "reject 1-65535".
Summary items are compressed, that is instead of "80-88,89-100" there
only is a single item of "80-100", similarly instead of "20,21" a
summary will say "20-21".
Port lists are sorted in ascending order.
The maximum allowed length of a policy summary (including the "accept "
or "reject ") is 1000 characters. If a summary exceeds that length we
use an accept-style summary and list as much of the port list as is
possible within these 1000 bytes.
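The compression and shorter-list rules above can be sketched as follows
(illustrative only; the function names are made up, and the
1000-character truncation rule is omitted):

```python
def format_portlist(ports):
    """Render a set of ports as 'a,b-c,...' in ascending order, with
    adjacent ports merged into ranges (e.g. {20,21,80} -> '20-21,80')."""
    out, run = [], None
    for p in sorted(ports):
        if run and p == run[1] + 1:
            run[1] = p          # extend the current range
        else:
            if run:
                out.append(run)
            run = [p, p]        # start a new range
    if run:
        out.append(run)
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in out)

def summary_line(accepted):
    """Choose the shorter of the accept/reject representations,
    preferring 'accept' on ties, with the all/none special cases."""
    if not accepted:
        return "p reject 1-65535"
    if len(accepted) == 65535:
        return "p accept 1-65535"
    rejected = set(range(1, 65536)) - set(accepted)
    acc, rej = format_portlist(accepted), format_portlist(rejected)
    return f"p accept {acc}" if len(acc) <= len(rej) else f"p reject {rej}"
```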
3.4.1 Consensus selection
When building a consensus, authorities have to agree on a digest of
the server descriptor to list in the router line for each router.
This is documented in dir-spec section 3.4.
All authorities that listed the agreed-upon descriptor digest in
their vote should also list the same exit policy summary - or list
none at all if the authority has not been upgraded to include that
information in its vote.
If we have votes with a matching server descriptor digest, at least
one of which has an exit policy summary, then we distinguish between
two cases:
a) all authorities agree (or abstained) on the policy summary, and we
use the exit policy summary that they all listed in their vote,
b) something went wrong (or some authority is playing foul) and we
have different policy summaries. In that case we pick the one
that is most commonly listed in votes with the matching
descriptor. We break ties in favour of the lexicographically larger
vote.
If none of the votes with a matching server descriptor digest has
an exit policy summary, we use the one most commonly listed across all
votes, breaking ties as in case b above.
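The selection and tie-breaking rule can be sketched like this (a
hypothetical helper, not the directory authority code):

```python
from collections import Counter

def choose_summary(vote_summaries):
    """vote_summaries: one entry per vote listing the matching
    descriptor digest -- the summary string, or None if that authority
    listed no summary.  Returns the summary for the consensus, picking
    the most commonly listed one and breaking ties in favour of the
    lexicographically larger string."""
    listed = [s for s in vote_summaries if s is not None]
    if not listed:
        return None
    counts = Counter(listed)
    return max(counts, key=lambda s: (counts[s], s))
```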
3.4.2 Client behaviour
When choosing an exit node for a specific request a Tor client will
choose from the list of nodes that exit to the requested port as given
by the consensus document. If a client has additional knowledge (like
cached full descriptors) indicating that the chosen exit node will
reject the request, then it MAY use that knowledge (or not include such
nodes in the selection to begin with). However, clients MUST NOT use
nodes that do not list the port as accepted in the summary (but for
which they know that the node would exit to that address from other
sources, like a cached descriptor).
An exception to this is exit enclave behaviour: A client MAY use the
node at a specific IP address to exit to any port on the same address
even if that node is not listed as exiting to the port in the summary.
4. Migration
4.1 Consensus document changes.
The consensus will need to include
- bandwidth information (see 3.1)
- exit policy summaries (3.4)
A new consensus method (number TBD) will be chosen for this.
5. Future possibilities
This proposal still requires that all servers have the descriptors of
every other node in the network in order to answer RELAY_REQUEST_SD
cells. These cells are sent when a circuit is extended from ending at
node B to a new node C. In that case B would have to answer a
RELAY_REQUEST_SD cell that asks for C's server descriptor (by SD digest).
In order to answer that request B obviously needs a copy of C's server
descriptor. The RELAY_REQUEST_SD cell already has all the info that
B needs to contact C so it can ask about the descriptor before passing it
back to the client.
Filename: 142-combine-intro-and-rend-points.txt
Title: Combine Introduction and Rendezvous Points
Author: Karsten Loesing, Christian Wilms
Created: 27-Jun-2008
Status: Dead
Change history:
27-Jun-2008 Initial proposal for or-dev
04-Jul-2008 Give first security property the new name "Responsibility"
and change new cell formats according to rendezvous protocol
version 3 draft.
19-Jul-2008 Added comment by Nick (but no solution, yet) that sharing of
circuits between multiple clients is not supported by Tor.
Overview:
Establishing a connection to a hidden service currently involves two Tor
relays, introduction and rendezvous point, and 10 more relays distributed
over four circuits to connect to them. The introduction point is
established in the mid-term by a hidden service to transfer introduction
requests from client to the hidden service. The rendezvous point is set
up by the client for a single hidden service request and actually
transfers end-to-end encrypted application data between client and hidden
service.
There are some reasons for separating the two roles of introduction and
rendezvous point: (1) Responsibility: A relay shall not be made
responsible that it relays data for a certain hidden service; in the
original design as described in [1] an introduction point relays no
application data, and a rendezvous point neither knows the hidden
service nor can it decrypt the data. (2) Scalability: The hidden service
shall not have to maintain a number of open circuits proportional to the
expected number of client requests. (3) Attack resistance: The effect of
an attack on the only visible parts of a hidden service, its introduction
points, shall be as small as possible.
However, elimination of a separate rendezvous connection as proposed by
Øverlier and Syverson [2] is the most promising approach to improve the
delay in connection establishment. Of all the substeps of connection
establishment, extending a circuit by only a single hop is responsible
for a major part of the delay. Reducing on-demand circuit extensions from
two to
one results in a decrease of mean connection establishment times from 39
to 29 seconds [3]. Particularly, eliminating the delay on hidden-service
side allows the client to better observe progress of connection
establishment, thus allowing it to use smaller timeouts. Proposal 114
introduced new introduction keys for introduction points and provides for
user authorization data in hidden service descriptors; it will be shown
in this proposal that introduction keys in combination with new
introduction cookies provide for the first security property
responsibility. Further, eliminating the need for a separate introduction
connection benefits the overall network load by decreasing the number of
circuit extensions. After all, having only one connection between client
and hidden service reduces the overall protocol complexity.
Design:
1. Hidden Service Configuration
Hidden services should be able to choose whether they would like to use
this protocol. This might be opt-in for 0.2.1.x and opt-out for later
major releases.
2. Contact Point Establishment
When preparing a hidden service, a Tor client selects a set of relays to
act as contact points instead of introduction points. The contact point
combines both roles of introduction and rendezvous point as proposed in
[2]. The only requirement for a relay to be picked as contact point is
its capability of performing this role. This can be determined from the
Tor version number that needs to be equal or higher than the first
version that implements this proposal.
The easiest way to implement establishment of contact points is to
introduce v2 ESTABLISH_INTRO cells. By convention, the relay recognizes
version 2 ESTABLISH_INTRO cells as requests to establish a contact point
rather than an introduction point.
V Format byte: set to 255 [1 octet]
V Version byte: set to 2 [1 octet]
KLEN Key length [2 octets]
PK Public introduction key [KLEN octets]
HS Hash of session info [20 octets]
SIG Signature of above information [variable]
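A rough illustration of how the body above could be serialized (a
sketch, not Tor's wire code; the helper name is invented):

```python
import struct

def pack_establish_intro_v2(pk, session_hash, sig):
    """Serialize the v2 ESTABLISH_INTRO body shown above: format byte
    255, version byte 2, 2-octet key length, the key, the 20-octet
    session-info hash, then the variable-length signature."""
    assert len(session_hash) == 20
    return struct.pack("!BBH", 255, 2, len(pk)) + pk + session_hash + sig
```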
The hidden service does not create a fixed number of contact points, like
3 in the current protocol. It uses a minimum of 3 contact points, but
increases this number depending on the history of client requests within
the last hour. The hidden service also increases this number depending on
the frequency of failing contact points in order to defend against
attacks on its contact points. When client authorization as described in
proposal 121 is used, a hidden service can also use the number of
authorized clients as first estimate for the required number of contact
points.
3. Hidden Service Descriptor Creation
A hidden service needs to issue a fresh introduction cookie for each
established introduction point. By requiring clients to use this cookie
in a later connection establishment, an introduction point cannot access
the hidden service that it works for. Together with the fresh
introduction key that was introduced in proposal 114, this reduces
responsibility of a contact point for a specific hidden service.
The v2 hidden service descriptor format contains an
"intro-authentication" field that may contain introduction-point specific
keys. The hidden service creates a random string, comparable to the
rendezvous cookie, and includes it in the descriptor as introduction
cookie for auth-type "1". By convention, clients recognize the existence
of auth-type 1 as the possibility to connect to a hidden service via a
contact
point rather than an introduction point. Older clients that do not
understand this new protocol simply ignore that cookie.
4. Connection Establishment
When establishing a connection to a hidden service a client learns about
the capability of using the new protocol from the hidden service
descriptor. It may choose whether to use this new protocol or not,
whereas older clients cannot understand the new capability and can only
use the current protocol. Clients using version 0.2.1.x should be able to
opt-in for using the new protocol, which should change to opt-out for
later major releases.
When using the new capability the client creates a v2 INTRODUCE1 cell
that extends an unversioned INTRODUCE1 cell by adding the content of an
ESTABLISH_RENDEZVOUS cell. Further, the client sends this cell using the
new cell type 41 RELAY_INTRODUCE1_VERSIONED to the introduction point,
because unversioned and versioned INTRODUCE1 cells are indistinguishable:
Cleartext
V Version byte: set to 2 [1 octet]
PK_ID Identifier for Bob's PK [20 octets]
RC Rendezvous cookie [20 octets]
Encrypted to introduction key:
VER Version byte: set to 3. [1 octet]
AUTHT The auth type that is supported [1 octet]
AUTHL Length of auth data [2 octets]
AUTHD Auth data [variable]
RC Rendezvous cookie [20 octets]
g^x Diffie-Hellman data, part 1 [128 octets]
The cleartext part contains the rendezvous cookie that the contact point
remembers just as a rendezvous point would do.
The encrypted part contains the introduction cookie as auth data for the
auth type 1. The rendezvous cookie is contained as before, but there is
no further rendezvous point information, as there is no separate
rendezvous point.
5. Rendezvous Establishment
The contact point recognizes a v2 INTRODUCE1 cell with auth type 1 as a
request to be used in the new protocol. It remembers the contained
rendezvous cookie, replies to the client with an INTRODUCE_ACK cell
(omitting the RENDEZVOUS_ESTABLISHED cell), and forwards the encrypted
part of the INTRODUCE1 cell as INTRODUCE2 cell to the hidden service.
6. Introduction at Hidden Service
The hidden service recognizes an INTRODUCE2 cell containing an
introduction cookie as authorization data. In this case, it does not
extend a circuit to a rendezvous point, but sends a RENDEZVOUS1 cell
directly back to its contact point as usual.
7. Rendezvous at Contact Point
The contact point processes a RENDEZVOUS1 cell just as a rendezvous point
does. The only difference is that the hidden-service-side circuit is not
exclusive for the client connection, but shared among multiple client
connections.
[Tor does not allow sharing of a single circuit among multiple client
connections easily. We need to think about a smart and efficient way to
implement this. Comment by Nick. -KL]
Security Implications:
(1) Responsibility
One of the original reasons for the separation of introduction and
rendezvous points is that a relay shall not be made responsible that it
relays data for a certain hidden service. In the current design an
introduction point relays no application data and a rendezvous point
neither knows the hidden service nor can it decrypt the data.
This property is also fulfilled in this new design. A contact point only
learns a fresh introduction key instead of the hidden service key, so
that it cannot recognize a hidden service. Further, the introduction
cookie, which is unknown to the contact point, prevents it from accessing
the hidden service itself. The only way for a contact point to access a
hidden service is to look up whether it is contained in the descriptors
of known hidden services. A contact point cannot directly be made
responsible for which hidden service it is working. In addition to that,
it cannot learn the data that it transfers, because all communication
between client and hidden service is end-to-end encrypted.
(2) Scalability
Another goal of the existing hidden service protocol is that a hidden
service does not have to maintain a number of open circuits proportional
to the expected number of client requests. The rationale behind this is
better scalability.
The new protocol eliminates the need for a hidden service to extend
circuits on demand, which has a positive effect on circuit establishment
times and overall network load. The solution presented here to establish
a number of contact points proportional to the history of connection
requests reduces the number of circuits to a minimum number that fits the
hidden service's needs.
(3) Attack resistance
The third goal of separating introduction and rendezvous points is to
limit the effect of an attack on the only visible parts of a hidden
service which are the contact points in this protocol.
In theory, the new protocol is more vulnerable to this attack. An
attacker who can take down a contact point does not only eliminate an
access point to the hidden service, but also breaks current client
connections to the hidden service using that contact point.
Øverlier and Syverson proposed the concept of valet nodes as additional
safeguard for introduction/contact points [4]. Unfortunately, this
increases hidden service protocol complexity conceptually and from an
implementation point of view. Therefore, it is not included in this
proposal.
However, in practice attacking a contact point (or introduction point) is
not as rewarding as it might appear. The cost for a hidden service to set
up a new contact point and publish a new hidden service descriptor is
minimal compared to the efforts necessary for an attacker to take a Tor
relay down. As a countermeasure to further frustrate this attack, the
hidden service raises the number of contact points as a function of
previous contact point failures.
Further, the probability of breaking client connections due to attacking
a contact point is minimal. It can be assumed that the probability of one
of the other five involved relays in a hidden service connection failing
or being shut down is higher than that of a successful attack on a
contact point.
(4) Resistance against Locating Attacks
Clients are no longer able to force a hidden service to create or extend
circuits. This further reduces an attacker's capabilities of locating a
hidden server as described by Øverlier and Syverson [5].
Compatibility:
The presented protocol does not raise compatibility issues with current
Tor versions. New relay versions support both the existing and the
proposed protocol as introduction/rendezvous/contact points. A contact
point acts as introduction point simultaneously. Hidden services and
clients can opt-in to use the new protocol which might change to opt-out
some time in the future.
References:
[1] Roger Dingledine, Nick Mathewson, and Paul Syverson, Tor: The
Second-Generation Onion Router. In the Proceedings of the 13th USENIX
Security Symposium, August 2004.
[2] Lasse Øverlier and Paul Syverson, Improving Efficiency and Simplicity
of Tor Circuit Establishment and Hidden Services. In the Proceedings of
the Seventh Workshop on Privacy Enhancing Technologies (PET 2007),
Ottawa, Canada, June 2007.
[3] Christian Wilms, Improving the Tor Hidden Service Protocol Aiming at
Better Performance, diploma thesis, June 2008, University of Bamberg.
[4] Lasse Øverlier and Paul Syverson, Valet Services: Improving Hidden
Servers with a Personal Touch. In the Proceedings of the Sixth Workshop
on Privacy Enhancing Technologies (PET 2006), Cambridge, UK, June 2006.
[5] Lasse Øverlier and Paul Syverson, Locating Hidden Servers. In the
Proceedings of the 2006 IEEE Symposium on Security and Privacy, May 2006.
Filename: 143-distributed-storage-improvements.txt
Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors
Author: Karsten Loesing
Created: 28-Jun-2008
Status: Open
Target: 0.2.1.x
Change history:
28-Jun-2008 Initial proposal for or-dev
Overview:
An evaluation of the distributed storage for Tor hidden service
descriptors and subsequent discussions have brought up a few improvements
to proposal 114. All improvements are backwards compatible with the
implementation of proposal 114.
Design:
1. Report Bad Directory Nodes
Bad hidden service directory nodes could deny existence of previously
stored descriptors. A bad directory node that does this with all stored
descriptors causes harm to the distributed storage in general, but
replication will cope with this problem in most cases. However, an
adversary that attempts to make a specific hidden service unavailable by
running relays that become responsible for all of a service's
descriptors poses a more serious threat. The distributed storage needs to
defend against this attack by detecting and removing bad directory nodes.
As a countermeasure hidden services try to download their descriptors
every hour at random times from the hidden service directories that are
responsible for storing it. If a directory node replies with 404 (Not
found), the hidden service reports the supposedly bad directory node to
a random selection of half of the directory authorities (with version
numbers equal to or higher than the first version that implements this
proposal). The hidden service posts a complaint message using HTTP 'POST'
to a URL "/tor/rendezvous/complain" with the following message format:
"hidden-service-directory-complaint" identifier NL
[At start, exactly once]
The identifier of the hidden service directory node to be
investigated.
"rendezvous-service-descriptor" descriptor NL
[At end, Exactly once]
The hidden service descriptor that the supposedly bad directory node
does not serve.
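The two-item message above could be assembled like this (an
illustrative helper with an invented name, not Tor's actual code):

```python
def build_complaint(identifier, descriptor):
    """Assemble the complaint body described above, suitable for an
    HTTP POST to /tor/rendezvous/complain: the directory identifier
    first, then the descriptor that the node refused to serve."""
    return ("hidden-service-directory-complaint %s\n"
            "rendezvous-service-descriptor %s\n" % (identifier, descriptor))
```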
The directory authority checks if the descriptor is valid and the hidden
service directory responsible for storing it. It waits for a random time
of up to 30 minutes before posting the descriptor to the hidden service
directory. If the publication is acknowledged, the directory authority
waits another random time of up to 30 minutes before attempting to
request the descriptor that it has posted. If the directory node replies
with 404 (Not found), it will be blacklisted for being a hidden service
directory node for the next 48 hours.
A blacklisted hidden service directory is assigned the new flag BadHSDir
instead of the HSDir flag in the vote that a directory authority creates.
In a consensus a relay is only assigned a HSDir flag if the majority of
votes contains a HSDir flag and no more than one third of votes contains
a BadHSDir flag. As a result, clients do not have to learn about the
BadHSDir flag. A blacklisted directory node will simply not be assigned
the HSDir flag in the consensus.
In order to prevent an attacker from setting up new nodes as replacement
for blacklisted directory nodes, all directory nodes in the same /24
subnet are blacklisted, too. Furthermore, if two or more directory nodes
are blacklisted in the same /16 subnet concurrently, all other directory
nodes in that /16 subnet are blacklisted, too. Blacklisting holds for at
most 48 hours.
2. Publish Fewer Replicas
The evaluation has shown that the probability that a directory node
serves a previously stored descriptor is 85.7% (more precisely, this is
the 0.001-quantile of the empirical distribution with the rationale that
it holds for 99.9% of all empirical cases). If descriptors are replicated
to x directory nodes, the probability that at least one of the replicas
is available for clients is 1 - (1 - 85.7%) ^ x. In order to achieve an
overall availability of 99.9%, x = 3.55 replicas need to be stored. From
this follows that 4 replicas are sufficient, rather than the currently
stored 6 replicas.
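The arithmetic behind these numbers can be checked directly, using the
85.7% single-node availability figure quoted above:

```python
import math

p_single = 0.857   # probability one directory node serves the descriptor
target = 0.999     # desired overall availability

# Availability with x replicas is 1 - (1 - p_single)**x; solving
# 1 - (1 - p_single)**x >= target for x gives:
x = math.log(1 - target) / math.log(1 - p_single)
replicas = math.ceil(x)
```

This reproduces x = 3.55, which rounds up to the 4 replicas proposed.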
Further, the current design stores 2 sets of descriptors on 3 directory
nodes with consecutive identities. Originally, this was meant to
facilitate replication between directory nodes, which has not been and
will not be implemented (the selection criterion of 24 hours uptime does
not make it necessary). As a result, storing descriptors on directory
nodes with consecutive identities is not required. In fact, it should be
avoided, because it would enable an attacker to create "black holes" in
the identifier ring.
Hidden services should store their descriptors on 4 non-consecutive
directory nodes, and clients should request descriptors from these
directory nodes only. For compatibility reasons, hidden services also
store their descriptors on 2 consecutive directory nodes. Hence, 0.2.0.x
clients will be able to retrieve 4 out of 6 descriptors, but will fail
for the remaining 2 descriptors, which is sufficient for reliability. As
soon as 0.2.0.x is deprecated, hidden services can stop publishing the
additional 2 replicas.
3. Change Default Value of Being Hidden Service Directory
The requirements for becoming a hidden service directory node are an open
directory port and an uptime of at least 24 hours. The evaluation has
shown that there are 300 hidden service directory candidates in the mean,
but only 6 of them are configured to act as hidden service directories.
This is bad, because those 6 nodes need to serve a large share of all
hidden service descriptors. Optimally, there should be hundreds of hidden
service directories. Having a large number of 0.2.1.x directory nodes
also has a positive effect on 0.2.0.x hidden services and clients.
Therefore, the new default of HidServDirectoryV2 should be 1, so that a
Tor relay that has an open directory port automatically accepts and
serves v2 hidden service descriptors. A relay operator can still opt out
of running a hidden service directory by changing HidServDirectoryV2 to 0.
The additional bandwidth requirements for running a hidden service
directory node in addition to being a directory cache are negligible.
4. Make Descriptors Persistent on Directory Nodes
Hidden service directories that are restarted by their operators or after
a failure will not be selected as hidden service directories within the
next 24 hours. However, some clients might still think that these nodes
are responsible for certain descriptors, because they work on the basis
of network consensuses that are up to three hours old. The directory
nodes should be able to serve the previously received descriptors to
these clients. Therefore, directory nodes make all received descriptors
persistent and load previously received descriptors on startup.
5. Store and Serve Descriptors Regardless of Responsibility
Currently, directory nodes only accept descriptors for which they think
they are responsible. This may lead to problems when a directory node
uses an older or newer network consensus than hidden service or client
or when a directory node has been restarted recently. In fact, there are
no security issues in storing or serving descriptors for which a
directory node thinks it is not responsible. To the contrary, doing so
may improve reliability in border cases. As a result, a directory node
does not pay attention to responsibility when receiving a publication or
fetch request, but stores or serves the requested descriptor. Likewise,
the directory node does not remove descriptors when it thinks it is not
responsible for them any more.
6. Avoid Periodic Descriptor Re-Publication
In the current implementation a hidden service re-publishes its
descriptor either when its content changes or an hour elapses. However,
the evaluation has shown that failures of hidden service directory nodes,
i.e. of nodes that have not failed within the last 24 hours, are very
rare. Together with making descriptors persistent on directory nodes,
there is no necessity to re-publish descriptors hourly.
The only two events leading to descriptor re-publication should be a
change of the descriptor content and a new directory node becoming
responsible for the descriptor. Hidden services should therefore consider
re-publication every time they learn about a new network consensus
instead of hourly.
7. Discard Expired Descriptors
The current implementation lets directory nodes keep a descriptor for two
days before discarding it. However, with the v2 design, descriptors are
only valid for at most one day. Directory nodes should determine the
validity of stored descriptors and discard them one hour after they have
expired (to compensate for wrong clocks on clients).
8. Shorten Client-Side Descriptor Fetch History
When clients try to download a hidden service descriptor, they memorize
fetch requests to directory nodes for up to 15 minutes. This allows them
to request all replicas of a descriptor to avoid bad or failing directory
nodes, but without querying the same directory node twice.
The downside is that a client that has requested a descriptor without
success, will not be able to find a hidden service that has been started
during the following 15 minutes after the client's last request.
This can be improved by shortening the fetch history to only 5 minutes.
This time should be sufficient to complete requests for all replicas of a
descriptor, but without ending in an infinite request loop.
Compatibility:
All proposed improvements are compatible with the currently implemented
design as described in proposal 114.
Filename: 144-enforce-distinct-providers.txt
Title: Increase the diversity of circuits by detecting nodes belonging to the
same provider
Author: Mfr
Created: 2008-06-15
Status: Draft
Overview:
Increase network security by reducing the ability of relay operators
or ISPs, whether monitoring on their own or under legal requisition,
to observe a large part of Tor traffic and thereby break circuit
privacy. A way to increase the diversity of circuits without killing
network performance.
Motivation:
Since Roger and Nick's 2004 publication about diversity [1], the very
fast Tor relays have become concentrated among half a dozen providers,
which control the traffic of some dozens of routers [2].
Likewise, the spread of clonable VMs billed by the hour makes it
possible to start, within a few minutes and at small cost, a set of
very high-speed relays that within a few hours can attract a large
amount of traffic for analysis, increasing the vulnerability of the
network.
Whether ISPs or domU providers, these operators usually hold several
class B IP blocks. The existing EnforceDistinctSubnets restriction,
which automatically excludes nodes in the same class B subnet, is
therefore only partially effective; by contrast, a restriction at the
class A level would be too restrictive.
Therefore it seems necessary to consider another approach.
Proposal:
Add a provider check based on the AS number, which the router adds to
its descriptor; it is verified by the directory authorities and used
like the declarative family field when building circuits.
Design:
Step 1 :
Add to the router descriptor provider information, obtained via a
DNS request [4] made by the router itself.
"provider" name NL
'name' is the AS number of the router, formatted like this:
'ASxxxxxx', where AS is fixed and xxxxxx is the AS number,
left aligned (ex: AS98304, AS4096, AS1). If the AS number
is missing, the network's class A number is used instead:
'ANxxx', where AN is fixed and xxx is the first octet of
the IP (ex: AN1 for the IP 1.1.1.2). An 'L' value is set
if it is a local-network IP.
If two ORs list one another in their "provider" entries,
then OPs should treat them as a single OR for the purpose
of path selection.
For example, if node A's descriptor contains "provider B",
and node B's descriptor contains "provider A", then node A
and node B should never be used on the same circuit.
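As an illustrative sketch of the formatting rules above (the helper is
hypothetical; a real AS lookup would use the DNS service from [4]):

```python
import ipaddress

def provider_token(ip, asn=None):
    """Format the 'provider' value described above: 'AS<number>' when
    the AS number is known, 'AN<first octet>' for a public address with
    no known AS, or 'L' for a local-network address."""
    addr = ipaddress.ip_address(ip)
    if addr.is_private or addr.is_loopback:
        return "L"
    if asn is not None:
        return "AS%d" % asn
    return "AN" + ip.split(".")[0]
```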
Add the regarding config option in torrc
EnforceDistinctProviders set to 1 by default.
Permit building circuits with relays in the same provider
if set to 0.
Following proposal 135, if TestingTorNetwork is set, then
EnforceDistinctProviders needs to be unset.
Control of the AS numbers by the directory authorities:
The directory authorities check the AS number in each newly
uploaded node descriptor.
If the node runs an old version, this test is bypassed.
If the AS number obtained by request differs from the one in
the descriptor, the router is flagged as non-Valid by the
testing authority for the voting process.
Step 2: When a 'significant number' of valid routers are
generating descriptors with provider information.
Add functionality for the circuit user to obtain missing
provider information by DNS request:
During circuit building, the OP first applies the family
check and the EnforceDistinctSubnets directive for
performance; then, if provider info is needed but missing
from the router descriptor, it tries to get the AS provider
info by DNS request [4]. This information could be DNS
cached. An AN (class A number) is never generated during
this process, to prevent DNS blocking problems. If the DNS
request fails, ignore it and continue building the
circuit.
Step 3: When the 'whole majority' of valid Tor clients are
performing these DNS requests.
Older versions are deprecated and marked as non-Valid.
EnforceDistinctProviders replaces the EnforceDistinctSubnets
functionality; EnforceDistinctSubnets is removed.
The functionality deployed in step 2 is removed.
Security implications:
This measure will increase the number of provider
addresses that an attacker must use in order to carry out
traffic analysis.
Compatibility:
The presented protocol does not raise compatibility issues
with current Tor versions. The compatibility is preserved by
implementing this functionality in 3 steps, giving time to
network users to upgrade clients and routers.
Performance and scalability notes:
Requiring a distinct provider for every router could slightly
reduce performance if the circuit is long.
During step 2, fetching missing provider information could
increase path-building time and should have a timeout.
Possible Attacks/Open Issues/Some thinking required:
This proposal seems to be compatible with proposal 135, Simplify
Configuration of Private Tor Networks.
This proposal does not resolve multiples AS owners and top
providers traffic monitoring attacks [5].
Unresolved AS numbers are treated as class A networks. Perhaps
they should be marked as invalid, but there were only five such
items at the last check; see [2].
Need to define what's a 'significant number of nodes' and
'whole majority' ;-)
References:
[1] Location Diversity in Anonymity Networks by Nick Feamster and Roger
Dingledine.
In the Proceedings of the Workshop on Privacy in the Electronic Society
(WPES 2004), Washington, DC, USA, October 2004
http://freehaven.net/anonbib/#feamster:wpes2004
[2] http://as4jtw5gc6efb267.onion/IPListbyAS.txt
[3] see Goodell Tor Exit Page
http://cassandra.eecs.harvard.edu/cgi-bin/exit.py
[4] see the great IP to ASN DNS Tool
http://www.team-cymru.org/Services/ip-to-asn.html
[5] Sampled Traffic Analysis by Internet-Exchange-Level Adversaries by
Steven J. Murdoch and Piotr Zielinski.
In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies
(PET 2007), Ottawa, Canada, June 2007.
http://freehaven.net/anonbib/#murdoch-pet2007
[6] http://bugs.noreply.org/flyspray/index.php?do=details&id=690

Filename: 145-newguard-flag.txt
Title: Separate "suitable as a guard" from "suitable as a new guard"
Author: Nick Mathewson
Created: 1-Jul-2008
Status: Open
Target: 0.2.1.x
[This could be obsoleted by proposal 141, which could replace NewGuard
with a Guard weight.]
Overview
Right now, Tor has one flag that clients use both to tell which
nodes should be kept as guards, and which nodes should be picked
when choosing new guards. This proposal separates this flag into
two.
Motivation
Balancing clients among guards is not done well by our current
algorithm. When a new guard appears, it is chosen by clients
looking for a new guard with the same probability as all existing
guards... but new guards are likelier to be under capacity, whereas
old guards are likelier to be more heavily used.
Implementation
We add a new flag, NewGuard. Clients will change so that when they
are choosing new guards, they only consider nodes with the NewGuard
flag set.
For now, authorities will always set NewGuard if they are setting
the Guard flag. Later, it will be easy to migrate authorities to
set NewGuard for underused guards.
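The client-side change above can be sketched like this (illustrative node records; the `flags` field name is an assumption, not Tor's internal representation):

```python
def pick_new_guard_candidates(nodes):
    """When choosing *new* guards, clients consider only nodes
    carrying the NewGuard flag; already-chosen guards are unaffected
    and continue to be used as long as they have the Guard flag."""
    return [n for n in nodes if "NewGuard" in n["flags"]]
```

Since authorities initially set NewGuard whenever they set Guard, this filter changes nothing at first; it only takes effect once authorities start reserving NewGuard for underused guards.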
Alternatives
We might instead have authorities list weights with which nodes
should be picked as guards.

Filename: 146-long-term-stability.txt
Title: Add new flag to reflect long-term stability
Author: Nick Mathewson
Created: 19-Jun-2008
Status: Open
Target: 0.2.1.x
Overview
This document proposes a new flag to indicate that a router has
existed at the same address for a long time, describes how to
implement it, and explains what it's good for.
Motivation
Tor has had three notions of "stability" for servers. Older
directory protocols based a server's stability on its
(self-reported) uptime: a server that had been running for a day was
more stable than a server that had been running for five minutes,
regardless of their past history. Current directory protocols track
weighted mean time between failure (WMTBF) and weighted fractional
uptime (WFU). WFU is computed as the fraction of time for which the
server is running, with measurements weighted to exponentially
decay such that old days count less. WMTBF is computed as the
average length of intervals for which the server runs between
downtime, with old intervals weighted to count less.
WMTBF is useful in answering the question: "If a server is running
now, how long is it likely to stay running?" This makes it a good
choice for picking servers for streams that need to be long-lived.
WFU is useful in answering the question: "If I try connecting to
this server at an arbitrary time, is it likely to be running?" This
makes it an important factor for picking guard nodes, since we want
guard nodes to be usually-up.
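The two stability measures can be sketched as follows (a simplified model: per-day uptime records, oldest first, with an illustrative decay factor; Tor's real accounting works over weighted intervals):

```python
def wfu(daily_uptime, decay=0.95):
    """Weighted fractional uptime: fraction of time the server was
    running, with older days exponentially decayed.
    `daily_uptime` is a list of (up_seconds, total_seconds), oldest first."""
    num = den = 0.0
    for age, (up, total) in enumerate(reversed(daily_uptime)):
        w = decay ** age          # older days count less
        num += w * up
        den += w * total
    return num / den if den else 0.0

def wmtbf(run_lengths, decay=0.95):
    """Weighted mean time between failure: average length of runs
    between downtime, with older runs decayed (oldest first)."""
    num = den = 0.0
    for age, run in enumerate(reversed(run_lengths)):
        w = decay ** age
        num += w * run
        den += w
    return num / den if den else 0.0
```

Neither measure rewards staying at the same address, which is the gap the Longterm flag below is meant to fill.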
There are other questions that clients want to answer, however, for
which the current flags aren't very useful. The one that this
proposal addresses is,
"If I found this server in an old consensus, is it likely to
still be running at the same address?"
This one is useful when we're trying to find directory mirrors in a
fallback-consensus file. This property is equivalent to,
"If I find this server in a current consensus, how long is it
likely to exist on the network?"
This one is useful if we're trying to pick introduction points or
something and care more about churn rate than about whether every IP
will be up all the time.
Implementation:
I propose we add a new flag, called "Longterm." Authorities should
set this flag for routers if their Longevity is in the upper
quartile of all routers. A router's Longevity is computed as the
total number of days in the last year or so[*] for which the router has
been Running at least once at its current IP:orport pair.
Clients should use directory servers from a fallback-consensus only
if they have the Longterm flag set.
Authority ops should be able to mark particular routers as not
Longterm, regardless of history. (For instance, it makes sense to
remove the Longterm flag from a router whose op says that it will
need to shutdown in a month.)
[*] This is deliberately vague, to permit efficient implementations.
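The upper-quartile rule could be sketched as follows (hypothetical longevity values in days; the percentile cut is a simple nearest-rank choice, one of several valid implementations):

```python
def assign_longterm(longevity):
    """Given {router: days Running at its current IP:orport}, flag the
    routers whose Longevity falls in the upper quartile."""
    values = sorted(longevity.values())
    # Nearest-rank 75th-percentile threshold.
    cutoff = values[(3 * len(values)) // 4]
    return {r for r, days in longevity.items() if days >= cutoff}
```

An authority-op override (removing Longterm from a router known to be shutting down soon) would simply subtract from the returned set.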
Compatibility and migration issues:
The voting protocol already acts gracefully when new flags are
added, so no change to the voting protocol is needed.
Tor won't have collected this data, however. It might be desirable
to bootstrap it from historical consensuses. Alternatively, we can
just let the algorithm run for a month or two.
Issues and future possibilities:
Longterm is a really awkward name.

Filename: 147-prevoting-opinions.txt
Title: Eliminate the need for v2 directories in generating v3 directories
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Accepted
Target: 0.2.1.x
Overview
We propose a new v3 vote document type to replace the role of v2
networkstatus information in generating v3 consensuses.
Motivation
When authorities vote on which descriptors are to be listed in the
next consensus, it helps if they all know about the same descriptors
as one another. But a hostile, confused, or out-of-date server may
upload a descriptor to only some authorities. In the current v3
directory design, the authorities don't have a good way to tell one
another about the new descriptor until they exchange votes... but by
the time this happens, they are already committed to their votes,
and they can't add anybody they learn about from other authorities
until the next voting cycle. That's no good!
The current Tor implementation avoids this problem by having
authorities also look at v2 networkstatus documents, but we'd like
in the long term to eliminate these, once 0.1.2.x is obsolete.
Design:
We add a new value for vote-status in v3 consensus documents in
addition to "consensus" and "vote": "opinion". Authorities generate
and sign an opinion document as if they were generating a vote,
except that they generate opinions earlier than they generate votes.
Authorities don't need to generate more than one opinion document
per voting interval, but may. They should send it to the other
authorities they know about, at the regular vote upload URL, before
the authorities begin voting, so that enough time remains for the
authorities to fetch new descriptors.
Additionally, authorities make their opinions available at
http://<hostname>/tor/status-vote/next/opinion.z
and download opinions from authorities they haven't heard from in a
while.
Authorities MAY generate opinions on demand.
Upon receiving an opinion document, authorities scan it for any
descriptors that:
- They might accept.
- Are for routers they don't know about, or are published more
recently than any descriptor they have for that router.
Authorities then begin downloading such descriptors from authorities
that claim to have them.
Authorities MAY cache opinion documents, but don't need to.
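The descriptor-scan step above could be sketched as follows (a simplified model in which opinions and local state are maps from router identity to descriptor publication time):

```python
def descriptors_to_fetch(opinion, known):
    """Scan an opinion document for descriptors worth downloading:
    those for routers we don't know about, or published more recently
    than any descriptor we hold for that router.  (The 'might accept'
    filter is omitted here for brevity.)"""
    return {r for r, published in opinion.items()
            if r not in known or published > known[r]}
```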

Filename: 148-uniform-client-end-reason.txt
Title: Stream end reasons from the client side should be uniform
Author: Roger Dingledine
Created: 2-Jul-2008
Status: Closed
Implemented-In: 0.2.1.9-alpha
Overview
When a stream closes before it's finished, the end relay cell that's
sent includes an "end stream reason" to tell the other end why it
closed. It's useful for the exit relay to send a reason to the client,
so the client can choose a different circuit, inform the user, etc. But
there's no reason to include it from the client to the exit relay,
and in some cases it can even harm anonymity.
We should pick a single reason for the client-to-exit-relay direction
and always just send that.
Motivation
Back when I first deployed the Tor network, it was useful to have
the Tor relays learn why a stream closed, so I could debug both ends
of the stream at once. Now that streams have worked for many years,
there's no need to continue telling the exit relay whether the client
gave up on a stream because of "timeout" or "misc" or what.
Then in Tor 0.2.0.28-rc, I fixed this bug:
- Fix a bug where, when we were choosing the 'end stream reason' to
put in our relay end cell that we send to the exit relay, Tor
clients on Windows were sometimes sending the wrong 'reason'. The
anonymity problem is that exit relays may be able to guess whether
the client is running Windows, thus helping partition the anonymity
set. Down the road we should stop sending reasons to exit relays,
or otherwise prevent future versions of this bug.
It turned out that non-Windows clients were choosing their reason
correctly, whereas Windows clients were potentially looking at errno
wrong and so always choosing 'misc'.
I fixed that particular bug, but I think we should prevent future
versions of the bug too.
(We already fixed it so *circuit* end reasons don't get sent from
the client to the exit relay. But we appear to have skipped over
stream end reasons thus far.)
Design:
One option would be to no longer include any 'reason' field in end
relay cells. But that would introduce a partitioning attack ("users
running the old version" vs "users running the new version").
Instead I suggest that clients all switch to sending the "misc" reason,
like most of the Windows clients currently do and like the non-Windows
clients already do sometimes.

Filename: 149-using-netinfo-data.txt
Title: Using data from NETINFO cells
Author: Nick Mathewson
Created: 2-Jul-2008
Status: Open
Target: 0.2.1.x
Overview
Current Tor versions send signed IP and timestamp information in
NETINFO cells, but don't use them to their fullest. This proposal
describes how they should start using this info in 0.2.1.x.
Motivation
Our directory system relies on clients and routers having
reasonably accurate clocks to detect replayed directory info, and
to set accurate timestamps on directory info they publish
themselves. NETINFO cells contain timestamps.
Also, the directory system relies on routers having a reasonable
idea of their own IP addresses, so they can publish correct
descriptors. This is also in NETINFO cells.
Learning the time and IP address
We need to think about attackers here. Just because a router tells
us that we have a given IP or a given clock skew doesn't mean that
it's true. We believe this information only if we've heard it from
a majority of the routers we've connected to recently, including at
least 3 routers. Routers only believe this information if the
majority includes at least one authority.
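The believe-the-majority rule could be sketched as follows (assumed data shape: one (value, is_authority) report per recently connected router; the authority requirement applies only when running as a router):

```python
from collections import Counter

def believed_value(reports, require_authority=False):
    """Believe a NETINFO-reported value (our IP, or clock skew) only if
    a majority of recently connected routers agree and that majority
    contains at least 3 routers.  Routers additionally require that the
    majority include at least one authority."""
    if not reports:
        return None
    value, count = Counter(v for v, _ in reports).most_common(1)[0]
    if count * 2 <= len(reports) or count < 3:
        return None
    if require_authority and not any(
            auth for v, auth in reports if v == value and auth):
        return None
    return value
```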
Avoiding MITM attacks
Current Tors use the IP addresses published in the other router's
NETINFO cells to see whether the connection is "canonical". Right
now, we prefer to extend circuits over "canonical" connections. In
0.2.1.x, we should refuse to extend circuits over non-canonical
connections without first trying to build a canonical one.

Filename: 150-exclude-exit-nodes.txt
Title: Exclude Exit Nodes from a circuit
Author: Mfr
Created: 2008-06-15
Status: Closed
Implemented-In: 0.2.1.3-alpha
Overview
Right now, Tor users can manually exclude a node from all positions
in their circuits created using the directive ExcludeNodes.
This proposal makes this exclusion less restrictive, allowing users to
exclude a node only from the exit part of a circuit.
Motivation
This feature would help the integration into Vidalia (tor exit
branch) or other tools of features to exclude a country for exit
without reducing circuit possibilities or privacy. This feature
could help people from a country where many sites are blocked to
exclude that country for browsing, giving them more stable
navigation. It could also add the possibility for the user to
exclude a currently used exit node.
Implementation
ExcludeExitNodes is similar to ExcludeNodes, except that only
the exit node is excluded during circuit building.
Tor doesn't warn if a node from this list is not an exit node.
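The position-dependent exclusion could be sketched as a path-selection helper (illustrative; node names stand in for router identities):

```python
def eligible(nodes, position, exclude_nodes, exclude_exit_nodes):
    """Nodes usable at a given circuit position.  ExcludeNodes bars a
    node from every position; ExcludeExitNodes bars it only from the
    exit position."""
    barred = set(exclude_nodes)
    if position == "exit":
        barred |= set(exclude_exit_nodes)
    return [n for n in nodes if n not in barred]
```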
Security implications:
This also opens possibilities for future user reporting of bad exits.
Risks:
Use of this option can make users partitionable under certain attack
assumptions. However, ExitNodes already creates this possibility,
so there isn't much increased risk in ExcludeExitNodes.
We should still encourage people who exclude an exit node because
of bad behavior to report it instead of just adding it to their
ExcludeExit list. It would be unfortunate if we didn't find out
about broken exits because of this option. This issue can probably
be addressed sufficiently with documentation.

Filename: 151-path-selection-improvements.txt
Title: Improving Tor Path Selection
Author: Fallon Chen, Mike Perry
Created: 5-Jul-2008
Status: Finished
In-Spec: path-spec.txt
Overview
The performance of paths selected can be improved by adjusting the
CircuitBuildTimeout and avoiding failing guard nodes. This proposal
describes a method of tracking buildtime statistics at the client, and
using those statistics to adjust the CircuitBuildTimeout.
Motivation
Tor's performance can be improved by excluding those circuits that
have long buildtimes (and by extension, high latency). For those Tor
users who require better performance and have lower requirements for
anonymity, this would be a very useful option to have.
Implementation
Gathering Build Times
Circuit build times are stored in the circular array
'circuit_build_times' consisting of uint32_t elements as milliseconds.
The total size of this array is based on the number of circuits
it takes to converge on a good fit of the long term distribution of
the circuit builds for a fixed link. We do not want this value to be
too large, because it will make it difficult for clients to adapt to
moving between different links.
From our observations, the minimum value for a reasonable fit appears
to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep
a good fit over the long term, we store 5000 most recent circuits in
the array (NCIRCUITS_TO_OBSERVE).
The Tor client will build test circuits at a rate of one per
minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of
MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have
a CircuitBuildTimeout estimated within 8 hours after install,
upgrade, or network change (see below).
Long Term Storage
The long-term storage representation is implemented by storing a
histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
writing out the statistics to disk. The format this takes in the
state file is 'CircuitBuildTimeBin <bin-ms> <count>', with the total
specified as 'TotalBuildTimes <total>'.
Example:
TotalBuildTimes 100
CircuitBuildTimeBin 25 50
CircuitBuildTimeBin 75 25
CircuitBuildTimeBin 125 13
...
Reading the histogram in will entail inserting <count> values
into the circuit_build_times array each with the value of
<bin-ms> milliseconds. In order to evenly distribute the values
in the circular array, the Fisher-Yates shuffle will be performed
after reading values from the bins.
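The state-file round trip described above might look like this (a simplified sketch: bin centers at the midpoint of each 50 ms bucket, matching the example lines, with `random.shuffle` standing in for the Fisher-Yates shuffle it implements):

```python
import random

BUILDTIME_BIN_WIDTH = 50  # ms, the default bucket width above

def to_state_lines(build_times_ms):
    """Serialize build-time samples as TotalBuildTimes /
    CircuitBuildTimeBin state-file lines."""
    bins = {}
    for ms in build_times_ms:
        center = (ms // BUILDTIME_BIN_WIDTH) * BUILDTIME_BIN_WIDTH \
            + BUILDTIME_BIN_WIDTH // 2
        bins[center] = bins.get(center, 0) + 1
    lines = ["TotalBuildTimes %d" % len(build_times_ms)]
    lines += ["CircuitBuildTimeBin %d %d" % (c, n)
              for c, n in sorted(bins.items())]
    return lines

def from_state_lines(lines):
    """Rebuild the sample array: insert <count> copies of <bin-ms>,
    then shuffle so values are evenly spread through the circular
    circuit_build_times array."""
    times = []
    for line in lines:
        parts = line.split()
        if parts and parts[0] == "CircuitBuildTimeBin":
            times.extend([int(parts[1])] * int(parts[2]))
    random.shuffle(times)  # Fisher-Yates shuffle
    return times
```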
Learning the CircuitBuildTimeout
Based on studies of build times, we found that the distribution of
circuit buildtimes appears to be a Frechet distribution. However,
estimators and quantile functions of the Frechet distribution are
difficult to work with and slow to converge. So instead, since we
are only interested in the accuracy of the tail, we approximate
the tail of the distribution with a Pareto curve starting at
the mode of the circuit build time sample set.
We will calculate the parameters for a Pareto distribution
fitting the data using the estimators at
http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.
The timeout itself is calculated by using the quantile function (the
inverted CDF) to give us the value on the CDF such that
BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is
below the timeout value.
Thus, we expect that the Tor client will accept the fastest 80% of
the total number of paths on the network.
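The fit-and-cutoff computation could be sketched as follows (using the maximum-likelihood estimator for the Pareto shape parameter from the referenced page; `xm` is passed in, since the proposal takes it from the mode of the sample set):

```python
import math

BUILDTIME_PERCENT_CUTOFF = 0.80

def pareto_timeout(samples, xm):
    """Fit the Pareto shape parameter alpha by maximum likelihood for
    scale xm, then invert the CDF so that 80% of the distribution's
    mass lies below the returned timeout value."""
    alpha = len(samples) / sum(math.log(x / xm) for x in samples if x > xm)
    # Quantile function: F^-1(q) = xm / (1 - q)^(1/alpha)
    return xm / (1.0 - BUILDTIME_PERCENT_CUTOFF) ** (1.0 / alpha)
```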
Detecting Changing Network Conditions
We attempt to detect both network connectivity loss and drastic
changes in the timeout characteristics.
We assume that we've had network connectivity loss if 3 circuits
timeout and we've received no cells or TLS handshakes since those
circuits began. We then set the timeout to 60 seconds and stop
counting timeouts.
If 3 more circuits timeout and the network still has not been
live within this new 60 second timeout window, we then discard
the previous timeouts during this period from our history.
To detect changing network conditions, we keep a history of
the timeout or non-timeout status of the past RECENT_CIRCUITS (20)
that successfully completed at least one hop. If more than 75%
of these circuits timeout, we discard all buildtimes history,
reset the timeout to 60, and then begin recomputing the timeout.
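The recent-circuits check could be sketched like this (constants from the text; `recent_outcomes` is an assumed list of timeout/no-timeout booleans for circuits that completed at least one hop):

```python
RECENT_CIRCUITS = 20
RESET_CUTOFF = 0.75  # discard history above this timeout fraction

def should_reset_timeout(recent_outcomes):
    """True when more than 75% of the last RECENT_CIRCUITS circuits
    timed out, signalling that build-time history should be discarded
    and the timeout reset to 60 seconds."""
    window = recent_outcomes[-RECENT_CIRCUITS:]
    return len(window) == RECENT_CIRCUITS and \
        sum(window) / RECENT_CIRCUITS > RESET_CUTOFF
```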
Testing
After circuit build times, storage, and learning are implemented,
the resulting histogram should be checked for consistency by
verifying it persists across successive Tor invocations where
no circuits are built. In addition, we can also use the existing
buildtime scripts to record build times, and verify that the histogram
the Python scripts produce matches the one output to the state file by Tor,
and verify that the Pareto parameters and cutoff points also match.
We will also verify that there are no unexpected large deviations from
node selection, such as nodes from distant geographical locations being
completely excluded.
Dealing with Timeouts
Timeouts should be counted as the expectation of the region
of the Pareto distribution beyond the cutoff. This is done by
generating a random sample for each timeout at points on the
curve beyond the current timeout cutoff.
Future Work
At some point, it may be desirable to change the cutoff from a
single hard cutoff that destroys the circuit to a soft cutoff and
a hard cutoff, where the soft cutoff merely triggers the building
of a new circuit, and the hard cutoff triggers destruction of the
circuit.
It may also be beneficial to learn separate timeouts for each
guard node, as they will have slightly different distributions.
This will take longer to generate initial values though.
Issues
Impact on anonymity
Since this follows a Pareto distribution, large reductions on the
timeout can be achieved without cutting off a great number of the
total paths. This will eliminate a great deal of the performance
variation of Tor usage.

Filename: 152-single-hop-circuits.txt
Title: Optionally allow exit from single-hop circuits
Author: Geoff Goodell
Created: 13-Jul-2008
Status: Closed
Implemented-In: 0.2.1.6-alpha
Overview
Provide a special configuration option that adds a line to descriptors
indicating that a router can be used as an exit for one-hop circuits,
and allow clients to attach streams to one-hop circuits provided
that the descriptor for the router in the circuit includes this
configuration option.
Motivation
At some point, code was added to restrict the attachment of streams
to one-hop circuits.
The idea seems to be that we can use the cost of forking and
maintaining a patch as a lever to prevent people from writing
controllers that jeopardize the operational security of routers
and the anonymity properties of the Tor network by creating and
using one-hop circuits rather than the standard three-hop circuits.
It may be, for example, that some users do not actually seek true
anonymity but simply reachability through network perspectives
afforded by the Tor network, and since anonymity is stronger in
numbers, forcing users to contribute to anonymity and decrease the
risk to server operators by using full-length paths may be reasonable.
As presently implemented, the sweeping restriction of one-hop circuits
for all routers limits the usefulness of Tor as a general-purpose
technology for building circuits. In particular, we should allow
for controllers, such as Blossom, that create and use single-hop
circuits involving routers that are not part of the Tor network.
Design
Introduce a configuration option for Tor servers that, when set,
indicates that a router is willing to provide exit from one-hop
circuits. Routers with this policy will not require that a circuit
has at least two hops when it is used as an exit.
In addition, routers for which this configuration option
has been set will have a line in their descriptors, "opt
exit-from-single-hop-circuits". Clients will keep track of which
routers have this option and allow streams to be attached to
single-hop circuits that include such routers.
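Client-side handling of the descriptor line could be sketched as (a minimal sketch over raw descriptor lines):

```python
def allows_single_hop_exit(descriptor_lines):
    """True if a router's descriptor advertises willingness to serve
    as the exit of a one-hop circuit."""
    return any(line.strip() == "opt exit-from-single-hop-circuits"
               for line in descriptor_lines)
```

A client would attach a stream to a one-hop circuit only when this check passes for the circuit's single router.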
Security Considerations
This approach seems to eliminate the worry about operational router
security, since server operators will not set the configuration
option unless they are willing to take on such risk.
To reduce the impact on anonymity of the network resulting
from including such "risky" routers in regular Tor path
selection, clients may systematically exclude routers with "opt
exit-from-single-hop-circuits" when choosing random paths through
the Tor network.

Filename: 153-automatic-software-update-protocol.txt
Title: Automatic software update protocol
Author: Jacob Appelbaum
Created: 14-July-2008
Status: Superseded
[Superseded by thandy-spec.txt]
Automatic Software Update Protocol Proposal
0.0 Introduction
The Tor project and its users require a robust method to update shipped
software bundles. The software bundles often include Vidalia, Privoxy, Polipo,
Torbutton, and of course Tor itself. It is not inconceivable that an update
could include all of the Tor Browser Bundle. It seems reasonable to make this
a standalone program that can be called in shell scripts, cronjobs or by
various Tor controllers.
0.1 Minimal Tasks To Implement Automatic Updating
At the most minimal, an update must be able to do the following:
0 - Detect the current Tor version, note the working status of Tor.
1 - Detect the latest Tor version.
2 - Fetch the latest version in the form of a platform specific package(s).
3 - Verify the integrity of the downloaded package(s).
4 - Install the verified package(s).
5 - Test that the new package(s) works properly.
0.2 Specific Enumeration Of Minimal Tasks
To implement requirement 0, we need to detect the current Tor version of both
the updater and the current running Tor. The update program itself should be
versioned internally. This requirement should also test connecting through Tor
itself and note if such connections are possible.
To implement requirement 1, we need to learn the consensus from the directory
authorities or fall back to a known-good URL with cryptographically signed
content.
To implement requirement 2, we need to download Tor - hopefully over Tor.
To implement requirement 3, we need to verify the package signature.
To implement requirement 4, we need to use a platform-specific method of
installation. The Tor controller performing the update performs these
platform-specific methods.
To implement requirement 5, we need to be able to extend circuits and reach
the internet through Tor.
0.x Implementation Goals
The update system will be cross platform and rely on as little external code
as possible. If the update system uses it, it must be updated by the update
system itself. It will consist only of free software and will not rely on any
non-free components until the actual installation phase. If a package manager
is in use, it will be platform specific and thus only invoked by the update
system implementing the update protocol.
The update system itself will attempt to perform update related network
activity over Tor. Possibly it will attempt to use a hidden service first.
It will attempt to use novel and not so novel caching
when possible, it will always verify cryptographic signatures before any
remotely fetched code is executed. In the event of an unusable Tor system,
it will be able to attempt to fetch updates without Tor. This should be user
configurable, some users will be unwilling to update without the protection of
using Tor - others will simply be unable because of blocking of the main Tor
website.
The update system will track current version numbers of Tor and supporting
software. The update system will also track known working versions to assist
with automatic recovery. The update system itself will be a standalone library. It will be
strongly versioned internally to match the Tor bundle it was shipped with. The
update system will keep track of the given platform, CPU architecture, lsb_release,
package management functionality, and any other platform-specific metadata.
We have referenced two popular automatic update systems, though neither fit
our needs, both are useful as an idea of what others are doing in the same
area.
The first is sparkle[0] but it is sadly only available for Cocoa
environments and is written in Objective C. This doesn't meet our requirements
because it is directly tied into the private Apple framework.
The second is the Mozilla Automatic Update System[1]. It is possibly useful
as an idea of how other free software projects automatically update. It is
however not useful in its currently documented form.
[0] http://sparkle.andymatuschak.org/documentation/
[1] http://wiki.mozilla.org/AUS:Manual
0.x Previous methods of Tor and related software update
Previously, Tor users updated their Tor related software by hand. There has
been no fully automatic method for any user to update. In addition, there
hasn't been any specific way to find out the most current stable version of Tor
or related software as voted on by the directory authority consensus.
0.x Changes to the directory specification
We will want to supplement client-versions and server-versions in the
consensus voting with another version identifier known as
'auto-update-versions'. This will keep track of the current consensus of
specific versions that are best per platform and per architecture. It should
be noted that while the Mac OS X universal binary may be the best for x86
processors with Tiger, it may not be the best for PPC users on Panther. This
goes for all of the package updates. We want to prevent updates that cause Tor
to break even if the updating program can recover gracefully.
x.x Assumptions About Operating System Package Management
It is assumed that users will use their package manager unless they are on
Microsoft Windows (any version) or Mac OS X (any version). Microsoft Windows
users will have integration with the normal "add/remove program" functionality
that said users would expect.
x.x Package Update System Failure Modes
The package update will try to ensure that a user always has a working Tor at
the very least. It will keep state to remember versions of Tor that were able
to bootstrap properly and reach the rest of the Tor network. It will also keep
note of which versions broke. It will select the best Tor that works for the
user. It will also allow for anonymized bug reporting on the packages
available and tested by the auto-update system.
x.x Package Signature Verification
The update system will be aware of replay attacks against the update signature
system itself. It will not allow package update signatures that are radically
out of date. It will be a multi-key system to prevent any single party from
forging an update. The key will be updated regularly. This is like authority
key (see proposal 103) usage.
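The freshness-and-multi-key rule could be sketched as follows (illustrative thresholds; key IDs and timestamps are assumed inputs, not a defined wire format):

```python
def signatures_acceptable(sig_timestamps, now, max_age, required_keys):
    """Accept a package's signature set only if enough distinct keys
    signed it (so no single party can forge an update) and no counted
    signature is radically out of date (replay protection).
    `sig_timestamps` maps key-id -> signing time in seconds."""
    fresh = {k for k, t in sig_timestamps.items() if now - t <= max_age}
    return len(fresh) >= required_keys
```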
x.x Package Caching
The update system will iterate over different update methods. Whichever method
is picked will have caching functionality. Each Tor server itself should be
able to serve cached update files. This will be an option that friendly server
administrators can turn on should they wish to support caching. In addition,
it is possible to cache the full contents of a package in an
authoritative DNS zone. Users can then query the DNS zone for their package.
If we wish to further distribute the update load, we can also offer packages
with encrypted bittorrent. Clients who wish to share the updates but do not
wish to be a server can help distribute Tor updates. This can be tied together
with the DNS caching[2][3] if needed.
[2] http://www.netrogenic.com/dnstorrent/
[3] http://www.doxpara.com/ozymandns_src_0.1.tgz
x.x Helping Our Users Spread Tor
There should be a way for a user to participate in the packaging caching as
described in section x.x. This option should be presented by the Tor
controller.
x.x Simple HTTP Proxy To The Tor Project Website
It has been suggested that we should provide a simple proxy that allows a user
to visit the main Tor website to download packages. This was part of a
previous proposal and has not been closely examined.
x.x Package Installation
Platform specific methods for proper package installation will be left to the
controller that is calling for an update. Each platform is different, the
installation options and user interface will be specific to the controller in
question.
x.x Other Things
Other things should be added to this proposal. What are they?

Filename: 154-automatic-updates.txt
Title: Automatic Software Update Protocol
Author: Matt Edman
Created: 30-July-2008
Status: Superseded
Target: 0.2.1.x
Superseded by thandy-spec.txt
Scope
This proposal specifies the method by which an automatic update client can
determine the most recent recommended Tor installation package for the
user's platform, download the package, and then verify that the package was
downloaded successfully. While this proposal focuses on only the Tor
software, the protocol defined is sufficiently extensible such that other
components of the Tor bundles, like Vidalia, Polipo, and Torbutton, can be
managed and updated by the automatic update client as well.
The initial target platform for the automatic update framework is Windows,
given that's the platform used by a majority of our users and that it lacks
a sane package management system that many Linux distributions already have.
Our second target platform will be Mac OS X, and so the protocol will be
designed with this near-future direction in mind.
Other client-side aspects of the automatic update process, such as user
interaction, the interface presented, and actual package installation
procedure, are outside the scope of this proposal.
Motivation
Tor releases new versions frequently, often with important security,
anonymity, and stability fixes. Thus, it is important for users to be able
to promptly recognize when new versions are available and to easily
download, authenticate, and install updated Tor and Tor-related software
packages.
Tor's control protocol [2] provides a method by which controllers can
identify when the user's Tor software is obsolete or otherwise no longer
recommended. Currently, however, no mechanism exists for clients to
automatically download and install updated Tor and Tor-related software for
the user.
Design Overview
The core of the automatic update framework is a well-defined file called a
"recommended-packages" file. The recommended-packages file is accessible via
HTTP[S] at one or more well-defined URLs. An example recommended-packages
URL may be:
https://updates.torproject.org/recommended-packages
The recommended-packages document is formatted according to Section 1.2
below and specifies the most recent recommended installation package
versions for Tor or Tor-related software, as well as URLs at which the
packages and their signatures can be downloaded.
An automatic update client process runs on the Tor user's computer and
periodically retrieves the recommended-packages file according to the method
described in Section 2.0. As described further in Section 1.2, the
recommended-packages file is signed and can be verified by the automatic
update client with one or more public keys included in the client software.
Since it is signed, the recommended-packages file can be mirrored by
multiple hosts (e.g., Tor directory authorities), whose URLs are included in
the automatic update client's configuration.
After retrieving and verifying the recommended-packages file, the automatic
update client compares the versions of the recommended software packages
listed in the file with those currently installed on the end-user's
computer. If one or more of the installed packages is determined to be out
of date, an updated package and its signature will be downloaded from one of
the package URLs listed in the recommended-packages file as described in
Section 2.2.
The automatic update system uses a multilevel signing key scheme for package
signatures. There are a small number of entities we call "packaging
authorities" that each have their own signing key. A packaging authority is
responsible for signing and publishing the recommended-packages file.
Additionally, each individual packager responsible for producing an
installation package for one or more platforms has their own signing key.
Every packager's signing key must be signed by at least one of the packaging
authority keys.
Specification
1. recommended-packages Specification
In this section we formally specify the format of the published
recommended-packages file.
1.1. Document Meta-format
The recommended-packages document follows the lightweight extensible
information format defined in Tor's directory protocol specification [1]. In
the interest of self-containment, we have reproduced the relevant portions
of that format's specification in this Section. (Credits to Nick Mathewson
for much of the original format definition language.)
The highest level object is a Document, which consists of one or more
Items. Every Item begins with a KeywordLine, followed by zero or more
Objects. A KeywordLine begins with a Keyword, optionally followed by
whitespace and more non-newline characters, and ends with a newline. A
Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
An Object is a block of encoded data in pseudo-Open-PGP-style
armor. (cf. RFC 2440)
More formally:
Document ::= (Item | NL)+
Item ::= KeywordLine Object*
KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
Keyword ::= KeywordChar+
KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
ArgumentChar ::= any printing ASCII character except NL.
WS ::= (SP | TAB)+
Object ::= BeginLine Base-64-encoded-data EndLine
BeginLine ::= "-----BEGIN " Keyword "-----" NL
EndLine ::= "-----END " Keyword "-----" NL
The BeginLine and EndLine of an Object must use the same keyword.
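The grammar above can be sketched as a small parser; this is an illustrative sketch only (helper and function names are invented here), and a real implementation should follow the directory protocol specification [1] exactly:

```python
import re

KEYWORD_RE = re.compile(r'^[A-Za-z0-9-]+$')

def parse_document(text):
    """Split a Document into Items: [keyword, arguments, [objects]]."""
    items = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if not line.strip():          # bare NL between Items
            continue
        if line.startswith('-----BEGIN '):
            # An Object must follow a KeywordLine.
            if not items:
                raise ValueError('Object with no preceding KeywordLine')
            keyword = line[len('-----BEGIN '):-len('-----')]
            body = []
            while i < len(lines) and lines[i] != '-----END %s-----' % keyword:
                body.append(lines[i])
                i += 1
            if i == len(lines):
                raise ValueError('unterminated Object %r' % keyword)
            i += 1                    # skip the EndLine
            items[-1][2].append((keyword, '\n'.join(body)))
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        if not KEYWORD_RE.match(keyword):
            raise ValueError('bad Keyword %r' % keyword)
        args = parts[1] if len(parts) > 1 else ''
        items.append([keyword, args, []])
    return items
```

Note the BeginLine/EndLine keyword match is enforced by searching for the EndLine carrying the same keyword as the BeginLine.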
In our Document description below, we also tag Items with a multiplicity in
brackets. Possible tags are:
"At start, exactly once": These items MUST occur in every instance of the
document type, and MUST appear exactly once, and MUST be the first item in
their documents.
"Exactly once": These items MUST occur exactly one time in every
instance of the document type.
"Once or more": These items MUST occur at least once in any instance
of the document type, and MAY occur more than once.
"At end, exactly once": These items MUST occur in every instance of
the document type, and MUST appear exactly once, and MUST be the
last item in their documents.
1.2. recommended-packages Document Format
When interpreting a recommended-packages Document, software MUST ignore
any KeywordLine that starts with a keyword it doesn't recognize; future
implementations MUST NOT require current automatic update clients to
understand any KeywordLine not currently described.
In lines that take multiple arguments, extra arguments SHOULD be
accepted and ignored.
The currently defined Items contained in a recommended-packages document
are:
"recommended-packages-format" SP number NL
[Exactly once]
This Item specifies the version of the recommended-packages format that
is contained in the subsequent document. The version defined in this
proposal is version "1". Subsequent iterations of this protocol MUST
increment this value if they introduce incompatible changes to the
document format and MAY increment this value if they only introduce
additional Keywords.
"published" SP YYYY-MM-DD SP HH:MM:SS NL
[Exactly once]
The time, in GMT, when this recommended-packages document was generated.
Automatic update clients SHOULD ignore Documents over 60 days old.
"tor-stable-win32-version" SP TorVersion NL
[Exactly once]
This keyword specifies the latest recommended release of Tor's "stable"
branch for the Windows platform that has an installation package
available. Note that this version does not necessarily correspond to the
most recently tagged stable Tor version, since that version may not yet
have an installer package available, or may have known issues on
Windows.
The TorVersion field is formatted according to Section 2 of Tor's
version specification [3].
"tor-stable-win32-package" SP Url NL
[Once or more]
This Item specifies the location from which the most recent
recommended Windows installation package for Tor's stable branch can be
downloaded.
When this Item appears multiple times within the Document, automatic
update clients SHOULD select randomly from the available package
mirrors.
"tor-dev-win32-version" SP TorVersion NL
[Exactly once]
This Item specifies the latest recommended release of Tor's
"development" branch for the Windows platform that has an installation
package available. The same caveats from the description of
"tor-stable-win32-version" also apply to this keyword.
The TorVersion field is formatted according to Section 2 of Tor's
version specification [3].
"tor-dev-win32-package" SP Url NL
[Once or more]
This Item specifies the location from which the most recent recommended
Windows installation package and its signature for Tor's development
branch can be downloaded.
When this Keyword appears multiple times within the Document, automatic
update clients SHOULD select randomly from the available package
mirrors.
"signature" NL SIGNATURE NL
[At end, exactly once]
The "SIGNATURE" Object contains a PGP signature (using a packaging
authority signing key) of the entire document, taken from the beginning
of the "recommended-packages-format" keyword, through the newline after
the "signature" Keyword.
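For illustration, a hypothetical version-1 document conforming to the Items above might look as follows (the version numbers, URLs, and signature body are invented for this example):

```
recommended-packages-format 1
published 2008-10-14 00:00:00
tor-stable-win32-version 0.2.0.31
tor-stable-win32-package https://updates.torproject.org/pkg/tor-0.2.0.31-win32.exe
tor-dev-win32-version 0.2.1.6-alpha
tor-dev-win32-package https://updates.torproject.org/pkg/tor-0.2.1.6-alpha-win32.exe
signature
-----BEGIN SIGNATURE-----
[base64-encoded PGP signature over the preceding lines]
-----END SIGNATURE-----
```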
2. Automatic Update Client Behavior
The client-side component of the automatic update framework is an
application that runs on the end-user's machine. It is responsible for
fetching and verifying a recommended-packages document, as well as
downloading, verifying, and subsequently installing any necessary updated
software packages.
2.1. Download and verify a recommended-packages document
The first step in the automatic update process is for the client to download
a copy of the recommended-packages file. The automatic update client
contains a (hardcoded and/or user-configurable) list of URLs from which it
will attempt to retrieve a recommended-packages file.
Connections to each of the recommended-packages URLs SHOULD be attempted in
the following order:
1) HTTPS over Tor
2) HTTP over Tor
3) Direct HTTPS
4) Direct HTTP
If the client fails to retrieve a recommended-packages document via any of
the above connection methods from any of the configured URLs, the client
SHOULD retry its download attempts following an exponential back-off
algorithm: after the first failed attempt, the client SHOULD delay one hour
before attempting again, increasing the delay after each subsequent failure
up to a maximum of 24 hours between retry attempts.
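The retry schedule can be sketched as follows (assuming simple doubling between attempts, which this proposal does not mandate beyond "exponential back-off"):

```python
def retry_delays(max_hours=24):
    """Yield successive retry delays in hours: 1, 2, 4, ... capped at 24."""
    delay = 1
    while True:
        yield delay
        delay = min(delay * 2, max_hours)
```

With these parameters the first six delays are 1, 2, 4, 8, 16, and 24 hours.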
After successfully downloading a recommended-packages file, the automatic
update client will verify the signature using one of the public keys
distributed with the client software. If more than one recommended-packages
file is downloaded and verified, the file with the most recent "published"
date that is verified will be retained and the rest discarded.
2.2. Download and verify the updated packages
The automatic update client next compares the latest recommended package
version from the recommended-packages document with the currently installed
Tor version. If the user currently has installed a Tor version from Tor's
"development" branch, then the version specified in "tor-dev-*-version" Item
is used for comparison. Similarly, if the user currently has installed a Tor
version from Tor's "stable" branch, then the version specified in the
"tor-stable-*-version" Item is used for comparison. Version comparisons are
done according to Tor's version specification [3].
If the automatic update client determines an installation package newer than
the user's currently installed version is available, it will attempt to
download a package appropriate for the user's platform and Tor branch from a
URL specified by a "tor-[branch]-[platform]-package" Item. If more than one
mirror for the selected package is available, a mirror will be chosen at
random from all those available.
The automatic update client must also download a ".asc" signature file for
the retrieved package. The URL for the package signature is the same as that
for the package itself, except with the extension ".asc" appended to the
package URL.
Connections to download the updated package and its signature SHOULD be
attempted in the same order described in Section 2.1.
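Mirror selection and signature-URL derivation are mechanical; a minimal sketch (function name is illustrative):

```python
import random

def choose_package_and_signature(mirror_urls, rng=random):
    """Pick a package mirror uniformly at random, per Section 2.2, and
    derive the signature URL by appending ".asc" to the package URL."""
    pkg_url = rng.choice(mirror_urls)
    return pkg_url, pkg_url + '.asc'
```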
After completing the steps described in Sections 2.1 and 2.2, the automatic
update client will have downloaded and verified a copy of the latest Tor
installation package. It can then take whatever subsequent platform-specific
steps are necessary to install the downloaded software updates.
2.3. Periodic checking for updates
The automatic update client SHOULD maintain a local state file in which it
records (at a minimum) the timestamp at which it last retrieved a
recommended-packages file and the timestamp at which the client last
successfully downloaded and installed a software update.
Automatic update clients SHOULD check for an updated recommended-packages
document at most once per day but at least once every 30 days.
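The two bounds above can be captured in a small scheduling helper (a sketch; the state-file layout and function name are not specified by this proposal):

```python
DAY = 24 * 60 * 60

def update_check_status(now, last_check):
    """Given Unix timestamps, return (allowed, overdue): whether a
    recommended-packages fetch is permitted (at most once per day) and
    whether one is required (at least once every 30 days)."""
    elapsed = now - last_check
    allowed = elapsed >= DAY
    overdue = elapsed >= 30 * DAY
    return allowed, overdue
```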
3. Future Extensions
There are several possible areas for future extensions of this framework.
The extensions below are merely suggestions and should be the subject of
their own proposal before being implemented.
3.1. Additional Software Updates
There are several software packages often included in Tor bundles besides
Tor, such as Vidalia, Privoxy or Polipo, and Torbutton. The versions and
download locations of updated installation packages for these bundle
components can be easily added to the recommended-packages document
specification above.
3.2. Including ChangeLog Information
It may be useful for automatic update clients to be able to display for
users a summary of the changes made in the latest Tor or Tor-related
software release, before the user chooses to install the update. In the
future, we can add keywords to the specification in Section 1.2 that specify
the location of a ChangeLog file for the latest recommended package
versions. It may also be desirable to allow localized ChangeLog information,
so that the automatic update client can fetch release notes in the
end-user's preferred language.
3.3. Weighted Package Mirror Selection
We defined in Section 1.2 a method by which automatic update clients can
select from multiple available package mirrors. We may want to add a Weight
argument to the "*-package" Items that allows the recommended-packages file
to suggest to clients the probability with which a package mirror should be
chosen. This will allow clients to more appropriately distribute package
downloads across available mirrors proportional to their approximate
bandwidth.
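A hypothetical Weight argument could drive client-side selection like this (a sketch of weighted sampling; the argument itself is not part of the version-1 format):

```python
import random

def pick_mirror(mirrors, rng=random):
    """Choose from a list of (url, weight) pairs with probability
    proportional to weight."""
    total = sum(w for _, w in mirrors)
    r = rng.uniform(0, total)
    for url, w in mirrors:
        r -= w
        if r <= 0:
            return url
    return mirrors[-1][0]  # guard against floating-point edge cases
```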
Implementation
Implementation of this proposal will consist of two separate components.
The first component is a small "au-publish" tool that takes as input a
configuration file specifying the information described in Section 1.2 and a
private key. The tool is run by a "packaging authority" (someone responsible
for publishing updated installation packages), who will be prompted to enter
the passphrase for the private key used to sign the recommended-packages
document. The output of the tool is a document formatted according to
Section 1.2, with a signature appended at the end. The resulting document
can then be published to any of the update mirrors.
The second component is an "au-client" tool that is run on the end-user's
machine. It periodically checks for updated installation packages according
to Section 2 and fetches the packages if necessary. The public keys used
to sign the recommended-packages file and any of the published packages are
included in the "au-client" tool.
References
[1] Tor directory protocol (version 3),
https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/dir-spec.txt
[2] Tor control protocol (version 2),
https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/control-spec.txt
[3] Tor version specification,
https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/version-spec.txt

Filename: 155-four-hidden-service-improvements.txt
Title: Four Improvements of Hidden Service Performance
Author: Karsten Loesing, Christian Wilms
Created: 25-Sep-2008
Status: Finished
Implemented-In: 0.2.1.x
Change history:
25-Sep-2008 Initial proposal for or-dev
Overview:
A performance analysis of hidden services [1] has brought up a few
possible design changes to reduce advertisement time of a hidden service
in the network as well as connection establishment time. Some of these
design changes have side-effects on anonymity or overall network load
which had to be weighed up against individual performance gains. A
discussion of seven possible design changes [2] has led to a selection
of four changes [3] that are proposed to be implemented here.
Design:
1. Shorter Circuit Extension Timeout
When establishing a connection to a hidden service a client cannibalizes
an existing circuit and extends it by one hop to one of the service's
introduction points. In most cases this can be accomplished within a few
seconds. Therefore, the current timeout of 60 seconds for extending a
circuit is far too high.
Assuming that the timeout would be reduced to a lower value, for example
30 seconds, a second (or third) attempt to cannibalize and extend would
be started earlier. With the current timeout of 60 seconds, 93.42% of all
circuits can be established, whereas this fraction would have been only
0.87% smaller at 92.55% with a timeout of 30 seconds.
For a timeout of 30 seconds the performance gain would be approximately 2
seconds in the mean as opposed to the current timeout of 60 seconds. At
the same time a smaller timeout leads to discarding an increasing number
of circuits that might have been completed within the current timeout of
60 seconds.
Measurements with simulated low-bandwidth connectivity have shown that
there is no significant effect of client connectivity on circuit
extension times. The reason for this might be that extension messages are
small and thereby independent of the client bandwidth. Further, the
connection between client and entry node only constitutes a single hop of
a circuit, so that its influence on the whole circuit is limited.
The exact value of the new timeout does not necessarily have to be 30
seconds, but might also depend on the results of circuit build timeout
measurements as described in proposal 151.
2. Parallel Connections to Introduction Points
An additional approach to accelerate extension of introduction circuits
is to extend a second circuit in parallel to a different introduction
point. Such parallel extension attempts should be started after a short
delay of, e.g., 15 seconds in order to prevent unnecessary circuit
extensions and thereby save network resources. Whichever circuit
extension succeeds first is used for introduction, while the other
attempt is aborted.
An evaluation has been performed for the more resource-intensive approach
of starting two parallel circuits immediately instead of waiting for a
short delay. The result was a reduction of connection establishment times
from 27.4 seconds in the original protocol to 22.5 seconds.
While the effect of the proposed approach of delayed parallelization on
mean connection establishment times is expected to be smaller,
variability of connection attempt times can be reduced significantly.
3. Increase Count of Internal Circuits
Hidden services need to create or cannibalize and extend a circuit to a
rendezvous point for every client request. Really popular hidden services
require more than two internal circuits in the pool to answer multiple
client requests at the same time. This scenario was not yet analyzed, but
will probably exhibit worse performance than measured in the previous
analysis. The number of preemptively built internal circuits should be a
function of connection requests in the past to adapt to changing needs.
Furthermore, an increased number of internal circuits on client side
would allow clients to establish connections to more than one hidden
service at a time.
Under the assumption that a popular hidden service cannot make use of
cannibalization for connecting to rendezvous points, the circuit creation
time needs to be added to the current results. In the mean, the
connection establishment time to a popular hidden service would increase
by 4.7 seconds.
4. Build More Introduction Circuits
When establishing introduction points, a hidden service should launch 5
instead of 3 introduction circuits at the same time and use only the
first 3 that could be established. The remaining two circuits could still
be used for other purposes afterwards.
The effect has been simulated using previously measured data, too.
Therefore, circuit establishment times were derived from log files and
written to an array. Afterwards, a simulation with 10,000 runs was
performed picking 5 (4, 6) random values and using the 3 lowest values in
contrast to picking only 3 values at random. The result is that the mean
time of the 3-out-of-3 approach is 8.1 seconds, while the mean time of
the 3-out-of-5 approach is 4.4 seconds.
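The simulation described above can be reproduced in a few lines. Here the measured timing array is replaced by a synthetic distribution, so absolute results will differ from the measured 8.1 s and 4.4 s, but the 3-out-of-5 advantage is preserved:

```python
import random

def mean_intro_time(times, launch, use=3, runs=10000, rng=random):
    """Mean completion time for the slowest of the `use` fastest circuits
    when `launch` circuits are started, sampling build times from `times`."""
    total = 0.0
    for _ in range(runs):
        sample = sorted(rng.choice(times) for _ in range(launch))
        total += sample[use - 1]   # the use-th fastest completes the set
    return total / runs
```

Comparing `mean_intro_time(times, 5)` against `mean_intro_time(times, 3)` on any timing data shows the expected improvement from launching five circuits and keeping the first three.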
The effect on network load is minimal, because the hidden service can
reuse the slower internal circuits for other purposes, e.g., rendezvous
circuits. The only change is that a hidden service starts establishing
more circuits at once instead of subsequently doing so.
References:
[1] http://freehaven.net/~karsten/hidserv/perfanalysis-2008-06-15.pdf
[2] http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf
[3] http://freehaven.net/~karsten/hidserv/design-2008-08-15.pdf

Filename: 156-tracking-blocked-ports.txt
Title: Tracking blocked ports on the client side
Author: Robert Hogan
Created: 14-Oct-2008
Status: Open
Target: 0.2.?
Motivation:
Tor clients that are behind extremely restrictive firewalls can end up
waiting a while for their first successful OR connection to a node on the
network. Worse, the more restrictive their firewall the more susceptible
they are to an attacker guessing their entry nodes. Tor routers that
are behind extremely restrictive firewalls can only offer a limited,
'partitioned' service to other routers and clients on the network. Exit
nodes behind extremely restrictive firewalls may advertise ports that they
are actually not able to connect to, wasting network resources in circuit
constructions that are doomed to fail at the last hop on first use.
Proposal:
When a client attempts to connect to an entry guard it should avoid
further attempts on ports that fail once until it has connected to at
least one entry guard successfully. (Maybe it should wait for more than
one failure to reduce the skew on the first node selection.) Thereafter
it should select entry guards regardless of port and warn the user if
it observes that connections to a given port have failed every multiple
of 5 times without success or since the last success.
Tor should warn the operators of exit, middleman and entry nodes if it
observes that connections to a given port have failed a multiple of 5
times without success or since the last success. If attempts on a port
fail 20 or more times without success (or since the last success), Tor
should add the port
to a 'blocked-ports' entry in its descriptor's extra-info. Some thought
needs to be given to what the authorities might do with this information.
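The escalation thresholds can be summarized as follows (constant names mirror those in the attached patch; note the patch emits the blocked-ports line at PROBABLY_BLOCKED, i.e. 10 failures, while the text above says 20 — this sketch follows the text):

```python
FAILURES_OBSERVED = 1
POSSIBLY_BLOCKED = 5
PROBABLY_BLOCKED = 10
EXTRA_INFO_THRESHOLD = 20  # per the proposal text above

def classify_port(failures):
    """Map a count of unique connection failures (with no intervening
    success) to the action this proposal suggests."""
    if failures >= EXTRA_INFO_THRESHOLD:
        return 'blocked-ports extra-info'
    if failures >= PROBABLY_BLOCKED:
        return 'probably blocked'
    if failures >= POSSIBLY_BLOCKED:
        return 'possibly blocked'
    if failures >= FAILURES_OBSERVED:
        return 'failures observed'
    return 'ok'
```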
Related TODO item:
"- Automatically determine what ports are reachable and start using
those, if circuits aren't working and it's a pattern we
recognize ("port 443 worked once and port 9001 keeps not
working")."
I've had a go at implementing all of this in the attached.
Addendum:
Just a note on the patch, storing the digest of each router that uses the port
is a bit of a memory hog, and its only real purpose is to provide a count of
routers using that port when warning the user. That could be achieved when
warning the user by iterating through the routerlist instead.
Index: src/or/connection_or.c
===================================================================
--- src/or/connection_or.c (revision 17104)
+++ src/or/connection_or.c (working copy)
@@ -502,6 +502,9 @@
connection_or_connect_failed(or_connection_t *conn,
int reason, const char *msg)
{
+ if ((reason == END_OR_CONN_REASON_NO_ROUTE) ||
+ (reason == END_OR_CONN_REASON_REFUSED))
+ or_port_hist_failure(conn->identity_digest,TO_CONN(conn)->port);
control_event_or_conn_status(conn, OR_CONN_EVENT_FAILED, reason);
if (!authdir_mode_tests_reachability(get_options()))
control_event_bootstrap_problem(msg, reason);
@@ -580,6 +583,7 @@
/* already marked for close */
return NULL;
}
+
return conn;
}
@@ -909,6 +913,7 @@
control_event_or_conn_status(conn, OR_CONN_EVENT_CONNECTED, 0);
if (started_here) {
+ or_port_hist_success(TO_CONN(conn)->port);
rep_hist_note_connect_succeeded(conn->identity_digest, now);
if (entry_guard_register_connect_status(conn->identity_digest,
1, now) < 0) {
Index: src/or/rephist.c
===================================================================
--- src/or/rephist.c (revision 17104)
+++ src/or/rephist.c (working copy)
@@ -18,6 +18,7 @@
static void bw_arrays_init(void);
static void predicted_ports_init(void);
static void hs_usage_init(void);
+static void or_port_hist_init(void);
/** Total number of bytes currently allocated in fields used by rephist.c. */
uint64_t rephist_total_alloc=0;
@@ -89,6 +90,25 @@
digestmap_t *link_history_map;
} or_history_t;
+/** or_port_hist_t contains our router/client's knowledge of
+ all OR ports offered on the network, and how many servers with each port we
+ have succeeded or failed to connect to. */
+typedef struct {
+ /** The port this entry is tracking. */
+ uint16_t or_port;
+ /** Have we ever connected to this port on another OR? */
+ unsigned int success:1;
+ /** The ORs using this port. */
+ digestmap_t *ids;
+ /** The ORs using this port we have failed to connect to. */
+ digestmap_t *failure_ids;
+ /** Are we excluding ORs with this port during entry selection?*/
+ unsigned int excluded;
+} or_port_hist_t;
+
+static unsigned int still_searching = 0;
+static smartlist_t *or_port_hists;
+
/** When did we last multiply all routers' weighted_run_length and
* total_run_weights by STABILITY_ALPHA? */
static time_t stability_last_downrated = 0;
@@ -164,6 +184,16 @@
tor_free(hist);
}
+/** Helper: free storage held by a single OR port history entry. */
+static void
+or_port_hist_free(or_port_hist_t *p)
+{
+ tor_assert(p);
+ digestmap_free(p->ids,NULL);
+ digestmap_free(p->failure_ids,NULL);
+ tor_free(p);
+}
+
/** Update an or_history_t object <b>hist</b> so that its uptime/downtime
* count is up-to-date as of <b>when</b>.
*/
@@ -1639,7 +1669,7 @@
tmp_time = smartlist_get(predicted_ports_times, i);
if (*tmp_time + PREDICTED_CIRCS_RELEVANCE_TIME < now) {
tmp_port = smartlist_get(predicted_ports_list, i);
- log_debug(LD_CIRC, "Expiring predicted port %d", *tmp_port);
+ log_debug(LD_HIST, "Expiring predicted port %d", *tmp_port);
smartlist_del(predicted_ports_list, i);
smartlist_del(predicted_ports_times, i);
rephist_total_alloc -= sizeof(uint16_t)+sizeof(time_t);
@@ -1821,6 +1851,12 @@
tor_free(last_stability_doc);
built_last_stability_doc_at = 0;
predicted_ports_free();
+ if (or_port_hists) {
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, p,
+ or_port_hist_free(p));
+ smartlist_free(or_port_hists);
+ or_port_hists = NULL;
+ }
}
/****************** hidden service usage statistics ******************/
@@ -2356,3 +2392,225 @@
tor_free(fname);
}
+/** Create a new entry in the port tracking cache for the or_port in
+ * <b>ri</b>. */
+void
+or_port_hist_new(const routerinfo_t *ri)
+{
+ or_port_hist_t *result;
+ const char *id=ri->cache_info.identity_digest;
+
+ if (!or_port_hists)
+ or_port_hist_init();
+
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ /* Cope with routers that change their advertised OR port or are
+ dropped from the networkstatus. We don't discard the failures of
+ dropped routers because they are still valid when counting
+ consecutive failures on a port.*/
+ if (digestmap_get(tp->ids, id) && (tp->or_port != ri->or_port)) {
+ digestmap_remove(tp->ids, id);
+ }
+ if (tp->or_port == ri->or_port) {
+ if (!(digestmap_get(tp->ids, id)))
+ digestmap_set(tp->ids, id, (void*)1);
+ return;
+ }
+ });
+
+ result = tor_malloc_zero(sizeof(or_port_hist_t));
+ result->or_port=ri->or_port;
+ result->success=0;
+ result->ids=digestmap_new();
+ digestmap_set(result->ids, id, (void*)1);
+ result->failure_ids=digestmap_new();
+ result->excluded=0;
+ smartlist_add(or_port_hists, result);
+}
+
+/** Create the port tracking cache. */
+/*XXX: need to call this when we rebuild/update our network status */
+static void
+or_port_hist_init(void)
+{
+ routerlist_t *rl = router_get_routerlist();
+
+ if (!or_port_hists)
+ or_port_hists=smartlist_create();
+
+ if (rl && rl->routers) {
+ SMARTLIST_FOREACH(rl->routers, routerinfo_t *, ri,
+ {
+ or_port_hist_new(ri);
+ });
+ }
+}
+
+#define NOT_BLOCKED 0
+#define FAILURES_OBSERVED 1
+#define POSSIBLY_BLOCKED 5
+#define PROBABLY_BLOCKED 10
+/** Return the list of blocked ports for our router's extra-info.*/
+char *
+or_port_hist_get_blocked_ports(void)
+{
+ char blocked_ports[2048];
+ char *bp;
+
+ tor_snprintf(blocked_ports,sizeof(blocked_ports),"blocked-ports");
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ if (digestmap_size(tp->failure_ids) >= PROBABLY_BLOCKED)
+ tor_snprintf(blocked_ports+strlen(blocked_ports),
+ sizeof(blocked_ports)," %u,",tp->or_port);
+ });
+ if (strlen(blocked_ports) == 13)
+ return NULL;
+ bp=tor_strdup(blocked_ports);
+ bp[strlen(bp)-1]='\n';
+ bp[strlen(bp)]='\0';
+ return bp;
+}
+
+/** Revert to client-only mode if we have seen too many failures on a port or
+ * range of ports.*/
+static void
+or_port_hist_report_block(unsigned int min_severity)
+{
+ or_options_t *options=get_options();
+ char failures_observed[2048],possibly_blocked[2048],probably_blocked[2048];
+ char port[1024];
+
+ memset(failures_observed,0,sizeof(failures_observed));
+ memset(possibly_blocked,0,sizeof(possibly_blocked));
+ memset(probably_blocked,0,sizeof(probably_blocked));
+
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ unsigned int failures = digestmap_size(tp->failure_ids);
+ if (failures >= min_severity) {
+ tor_snprintf(port, sizeof(port), " %u (%u failures %s out of %u on the"
+ " network)",tp->or_port,failures,
+ (!tp->success)?"and no successes": "since last success",
+ digestmap_size(tp->ids));
+ if (failures >= PROBABLY_BLOCKED) {
+ strlcat(probably_blocked, port, sizeof(probably_blocked));
+ } else if (failures >= POSSIBLY_BLOCKED)
+ strlcat(possibly_blocked, port, sizeof(possibly_blocked));
+ else if (failures >= FAILURES_OBSERVED)
+ strlcat(failures_observed, port, sizeof(failures_observed));
+ }
+ });
+
+ log_warn(LD_HIST,"%s%s%s%s%s%s%s%s",
+ server_mode(options) &&
+ ((min_severity==FAILURES_OBSERVED) || strlen(probably_blocked))?
+ "You should consider disabling your Tor server.":"",
+ (min_severity==FAILURES_OBSERVED)?
+ "Tor appears to be blocked from connecting to a range of ports "
+ "with the result that it cannot connect to one tenth of the Tor "
+ "network. ":"",
+ strlen(failures_observed)?
+ "Tor has observed failures on the following ports: ":"",
+ failures_observed,
+ strlen(possibly_blocked)?
+ "Tor is possibly blocked on the following ports: ":"",
+ possibly_blocked,
+ strlen(probably_blocked)?
+ "Tor is almost certainly blocked on the following ports: ":"",
+ probably_blocked);
+
+}
+
+/** Record the success of a connection to some OR's
+ * port <b>or_port</b>. */
+void
+or_port_hist_success(uint16_t or_port)
+{
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ if (tp->or_port != or_port)
+ continue;
+ /*Reset our failure stats so we can notice if this port ever gets
+ blocked again.*/
+ tp->success=1;
+ if (digestmap_size(tp->failure_ids)) {
+ digestmap_free(tp->failure_ids,NULL);
+ tp->failure_ids=digestmap_new();
+ }
+ if (still_searching) {
+ still_searching=0;
+ SMARTLIST_FOREACH(or_port_hists,or_port_hist_t *,t,t->excluded=0;);
+ }
+ return;
+ });
+}
+/** Record the failure of our connection to <b>digest</b>'s
+ * OR port. Warn, exclude the port from future entry guard selection, or
+ * add port to blocked-ports in our server's extra-info as appropriate. */
+void
+or_port_hist_failure(const char *digest, uint16_t or_port)
+{
+ int total_failures=0, ports_excluded=0, report_block=0;
+ int total_routers=smartlist_len(router_get_routerlist()->routers);
+
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ ports_excluded += tp->excluded;
+ total_failures+=digestmap_size(tp->failure_ids);
+ if (tp->or_port != or_port)
+ continue;
+ /* We're only interested in unique failures */
+ if (digestmap_get(tp->failure_ids, digest))
+ return;
+
+ total_failures++;
+ digestmap_set(tp->failure_ids, digest, (void*)1);
+ if (still_searching && !tp->success) {
+ tp->excluded=1;
+ ports_excluded++;
+ }
+ if ((digestmap_size(tp->ids) >= POSSIBLY_BLOCKED) &&
+ !(digestmap_size(tp->failure_ids) % POSSIBLY_BLOCKED))
+ report_block=POSSIBLY_BLOCKED;
+ });
+
+ if (total_failures >= (int)(total_routers/10))
+ or_port_hist_report_block(FAILURES_OBSERVED);
+ else if (report_block)
+ or_port_hist_report_block(report_block);
+
+ if (ports_excluded >= smartlist_len(or_port_hists)) {
+ log_warn(LD_HIST,"During entry node selection Tor tried every port "
+ "offered on the network on at least one server "
+ "and didn't manage a single "
+ "successful connection. This suggests you are behind an "
+ "extremely restrictive firewall. Tor will keep trying to find "
+ "a reachable entry node.");
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, tp->excluded=0;);
+ }
+}
+
+/** Add any ports marked as excluded in or_port_hist_t to <b>rt</b> */
+void
+or_port_hist_exclude(routerset_t *rt)
+{
+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
+ {
+ char portpolicy[9];
+ if (tp->excluded) {
+ tor_snprintf(portpolicy,sizeof(portpolicy),"*:%u", tp->or_port);
+ log_warn(LD_HIST,"Port %u may be blocked, excluding it temporarily "
+ "from entry guard selection.", tp->or_port);
+ routerset_parse(rt, portpolicy, "Ports");
+ }
+ });
+}
+
+/** Allow the exclusion of ports during our search for an entry node. */
+void
+or_port_hist_search_again(void)
+{
+ still_searching=1;
+}
Index: src/or/or.h
===================================================================
--- src/or/or.h (revision 17104)
+++ src/or/or.h (working copy)
@@ -3864,6 +3864,13 @@
int any_predicted_circuits(time_t now);
int rep_hist_circbuilding_dormant(time_t now);
+void or_port_hist_failure(const char *digest, uint16_t or_port);
+void or_port_hist_success(uint16_t or_port);
+void or_port_hist_new(const routerinfo_t *ri);
+void or_port_hist_exclude(routerset_t *rt);
+void or_port_hist_search_again(void);
+char *or_port_hist_get_blocked_ports(void);
+
/** Possible public/private key operations in Tor: used to keep track of where
* we're spending our time. */
typedef enum {
Index: src/or/routerparse.c
===================================================================
--- src/or/routerparse.c (revision 17104)
+++ src/or/routerparse.c (working copy)
@@ -1401,6 +1401,8 @@
goto err;
}
+ or_port_hist_new(router);
+
if (!router->platform) {
router->platform = tor_strdup("<unknown>");
}
Index: src/or/router.c
===================================================================
--- src/or/router.c (revision 17104)
+++ src/or/router.c (working copy)
@@ -1818,6 +1818,7 @@
char published[ISO_TIME_LEN+1];
char digest[DIGEST_LEN];
char *bandwidth_usage;
+ char *blocked_ports;
int result;
size_t len;
@@ -1825,7 +1826,6 @@
extrainfo->cache_info.identity_digest, DIGEST_LEN);
format_iso_time(published, extrainfo->cache_info.published_on);
bandwidth_usage = rep_hist_get_bandwidth_lines(1);
-
result = tor_snprintf(s, maxlen,
"extra-info %s %s\n"
"published %s\n%s",
@@ -1835,6 +1835,16 @@
if (result<0)
return -1;
+ blocked_ports = or_port_hist_get_blocked_ports();
+ if (blocked_ports) {
+ result = tor_snprintf(s+strlen(s), maxlen-strlen(s),
+ "%s",
+ blocked_ports);
+ tor_free(blocked_ports);
+ if (result<0)
+ return -1;
+ }
+
if (should_record_bridge_info(options)) {
static time_t last_purged_at = 0;
char *geoip_summary;
Index: src/or/circuitbuild.c
===================================================================
--- src/or/circuitbuild.c (revision 17104)
+++ src/or/circuitbuild.c (working copy)
@@ -62,6 +62,7 @@
static void entry_guards_changed(void);
static time_t start_of_month(time_t when);
+static int num_live_entry_guards(void);
/** Iterate over values of circ_id, starting from conn-\>next_circ_id,
* and with the high bit specified by conn-\>circ_id_type, until we get
@@ -1627,12 +1628,14 @@
smartlist_t *excluded;
or_options_t *options = get_options();
router_crn_flags_t flags = 0;
+ routerset_t *_ExcludeNodes;
if (state && options->UseEntryGuards &&
(purpose != CIRCUIT_PURPOSE_TESTING || options->BridgeRelay)) {
return choose_random_entry(state);
}
+ _ExcludeNodes = routerset_new();
excluded = smartlist_create();
if (state && (r = build_state_get_exit_router(state))) {
@@ -1670,12 +1673,18 @@
if (options->_AllowInvalid & ALLOW_INVALID_ENTRY)
flags |= CRN_ALLOW_INVALID;
+ if (options->ExcludeNodes)
+ routerset_union(_ExcludeNodes,options->ExcludeNodes);
+
+ or_port_hist_exclude(_ExcludeNodes);
+
choice = router_choose_random_node(
NULL,
excluded,
- options->ExcludeNodes,
+ _ExcludeNodes,
flags);
smartlist_free(excluded);
+ routerset_free(_ExcludeNodes);
return choice;
}
@@ -2727,6 +2736,7 @@
entry_guards_update_state(or_state_t *state)
{
config_line_t **next, *line;
+ unsigned int have_reachable_entry=0;
if (! entry_guards_dirty)
return;
@@ -2740,6 +2750,7 @@
char dbuf[HEX_DIGEST_LEN+1];
if (!e->made_contact)
continue; /* don't write this one to disk */
+ have_reachable_entry=1;
*next = line = tor_malloc_zero(sizeof(config_line_t));
line->key = tor_strdup("EntryGuard");
line->value = tor_malloc(HEX_DIGEST_LEN+MAX_NICKNAME_LEN+2);
@@ -2785,6 +2796,11 @@
if (!get_options()->AvoidDiskWrites)
or_state_mark_dirty(get_or_state(), 0);
entry_guards_dirty = 0;
+
+ /* XXX: Is this the place to decide that we no longer have any reachable
+ guards? */
+ if (!have_reachable_entry)
+ or_port_hist_search_again();
}
/** If <b>question</b> is the string "entry-guards", then dump


@@ -1,102 +0,0 @@
Filename: 157-specific-cert-download.txt
Title: Make certificate downloads specific
Author: Nick Mathewson
Created: 2-Dec-2008
Status: Accepted
Target: 0.2.1.x
History:
2008 Dec 2, 22:34
Changed name of cross certification field to match the other authority
certificate fields.
Status:
As of 0.2.1.9-alpha:
Cross-certification is implemented for new certificates, but not yet
required. Directories support the tor/keys/fp-sk urls.
Overview:
Tor's directory specification gives two ways to download a certificate:
by its identity fingerprint, or by the digest of its signing key. Both
are error-prone. We propose a new download mechanism to make sure that
clients get the certificates they want.
Motivation:
When a client wants a certificate to verify a consensus, it has two choices
currently:
- Download by identity key fingerprint. In this case, the client risks
getting a certificate for the same authority, but with a different
signing key than the one used to sign the consensus.
- Download by signing key fingerprint. In this case, the client risks
getting a forged certificate that contains the right signing key
signed with the wrong identity key. (Since caches are willing to
cache certs from authorities they do not themselves recognize, the
attacker wouldn't need to compromise an authority's key to do this.)
Current solution:
Clients fetch by identity keys, and re-fetch with backoff if they don't get
certs with the signing key they want.
Proposed solution:
Phase 1: Add a URL type for clients to download certs by identity _and_
signing key fingerprint. Unless both fields match, the client doesn't
accept the certificate(s). Clients begin using this method when their
randomly chosen directory cache supports it.
Phase 1A: Simultaneously, add a cross-certification element to
certificates.
Phase 2: Once many directory caches support phase 1, clients should prefer
to fetch certificates using that protocol when available.
Phase 2A: Once all authorities are generating cross-certified certificates
as in phase 1A, require cross-certification.
Specification additions:
The key certificate whose identity key fingerprint is <F> and whose signing
key fingerprint is <S> should be available at:
http://<hostname>/tor/keys/fp-sk/<F>-<S>.z
As usual, clients may request multiple certificates using:
http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z
Clients SHOULD use this format whenever they know both key fingerprints for
a desired certificate.
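As an illustrative sketch (not code from the Tor source), building the fp-sk fetch URL from a list of (identity fingerprint, signing key fingerprint) pairs might look like this; the hostname and fingerprint values are placeholders:

```python
def fp_sk_url(hostname, pairs):
    """Build a /tor/keys/fp-sk/ URL for (identity, signing-key)
    fingerprint pairs. Each cert is named <F>-<S>; multiple certs
    are joined with '+', and '.z' requests the compressed form."""
    spec = "+".join("%s-%s" % (f, s) for f, s in pairs)
    return "http://%s/tor/keys/fp-sk/%s.z" % (hostname, spec)

print(fp_sk_url("dirserver.example", [("F1", "S1"), ("F2", "S2")]))
```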
Certificates SHOULD contain the following field (at most once):
"dir-key-crosscert" NL CrossSignature NL
where CrossSignature is a signature, made using the certificate's signing
key, of the digest of the PKCS1-padded hash of the certificate's identity
key. For backward compatibility with broken versions of the parser, we
wrap the base64-encoded signature in -----BEGIN ID SIGNATURE----- and
-----END ID SIGNATURE----- tags. (See bug 880.) Implementations MUST allow
the "ID " portion to be omitted, however.
When encountering a certificate with a dir-key-crosscert entry,
implementations MUST verify that the signature is a correct signature of
the hash of the identity key using the signing key.
(In a future version of this specification, dir-key-crosscert entries will
be required.)
Why cross-certify too?
Cross-certification protects clients who haven't updated yet, by reducing
the number of caches that are willing to hold and serve bogus certificates.
References:
This is related to part 2 of bug 854.


@@ -1,198 +0,0 @@
Filename: 158-microdescriptors.txt
Title: Clients download consensus + microdescriptors
Author: Roger Dingledine
Created: 17-Jan-2009
Status: Open
0. History
15 May 2009: Substantially revised based on discussions on or-dev
from late January. Removed the notion of voting on how to choose
microdescriptors; made it just a function of the consensus method.
(This lets us avoid the possibility of "desynchronization.")
Added suggestion to use a new consensus flavor. Specified use of
SHA256 for new hashes. -nickm
15 June 2009: Cleaned up based on comments from Roger. -nickm
1. Overview
This proposal replaces section 3.2 of proposal 141, which was
called "Fetching descriptors on demand". Rather than modifying the
circuit-building protocol to fetch a server descriptor inline at each
circuit extend, we instead put all of the information that clients need
either into the consensus itself, or into a new set of data about each
relay called a microdescriptor.
Descriptor elements that are small and frequently changing should go
in the consensus itself, and descriptor elements that are small and
relatively static should go in the microdescriptor. If we ever end up
with descriptor elements that aren't small yet clients need to know
them, we'll need to resume considering some design like the one in
proposal 141.
Note also that any descriptor element which clients need to use to
decide which servers to fetch info about, or which servers to fetch
info from, needs to stay in the consensus.
2. Motivation
See
http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
http://archives.seul.org/or/dev/Nov-2008/msg00007.html
for a discussion of the options and why this is currently the best
approach.
3. Design
There are three pieces to the proposal. First, authorities will list in
their votes (and thus in the consensus) the expected hash of
the microdescriptor for each relay. Second, authorities will serve
microdescriptors, directory mirrors will cache and serve
them. Third, clients will ask for them and cache them.
3.1. Consensus changes
If the authorities choose a consensus method of a given version or
later, a microdescriptor format is implicit in that version.
A microdescriptor should in every case be a pure function of the
router descriptor and the consensus method.
In votes, we need to include the hash of each expected microdescriptor
in the routerstatus section. I suggest a new "m" line for each stanza,
with the base64 of the SHA256 hash of the router's microdescriptor.
For every consensus method that an authority supports, it includes a
separate "m" line in each router section of its vote, containing:
"m" SP methods 1*(SP AlgorithmName "=" digest) NL
where methods is a comma-separated list of the consensus methods
that the authority believes will produce "digest".
(As with base64 encoding of SHA1 hashes in consensuses, let's
omit the trailing =s)
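A hedged sketch of parsing such an "m" vote line (not Tor's parser; it assumes integer consensus-method numbers and one digest per algorithm):

```python
def parse_m_line(line):
    """Parse a vote line of the form
       "m" SP methods 1*(SP AlgorithmName "=" digest)
    where methods is a comma-separated list of consensus methods.
    Returns (set of method numbers, {algorithm: digest})."""
    parts = line.split()
    if not parts or parts[0] != "m":
        raise ValueError("not an m line: %r" % line)
    methods = set(int(x) for x in parts[1].split(","))
    digests = dict(p.split("=", 1) for p in parts[2:])
    return methods, digests
```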
The consensus microdescriptor-elements and "m" lines are then computed
as described in Section 3.1.2 below.
(This means we need a new consensus-method that knows
how to compute the microdescriptor-elements and add "m" lines.)
The microdescriptor consensus uses the directory-signature format from
proposal 162, with the "sha256" algorithm.
3.1.1. Descriptor elements to include for now
In the first version, the microdescriptor should contain the
onion-key element, and the family element from the router descriptor,
and the exit policy summary as currently specified in dir-spec.txt.
3.1.2. Computing consensus for microdescriptor-elements and "m" lines
When we are generating a consensus, we use whichever m line
unambiguously corresponds to the descriptor digest that will be
included in the consensus.
(If different votes have different microdescriptor digests for a
single <descriptor-digest, consensus-method> pair, then at least one
of the authorities is broken. If this happens, the consensus should
contain whichever microdescriptor digest is most common. If there is
no winner, we break ties in the favor of the lexically earliest.
Either way, we should log a warning: there is definitely a bug.)
The "m" lines in a consensus contain only the digest, not a list of
consensus methods.
3.1.3. A new flavor of consensus
Rather than inserting "m" lines in the current consensus format,
they should be included in a new consensus flavor (see proposal
162).
This flavor can safely omit descriptor digests.
When we implement this voting method, we can remove the exit policy
summary from the current "ns" flavor of consensus, since no current
clients use them, and they take up about 5% of the compressed
consensus.
This new consensus flavor should be signed with the sha256 signature
format as documented in proposal 162.
3.2. Directory mirrors fetch, cache, and serve microdescriptors
Directory mirrors should fetch, cache, and serve each microdescriptor
from the authorities. (They need to continue to serve normal relay
descriptors too, to handle old clients.)
The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
available at:
http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
(We use base64 for size and for consistency with the consensus
format. We use -s instead of +s to separate these items, since
the + character is used in base64 encoding.)
All the microdescriptors from the current consensus should also be
available at:
http://<hostname>/tor/micro/all.z
so a client that's bootstrapping doesn't need to send a 70KB URL just
to name every microdescriptor it's looking for.
Microdescriptors have no header or footer.
The hash of the microdescriptor is simply the hash of the concatenated
elements.
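The digest and URL rules above can be sketched as follows (illustrative, not from the Tor source; the hostname and digest values are placeholders):

```python
import base64
import hashlib

def md_digest_b64(md_body):
    """SHA256 of the raw concatenated microdescriptor elements,
    base64-encoded with trailing '=' padding stripped, matching
    the digests used in consensus 'm' lines."""
    d = base64.b64encode(hashlib.sha256(md_body).digest()).decode()
    return d.rstrip("=")

def md_fetch_url(hostname, digests):
    """Build the multi-digest fetch URL. '-' separates digests
    because '+' is itself a base64 character."""
    return "http://%s/tor/micro/d/%s.z" % (hostname, "-".join(digests))
```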
Directory mirrors should check to make sure that the microdescriptors
they're about to serve match the right hashes (either the hashes from
the fetch URL or the hashes from the consensus, respectively).
We will probably want to consider some sort of smart data structure to
be able to quickly convert microdescriptor hashes into the appropriate
microdescriptor. Clients will want this anyway when they load their
microdescriptor cache and want to match it up with the consensus to
see what's missing.
3.3. Clients fetch them and cache them
When a client gets a new consensus, it looks to see if there are any
microdescriptors it needs to learn. If it needs to learn more than
some threshold of the microdescriptors (half?), it requests 'all',
else it requests only the missing ones. Clients MAY try to
determine whether the upload bandwidth for listing the
microdescriptors they want is more or less than the download
bandwidth for the microdescriptors they do not want.
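The fetch decision above might be sketched like this (the 0.5 threshold mirrors the "half" suggestion in the text and is a tunable, not a spec value):

```python
def choose_md_request(missing, total, threshold=0.5):
    """Return "all" when more than `threshold` of the consensus's
    microdescriptors are missing, else the list of missing digests
    to request individually."""
    if total and len(missing) > threshold * total:
        return "all"
    return list(missing)
```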
Clients maintain a cache of microdescriptors along with metadata like
when it was last referenced by a consensus, and which identity key
it corresponds to. They keep a microdescriptor
until it hasn't been mentioned in any consensus for a week. Future
clients might cache them for longer or shorter times.
3.3.1. Information leaks from clients
If a client asks you for a set of microdescs, then you know she didn't
have them cached before. How much does that leak? What about when
we're all using our entry guards as directory guards, and we've seen
that user make a bunch of circuits already?
Fetching "all" when you need at least half is a good first order fix,
but might not be all there is to it.
Another future option would be to fetch some of the microdescriptors
anonymously (via a Tor circuit).
Another crazy option (Roger's phrasing) is to do decoy fetches as
well.
4. Transition and deployment
Phase one, the directory authorities should start voting on
microdescriptors, and putting them in the consensus.
Phase two, directory mirrors should learn how to serve them, and learn
how to read the consensus to find out what they should be serving.
Phase three, clients should start fetching and caching them instead
of normal descriptors.


@@ -1,142 +0,0 @@
Filename: 159-exit-scanning.txt
Title: Exit Scanning
Author: Mike Perry
Created: 13-Feb-2009
Status: Open
Overview:
This proposal describes the implementation and integration of an
automated exit node scanner for scanning the Tor network for malicious,
misconfigured, firewalled or filtered nodes.
Motivation:
Tor exit nodes can be run by anyone with an Internet connection. Often,
these users aren't fully aware of limitations of their networking
setup. Content filters, antivirus software, advertisements injected by
their service providers, malicious upstream providers, and the resource
limitations of their computer or networking equipment have all been
observed on the current Tor network.
It is also possible that some nodes exist purely for malicious
purposes. In the past, there have been intermittent instances of
nodes spoofing SSH keys, as well as nodes being used for purposes of
plaintext surveillance.
While it is not realistic to expect to catch extremely targeted or
completely passive malicious adversaries, the goal is to prevent
malicious adversaries from deploying dragnet attacks against large
segments of the Tor userbase.
Scanning methodology:
The first scans to be implemented are HTTP, HTML, Javascript, and
SSL scans.
The HTTP scan scrapes Google for common filetype urls such as exe, msi,
doc, dmg, etc. It then fetches these urls through Non-Tor and Tor, and
compares the SHA1 hashes of the resulting content.
The SSL scan downloads certificates for all IPs a domain will locally
resolve to and compares these certificates to those seen over Tor. The
scanner notes if a domain had rotated certificates locally in the
results for each scan.
The HTML scan checks HTML, Javascript, and plugin content for
modifications. Because of the dynamic nature of most of the web, the
scanner has a number of mechanisms built in to filter out false
positives that are used when a change is noticed between Tor and
Non-Tor.
All tests also share a URL-based false positive filter that
automatically removes results retroactively if the number of failures
exceeds a certain percentage of nodes tested with the URL.
Deployment Stages:
To avoid instances where bugs cause us to mark exit nodes as BadExit
improperly, it is proposed that we begin use of the scanner in stages.
1. Manual Review:
In the first stage, basic scans will be run by a small number of
people while we stabilize the scanner. The scanner has the ability
to resume crashed scans, and to rescan nodes that fail various
tests.
2. Human Review:
In the second stage, results will be automatically mailed to
an email list of interested parties for review. We will also begin
classifying failure types into three to four different severity
levels, based on both the reliability of the test and the nature of
the failure.
3. Automatic BadExit Marking:
In the final stage, the scanner will begin marking exits depending
on the failure severity level in one of three different ways: by
node idhex, by node IP, or by node IP mask. A potential fourth, less
severe category of results may still be delivered via email only for
review.
BadExit markings will be delivered in batches upon completion
of whole-network scans, so that the final false positive
filter has an opportunity to filter out URLs that exhibit
dynamic content beyond what we can filter.
Specification of Exit Marking:
Technically, BadExit could be marked via SETCONF AuthDirBadExit over
the control port, but this would allow full access to the directory
authority configuration and operation.
The approved-routers file could also be used, but currently it only
supports fingerprints, and it also contains other data unrelated to
exit scanning that would be difficult to coordinate.
Instead, we propose a new badexit-routers file that has two
keywords:
BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt]
BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt]
BadExitNet lines would follow the codepaths used by AuthDirBadExit to
set authdir_badexit_policy, and BadExitFP would follow the codepaths
from approved-router's !badexit lines.
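A sketch of splitting such a badexit-routers file into its two keyword lists (the keyword names follow the proposal; the one-entry-per-line layout is an assumption):

```python
def parse_badexit_file(text):
    """Return ([exit patterns], [hex fingerprints]) from a
    hypothetical badexit-routers file, ignoring blank lines and
    unknown keywords."""
    nets, fps = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "BadExitNet":
            nets.extend(parts[1:])
        elif parts[0] == "BadExitFP":
            fps.extend(parts[1:])
    return nets, fps
```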
The scanner would have exclusive ability to write, append, rewrite,
and modify this file. Prior to building a new consensus vote, a
participating Tor authority would read in a fresh copy.
Security Implications:
Aside from evading the scanner's detection, there are two additional
high-level security considerations:
1. Ensure nodes cannot be marked BadExit by an adversary at will
It is possible individual website owners will be able to target certain
Tor nodes, but once they begin to attempt to fail more than the URL
filter percentage of the exits, their sites will be automatically
discarded.
Failing specific nodes is possible, but scanned results are fully
reproducible, and BadExits should be rare enough that humans are never
fully removed from the loop.
State (cookies, cache, etc) does not otherwise persist in the scanner
between exit nodes to enable one exit node to bias the results of a
later one.
2. Ensure that scanner compromise does not yield authority compromise
Having a separate file that is under the exclusive control of the
scanner allows us to heavily isolate the scanner from the Tor
authority, potentially even running them on separate machines.


@@ -1,105 +0,0 @@
Filename: 160-bandwidth-offset.txt
Title: Authorities vote for bandwidth offsets in consensus
Author: Roger Dingledine
Created: 4-May-2009
Status: Finished
Target: 0.2.2.x
1. Motivation
As part of proposal 141, we moved the bandwidth value for each relay
into the consensus. Now clients can know how they should load balance
even before they've fetched the corresponding relay descriptors.
Putting the bandwidth in the consensus also lets the directory
authorities choose more accurate numbers to advertise, if we come up
with a better algorithm for deciding weightings.
Our original plan was to teach directory authorities how to measure
bandwidth themselves; then every authority would vote for the bandwidth
it prefers, and we'd take the median of votes as usual.
The problem comes when we have 7 authorities, and only a few of them
have smarter bandwidth allocation algorithms. So long as the majority
of them are voting for the number in the relay descriptor, the minority
that have better numbers will be ignored.
2. Options
One fix would be to demand that every authority also run the
new bandwidth measurement algorithms: in that case, part of the
responsibility of being an authority operator is that you need to run
this code too. But in practice we can't really require all current
authority operators to do that; and if we want to expand the set of
authority operators even further, it will become even more impractical.
Also, bandwidth testing adds load to the network, so we don't really
want to require that the number of concurrent bandwidth tests match
the number of authorities we have.
The better fix is to allow certain authorities to specify that they are
voting on bandwidth measurements: more accurate bandwidth values that
have actually been evaluated. In this way, authorities can vote on
the median measured value if sufficient measured votes exist for a router,
and otherwise fall back to the median value taken from the published router
descriptors.
3. Security implications
If only some authorities choose to vote on an offset, then a majority of
those voting authorities can arbitrarily change the bandwidth weighting
for the relay. At the extreme, if there's only one offset-voting
authority, then that authority can dictate which relays clients will
find attractive.
This problem isn't entirely new: we already have the worry wrt
the subset of authorities that vote for BadExit.
To make it not so bad, we should deploy at least three offset-voting
authorities.
Also, authorities that know how to vote for offsets should vote for
an offset of zero for new nodes, rather than choosing not to vote on
any offset in those cases.
4. Design
First, we need a new consensus method to support this new calculation.
Now v3 votes can have an additional value on the "w" line:
"w Bandwidth=X Measured=" INT.
Once we're using the new consensus method, the new way to compute the
Bandwidth weight is by checking if there are at least 3 "Measured"
votes. If so, the median of these is taken. Otherwise, the median
of the "Bandwidth=" values are taken, as described in Proposal 141.
Then the actual consensus looks just the same as it did before,
so clients never have to know that this additional calculation is
happening.
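The weight computation above can be sketched as follows (illustrative; the proposal says only "median", so the choice of low median here is an assumption, made because Tor's voting code traditionally uses a low median for ties):

```python
import statistics

def consensus_bandwidth(bw_votes, measured_votes, min_measured=3):
    """Median of the Measured= votes when at least 3 exist,
    otherwise the median of the descriptor Bandwidth= votes."""
    if len(measured_votes) >= min_measured:
        return statistics.median_low(measured_votes)
    return statistics.median_low(bw_votes)
```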
5. Implementation
The Measured values will be read from a file provided by the scanners
described in proposal 161. Files with a timestamp older than 3 days
will be ignored.
The file will be read in from dirserv_generate_networkstatus_vote_obj()
in a location specified by a new config option "V3MeasuredBandwidths".
A helper function will be called to populate new 'measured' and
'has_measured' fields of the routerstatus_t 'routerstatuses' list with
values read from this file.
An additional for_vote flag will be passed to
routerstatus_format_entry() from format_networkstatus_vote(), which will
indicate that the "Measured=" string should be appended to the "w Bandwidth="
line with the measured value in the struct.
routerstatus_parse_entry_from_string() will be modified to parse the
"Measured=" lines into routerstatus_t struct fields.
Finally, networkstatus_compute_consensus() will set rs_out.bandwidth
to the median of the measured values if there are more than 3, otherwise
it will use the bandwidth value median as normal.


@@ -1,174 +0,0 @@
Title: Computing Bandwidth Adjustments
Filename: 161-computing-bandwidth-adjustments.txt
Author: Mike Perry
Created: 12-May-2009
Target: 0.2.2.x
Status: Finished
1. Motivation
There is high variance in the performance of the Tor network. Despite
our efforts to balance load evenly across the Tor nodes, some nodes are
significantly slower and more overloaded than others.
Proposal 160 describes how we can augment the directory authorities to
vote on measured bandwidths for routers. This proposal describes what
goes into the measuring process.
2. Measurement Selection
The general idea is to determine a load factor representing the ratio
of the capacity of measured nodes to the rest of the network. This load
factor could be computed from three potentially relevant statistics:
circuit failure rates, circuit extend times, or stream capacity.
Circuit failure rates and circuit extend times appear to be
non-linearly proportional to node load. We've observed that the same
nodes when scanned at US nighttime hours (when load is presumably
lower) exhibit almost no circuit failure, and significantly faster
extend times than when scanned during the day.
Stream capacity, however, is much more uniform, even during US
nighttime hours. Moreover, it is a more intuitive representation of
node capacity, and also less dependent upon distance and latency
if amortized over large stream fetches.
3. Average Stream Bandwidth Calculation
The average stream bandwidths are obtained by dividing the network into
slices of 50 nodes each, grouped according to advertised node bandwidth.
Two hop circuits are built using nodes from the same slice, and a large
file is downloaded via these circuits. The file sizes are set based
on node percentile rank as follows:
0-10: 2M
10-20: 1M
20-30: 512k
30-50: 256k
50-100: 128k
These sizes are based on measurements performed during test scans.
This process is repeated until each node has been chosen to participate
in at least 5 circuits.
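The percentile-to-size table above might be encoded as follows (a sketch; the exact boundary handling at the rank cutoffs is an assumption):

```python
def scan_file_size(percentile):
    """Map a node's consensus-bandwidth percentile rank
    (0 = fastest) to the test download size in bytes,
    per the table above."""
    table = [(10, 2048), (20, 1024), (30, 512), (50, 256), (100, 128)]
    for upper, kbytes in table:
        if percentile < upper:
            return kbytes * 1024
    return 128 * 1024  # ranks at or beyond 100 fall through
```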
4. Ratio Calculation
The ratios are calculated by dividing each measured value by the
network-wide average.
5. Ratio Filtering
After the base ratios are calculated, a second pass is performed
to remove any streams with nodes of ratios less than X=0.5 from
the results of other nodes. In addition, all outlying streams
with capacity of one standard deviation below a node's average
are also removed.
The final ratio result will be the greater of the unfiltered ratio
and the filtered ratio.
6. Pseudocode for Ratio Calculation Algorithm
Here is the complete pseudocode for the ratio algorithm:
Slices = {S | S is 50 nodes of similar consensus capacity}
for S in Slices:
while exists node N in S with circ_chosen(N) < 7:
fetch_slice_file(build_2hop_circuit(N, (exit in S)))
for N in S:
BW_measured(N) = MEAN(b | b is bandwidth of a stream through N)
Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N)
Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S)
for N in S:
Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)}
BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N))
Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices)
Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices)
for N in all Slices:
Bw_net_ratio(N) = Bw_measured(N)/Bw_net_avg(Slices)
Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices)
ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
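A compressed, runnable sketch of the core of this algorithm over one set of nodes (not the scanner's actual code: the per-slice grouping and the X=0.5 peer-filtering pass are omitted, and the filtered mean uses the one-standard-deviation cutoff from section 5):

```python
from statistics import mean, pstdev

def ratio_results(streams):
    """streams: {node_id: [per-stream bandwidths]}.
    Returns {node_id: ResultRatio(N)} = max of the plain and
    filtered network ratios."""
    bw = {n: mean(s) for n, s in streams.items()}
    norm = {}
    for n, s in streams.items():
        # Drop streams more than one stddev below this node's average.
        cutoff = bw[n] - pstdev(s)
        norm[n] = mean([b for b in s if b >= cutoff])
    net_avg = mean(bw.values())
    norm_avg = mean(norm.values())
    return {n: max(bw[n] / net_avg, norm[n] / norm_avg)
            for n in streams}
```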
7. Security implications
The ratio filtering will deal with cases of sabotage by dropping
both very slow outliers in stream average calculations, as well
as dropping streams that used very slow nodes from the calculation
of other nodes.
This scheme will not address nodes that try to game the system by
providing better service to scanners. The scanners can be detected
at the entry by IP address, and at the exit by the destination fetch
IP.
Measures can be taken to obfuscate and separate the scanners' source
IP address from the directory authority IP address. For instance,
scans can happen offsite and the results can be rsynced into the
authorities. The destination server IP can also change.
Neither of these methods is foolproof, but such nodes can already
lie about their bandwidth to attract more traffic, so this solution
does not set us back any in that regard.
8. Parallelization
Because each slice takes as long as 6 hours to complete, we will want
to parallelize as much as possible. This will be done by concurrently
running multiple scanners from each authority to deal with different
segments of the network. Each scanner piece will continually loop
over a portion of the network, outputting files of the form:
node_id=<idhex> SP strm_bw=<BW_measured(N)> SP
filt_bw=<BW_Norm_measured(N)> ns_bw=<CurrentConsensusBw(N)> NL
The most recent file from each scanner will be periodically gathered
by another script that uses them to produce network-wide averages
and calculate ratios as per the algorithm in section 6. Because nodes
may shift in capacity, they may appear in more than one slice and/or
appear more than once in the file set. The most recently measured
line will be chosen in this case.
9. Integration with Proposal 160
The final results will be produced for the voting mechanism
described in Proposal 160 by multiplying the derived ratio by
the average published consensus bandwidth during the course of the
scan, and taking the weighted average with the previous consensus
bandwidth:
Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))
The Alpha parameter is a smoothing parameter intended to prevent
rapid oscillation between loaded and unloaded conditions. It is
currently fixed at 0.333.
The Round() step consists of rounding to the 3 most significant figures
in base10, and then rounding that result to the nearest 1000, with
a minimum value of 1000.
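The smoothing and rounding steps above might be implemented as follows (a sketch; Python's round() uses banker's rounding, which the proposal does not specify, so exact half-way cases are an assumption):

```python
def round_bw(x):
    """Round to 3 significant base-10 figures, then to the
    nearest 1000, with a floor of 1000."""
    if x <= 0:
        return 1000
    exp = len(str(int(x))) - 3
    sig = round(x / 10 ** exp) * 10 ** exp if exp > 0 else round(x)
    return max(1000, int(round(sig / 1000.0)) * 1000)

def bw_new(bw_current, bw_scan_avg, bw_ratio, alpha=0.333):
    """Bw_new = Round((Bw_current*Alpha + Bw_scan_avg*Bw_ratio)
                      / (Alpha + 1))"""
    return round_bw((bw_current * alpha + bw_scan_avg * bw_ratio)
                    / (alpha + 1))
```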
This will produce a new bandwidth value that will be output into a
file consisting of lines of the form:
node_id=<idhex> SP bw=<Bw_new> NL
The first line of the file will contain a timestamp in UNIX time()
seconds. This will be used by the authority to decide if the
measured values are too old to use.
This file can be either copied or rsynced into a directory readable
by the directory authority.


@@ -1,188 +0,0 @@
Filename: 162-consensus-flavors.txt
Title: Publish the consensus in multiple flavors
Author: Nick Mathewson
Created: 14-May-2009
Target: 0.2.2
Status: Open
Overview:
This proposal describes a way to publish each consensus in
multiple simultaneous formats, or "flavors". This will reduce the
amount of time needed to deploy new consensus-like documents, and
reduce the size of consensus documents in the long term.
Motivation:
In the future, we will almost surely want different fields and
data in the network-status document. Examples include:
- Publishing hashes of microdescriptors instead of hashes of
full descriptors (Proposal 158).
- Including different digests of descriptors, instead of the
perhaps-soon-to-be-totally-broken SHA1.
Note that in both cases, from the client's point of view, this
information _replaces_ older information. If we're using a
SHA256 hash, we don't need to see the SHA1. If clients only want
microdescriptors, they don't (necessarily) need to see hashes of
other things.
Our past approach to cases like this has been to shovel all of
the data into the consensus document. But this is rather poor
for bandwidth. Adding a single SHA256 hash to a consensus for
each router increases the compressed consensus size by 47%. In
comparison, replacing a single SHA1 hash with a SHA256 hash for
each listed router increases the consensus size by only 18%.
Design in brief:
Let the voting process remain as it is, until a consensus is
generated. With future versions of the voting algorithm, instead
of just a single consensus being generated, multiple consensus
"flavors" are produced.
Consensuses (all of them) include a list of which flavors are
being generated. Caches fetch and serve all flavors of consensus
that are listed, regardless of whether they can parse or validate
them, and serve them to clients. Thus, once this design is in
place, we won't need to deploy more cache changes in order to get
new flavors of consensus to be cached.
Clients download only the consensus flavor they want.
A note on hashes:
Everything in this document is specified to use SHA256, and to be
upgradeable to use better hashes in the future.
Spec modifications:
1. URLs and changes to the current consensus format.
Every consensus flavor has a name consisting of a sequence of one
or more alphanumeric characters and dashes. For compatibility,
the current descriptor flavor is called "ns".
The supported consensus flavors are defined as part of the
authorities' consensus method.
For each supported flavor, every authority calculates another
consensus document of as-yet-unspecified format, and exchanges
detached signatures for these documents as in the current consensus
design.
In addition to the consensus currently served at
/tor/status-vote/(current|next)/consensus.z and
/tor/status-vote/(current|next)/consensus/<FP1>+<FP2>+<FP3>+....z ,
authorities serve another consensus of each flavor "F" from the
locations /tor/status-vote/(current|next)/consensus-F.z and
/tor/status-vote/(current|next)/consensus-F/<FP1>+....z.
When caches serve these documents, they do so from the same
locations.
2. Document format: generic consensus.
The format of a flavored consensus is as-yet-unspecified, except
that the first line is:
"network-status-version" SP version SP flavor NL
where version is 3 or higher, and the flavor is a string
consisting of alphanumeric characters and dashes, matching the
corresponding flavor listed in the unflavored consensus.
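A small sketch of checking that first line, with the version and flavor constraints stated above (the function name is hypothetical):

```python
import re

def parse_flavor_header(first_line):
    # "network-status-version" SP version SP flavor NL, where the
    # flavor is alphanumeric characters and dashes.
    m = re.match(r'^network-status-version (\d+) ([A-Za-z0-9-]+)$',
                 first_line.rstrip("\n"))
    if not m:
        raise ValueError("not a flavored consensus header")
    version = int(m.group(1))
    if version < 3:
        raise ValueError("version must be 3 or higher")
    return version, m.group(2)
```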
3. Document format: detached signatures.
We amend the detached signature format to include more than one
consensus-digest line, and more than one set of signatures.
After the consensus-digest line, we allow more lines of the form:
"additional-digest" SP flavor SP algname SP digest NL
Before the directory-signature lines, we allow more entries of the form:
"additional-signature" SP flavor SP algname SP identity SP
signing-key-digest NL signature.
[We do not use "consensus-digest" or "directory-signature" for flavored
consensuses, since this could confuse older Tors.]
The consensus-signatures URL should contain the signatures
for _all_ flavors of consensus.
4. The consensus index:
Authorities additionally generate and serve a consensus-index
document. Its format is:
Header ValidAfter ValidUntil Documents Signatures
Header = "consensus-index" SP version NL
ValidAfter = as in a consensus
ValidUntil = as in a consensus
Documents = Document*
Document = "document" SP flavor SP SignedLength
1*(SP AlgorithmName "=" Digest) NL
Signatures = Signature*
Signature = "directory-signature" SP algname SP identity
SP signing-key-digest NL signature
There must be one Document line for each generated consensus flavor.
Each Document line describes the length of the signed portion of
a consensus (the signatures themselves are not included), along
with one or more digests of that signed portion. Digests are
given in hex. The algorithm "sha256" MUST be included; others
are allowed.
The algname part of a signature describes what algorithm was
used to hash the identity and signing keys, and to compute the
signature. The algorithm "sha256" MUST be recognized;
signatures with unrecognized algorithms MUST be ignored.
(See below).
The consensus index is made available at
/tor/status-vote/(current|next)/consensus-index.z.
Caches should fetch this document so they can check the
correctness of the different consensus documents they fetch.
They do not need to check anything about an unrecognized
consensus document beyond its digest and length.
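The digest-and-length check a cache performs for an unrecognized flavor could be sketched like this (assuming, for illustration, that the fetched bytes begin with the signed portion; the helper is not tor's implementation):

```python
import hashlib

def check_document(index_line, doc_bytes):
    # index_line: 'document' SP flavor SP SignedLength
    #             1*(SP AlgorithmName "=" Digest)
    parts = index_line.split()
    assert parts[0] == "document"
    signed_len = int(parts[2])
    digests = dict(p.split("=", 1) for p in parts[3:])
    signed = doc_bytes[:signed_len]
    if len(signed) != signed_len:
        return False  # truncated download
    # "sha256" MUST be included; compare against the hex digest.
    return hashlib.sha256(signed).hexdigest() == digests["sha256"].lower()
```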
4.1. The "sha256" signature format.
The 'SHA256' signature format for directory objects is defined as
the RSA signature of the OAEP+-padded SHA256 digest of the item to
be signed. When checking signatures, the signature MUST be treated
as valid if the signature material begins with SHA256(document);
this allows us to add other data later.
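The prefix rule can be expressed directly. The sketch below assumes the RSA/OAEP+ unpadding has already been done and we hold the recovered signature material as bytes:

```python
import hashlib

def signature_material_ok(material, document):
    # Valid if the material *begins with* SHA256(document); trailing
    # bytes are permitted so other data can be added later.
    return material.startswith(hashlib.sha256(document).digest())
```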
Considerations:
- We should not create a new flavor of consensus when adding a
field instead wouldn't be too onerous.
- We should not proliferate flavors lightly: clients will be
distinguishable based on which flavor they download.
Migration:
- Stage one: authorities begin generating and serving
consensus-index files.
- Stage two: Caches begin downloading consensus-index files,
validating them, and using them to decide what flavors of
consensus documents to cache. They download all listed
documents, and compare them to the digests given in the
consensus.
- Stage three: Once we want to make a significant change to the
consensus format, we deploy another flavor of consensus at the
authorities. This will immediately start getting cached by the
caches, and clients can start fetching the new flavor without
waiting a version or two for enough caches to begin supporting
it.
Acknowledgements:
Aspects of this design and its applications to hash migration were
heavily influenced by IRC conversations with Marian.

Filename: 163-detecting-clients.txt
Title: Detecting whether a connection comes from a client
Author: Nick Mathewson
Created: 22-May-2009
Target: 0.2.2
Status: Open
Overview:
Some aspects of Tor's design require relays to distinguish
connections from clients from connections that come from relays.
The existing means for doing this is easy to spoof. We propose
a better approach.
Motivation:
There are at least two reasons for which Tor servers want to tell
which connections come from clients and which come from other
servers:
1) Some exits, proposal 152 notwithstanding, want to disallow
their use as single-hop proxies.
2) Some performance-related proposals involve prioritizing
traffic from relays, or limiting traffic per client (but not
per relay).
Right now, we detect client vs server status based on how the
client opens circuits. (Check out the code that implements the
AllowSingleHopExits option if you want all the details.) This
method is depressingly easy to fake, though. This document
proposes better means.
Goals:
To make grabbing relay privileges at least as difficult as just
running a relay.
In the analysis below, "using server privileges" means taking any
action that only servers are supposed to do, like delivering a
BEGIN cell to an exit node that doesn't allow single hop exits,
or claiming server-like amounts of bandwidth.
Passive detection:
A connection is definitely a client connection if it takes one of
the TLS methods during setup that does not establish an identity
key.
A circuit is definitely a client circuit if it is initiated with
a CREATE_FAST cell, though the node could be a client or a server.
A node that's listed in a recent consensus is probably a server.
A node to which we have successfully extended circuits from
multiple origins is probably a server.
Active detection:
If a node doesn't try to use server privileges at all, we never
need to care whether it's a server.
When a node or circuit tries to use server privileges, if it is
"definitely a client" as per above, we can refuse it immediately.
If it's "probably a server" as per above, we can accept it.
Otherwise, we have either a client, or a server that is neither
listed in any consensus or used by any other clients -- in other
words, a new or private server.
For these servers, we should attempt to build one or more test
circuits through them. If enough of the circuits succeed, the
node is a real relay. If not, it is probably a client.
While we are waiting for the test circuits to succeed, we should
allow a short grace period in which server privileges are
permitted. When a test is done, we should remember its outcome
for a while, so we don't need to do it again.
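The active-detection rules above amount to a small decision procedure. In this sketch the three inputs are assumed to come from the passive checks and test circuits just described; "grace" stands for the short grace period while tests are pending:

```python
def privilege_decision(definitely_client, probably_server, test_passed):
    # test_passed is None while test circuits are still running.
    if definitely_client:
        return "refuse"
    if probably_server:
        return "accept"
    if test_passed is None:
        return "grace"    # allow server privileges briefly during testing
    return "accept" if test_passed else "refuse"
```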
Why it's hard to do good testing:
Doing a test circuit starting with an unlisted router requires
only that we have an open connection for it. Doing a test
circuit starting elsewhere _through_ an unlisted router--though
more reliable-- would require that we have a known address, port,
identity key, and onion key for the router. Only the address and
identity key are easily available via the current Tor protocol in
all cases.
We could fix this part by requiring that all servers support
BEGIN_DIR and support downloading at least a current descriptor
for themselves.
Open questions:
What are the thresholds for the needed numbers of circuits
for us to decide that a node is a relay?
[Suggested answer: two circuits from two distinct hosts.]
How do we pick grace periods? How long do we remember the
outcome of a test?
[Suggested answer: 10 minute grace period; 48 hour memory of
test outcomes.]
If we can build circuits starting at a suspect node, but we don't
have enough information to try extending circuits elsewhere
through the node, should we conclude that the node is
"server-like" or not?
[Suggested answer: for now, just try making circuits through
the node. Extend this to extending circuits as needed.]

Filename: 164-reporting-server-status.txt
Title: Reporting the status of server votes
Author: Nick Mathewson
Created: 22-May-2009
Target: 0.2.2
Status: Open
Overview:
When a given node isn't listed in the directory, it isn't always easy
to tell why. This proposal suggest a quick-and-dirty way for
authorities to export not only how they voted, but why, and a way to
collate the information.
Motivation:
Right now, if you want to know the reason why your server was listed
a certain way in the Tor directory, the following steps are
recommended:
- Look through your log for reports of what the authority said
when you tried to upload.
- Look at the consensus; see if you're listed.
- Wait a while, see if things get better.
- Download the votes from all the authorities, and see how they
voted. Try to figure out why.
- If you think they'll listen to you, ask some authority
operators to look you up in their mtbf files and logs to see
why they voted as they did.
This is far too hard.
Solution:
We should add a new vote-like information-only document that
authorities serve on request. Call it a "vote info". It is
generated at the same time as a vote, but used only for
determining why a server voted as it did. It is served from
/tor/status-vote-info/current/authority[.z]
It differs from a vote in that:
* Its vote-status field is 'vote-info'.
* It includes routers that the authority would not include
in its vote.
For these, it includes an "omitted" line with an English
message explaining why they were omitted.
* For each router, it includes a line describing its WFU and
MTBF. The format is:
"stability <mtbf> up-since='date'"
"uptime <wfu> down-since='date'"
* It describes the WFU and MTBF thresholds it requires to
vote for a given router in various roles in the header.
The format is:
"flag-requirement <flag-name> <field> <op> <value>"
e.g.
"flag-requirement Guard uptime > 80"
* It includes info on routers all of whose descriptors that
were uploaded but rejected over the past few hours. The
"r" lines for these are the same as for regular routers.
The other lines are omitted for these routers, and are
replaced with a single "rejected" line, explaining (in
English) why the router was rejected.
A status site (like Torweather or Torstatus or another
tool) can poll these files when they are generated, collate
the data, and make it available to server operators.
Risks:
This document makes no provisions for caching these "vote
info" documents. If many people wind up fetching them
aggressively from the authorities, that would be bad.

Filename: 165-simple-robust-voting.txt
Title: Easy migration for voting authority sets
Author: Nick Mathewson
Created: 2009-05-28
Status: Open
Overview:
This proposal describes an easy-to-implement, easy-to-verify way to
change the set of authorities without creating a "flag day" situation.
Motivation:
From proposal 134 ("More robust consensus voting with diverse
authority sets") by Peter Palfrader:
Right now there are about five authoritative directory servers
in the Tor network, though this number is expected to rise to about
15 eventually.
Adding a new authority requires synchronized action from all
operators of directory authorities so that at any time during the
update at least half of all authorities are running and agree on
who is an authority. The latter requirement is there so that the
authorities can arrive at a common consensus: Each authority
builds the consensus based on the votes from all authorities it
recognizes, and so a different set of recognized authorities will
lead to a different consensus document.
In response to this problem, proposal 134 suggested that every
candidate authority list in its vote whom it believes to be an
authority. These A-says-B-is-an-authority relationships form a
directed graph. Each authority then iteratively finds the largest
clique in the graph and removes it, until it finds one containing
itself. It then votes with that clique.
Proposal 134 had some problems:
- It had a security problem in that M hostile authorities in a
clique could effectively kick out M-1 honest authorities. This
could enable a minority of the original authorities to take over.
- It was too complex in its implications to analyze well: it took us
over a year to realize that it was insecure.
- It tried to solve a bigger problem: general fragmentation of
authority trust. Really, all we wanted to have was the ability to
add and remove authorities without forcing a flag day.
Proposed protocol design:
A "Voting Set" is a set of authorities. Each authority has a list of
the voting sets it considers acceptable. These sets are chosen
manually by the authority operators. They must always contain the
authority itself. Each authority lists all of these voting sets in
its votes.
Authorities exchange votes with every other authority in any of their
voting sets.
When it is time to calculate a consensus, an authority votes with
whichever voting set it lists that is listed by the most members of
that set. In other words, given two sets S1 and S2 that an authority
lists, that authority will prefer to vote with S1 over S2 whenever
the number of other authorities in S1 that themselves list S1 is
higher than the number of other authorities in S2 that themselves
list S2.
For example, suppose authority A recognizes two sets, "A B C D" and
"A E F G H". Suppose that the first set is recognized by all of A,
B, C, and D, whereas the second set is recognized only by A, E, and
F. Because the first set is recognized by more of the authorities in
it than the other one, A will vote with the first set.
Ties are broken in favor of some arbitrary function of the identity
keys of the authorities in the set.
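A minimal sketch of this selection rule (not tor's implementation; sets are modeled as frozensets of identity keys, and the tie-breaker here is simply the sorted member keys):

```python
def choose_voting_set(my_sets, sets_by_authority, my_id):
    # Score each candidate set by how many of its *other* members also
    # list it; break ties deterministically on the sorted identities.
    def score(s):
        listed = sum(1 for a in s
                     if a != my_id and s in sets_by_authority.get(a, []))
        return (listed, tuple(sorted(s)))
    return max(my_sets, key=score)
```

Running it on the example above (sets "A B C D" and "A E F G H") picks the first set, since three other members list it versus two.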
How to migrate authority sets:
In steady state, each authority operator should list only the current
actual voting set as accepted.
When we want to add an authority, each authority operator configures
his or her server to list two voting sets: one containing all the old
authorities, and one containing the old authorities and the new
authority too. Once all authorities are listing the new set of
authorities, they will start voting with that set because of its
size.
What if one or two authority operators are slow to list the new set?
Then the other operators can stop listing the old set once there are
enough authorities listing the new set to make its voting successful.
(Note that these authorities not listing the new set will still have
their votes counted, since they themselves will be members of the new
set. They will only fail to sign the consensus generated by the
other authorities who are using the new set.)
When we want to remove an authority, the operators list two voting
sets: one containing all the authorities, and one omitting the
authority we want to remove. Once enough authorities list the new
set as acceptable, we start having authority operators stop listing
the old set. Once there are more listing the new set than the old
set, the new set will win.
Data format changes:
Add a new 'voting-set' line to the vote document format. Allow it to
occur any number of times. Its format is:
voting-set SP 'fingerprint' SP 'fingerprint' ... NL
where each fingerprint is the hex fingerprint of an identity key of
an authority. Sort fingerprints in ascending order.
When the consensus method is at least 'X' (decide this when we
implement the proposal), add this line to the consensus format as
well, before the first dir-source line. [This information is not
redundant with the dir-source sections in the consensus: If an
authority is recognized but didn't vote, that authority will appear in
the voting-set line but not in the dir-source sections.]
We don't need to list other information about authorities in our
vote.
Migration issues:
We should keep track somewhere of which Tor client versions
recognized which authorities.
Acknowledgments:
The design came out of an IRC conversation with Peter Palfrader. He
had the basic idea first.

Filename: 166-statistics-extra-info-docs.txt
Title: Including Network Statistics in Extra-Info Documents
Author: Karsten Loesing
Created: 21-Jul-2009
Target: 0.2.2
Status: Accepted
Change history:
21-Jul-2009 Initial proposal for or-dev
Overview:
The Tor network has grown to almost two thousand relays and millions
of casual users over the past few years. With growth has come
increasing performance problems and attempts by some countries to
block access to the Tor network. In order to address these problems,
we need to learn more about the Tor network. This proposal suggests to
measure additional statistics and include them in extra-info documents
to help us understand the Tor network better.
Introduction:
As of May 2009, relays, bridges, and directories gather the following
data for statistical purposes:
- Relays and bridges count the number of bytes that they have pushed
in 15-minute intervals over the past 24 hours. Relays and bridges
include these data in extra-info documents that they send to the
directory authorities whenever they publish their server descriptor.
- Bridges further include a rough number of clients per country that
they have seen in the past 48 hours in their extra-info documents.
- Directories can be configured to count the number of clients they
see per country in the past 24 hours and to write them to a local
file.
Since then we extended the network statistics in Tor. These statistics
include:
- Directories now gather more precise statistics about connecting
clients. Fixes include measuring in intervals of exactly 24 hours,
counting unsuccessful requests, measuring download times, etc. The
directories append their statistics to a local file every 24 hours.
- Entry guards count the number of clients per country per day like
bridges do and write them to a local file every 24 hours.
- Relays measure statistics of the number of cells in their circuit
queues and how much time these cells spend waiting there. Relays
write these statistics to a local file every 24 hours.
- Exit nodes count the number of read and written bytes on exit
connections per port as well as the number of opened exit streams
per port in 24-hour intervals. Exit nodes write their statistics to
a local file.
The following four sections contain descriptions for adding these
statistics to the relays' extra-info documents.
Directory request statistics:
The first type of statistics aims at measuring directory requests sent
by clients to a directory mirror or directory authority. More
precisely, these statistics aim at requests for v2 and v3 network
statuses only. These directory requests are sent non-anonymously,
either via HTTP-like requests to a directory's Dir port or tunneled
over a 1-hop circuit.
Measuring directory request statistics is useful for several reasons:
First, the number of locally seen directory requests can be used to
estimate the total number of clients in the Tor network. Second, the
country-wise classification of requests using a GeoIP database can
help counting the relative and absolute number of users per country.
Third, the download times can give hints on the available bandwidth
capacity at clients.
Directory requests do not give any hints on the contents that clients
send or receive over the Tor network. Every client requests network
statuses from the directories, so that there are no anonymity-related
concerns to gather these statistics. It might be, though, that clients
wish to hide the fact that they are connecting to the Tor network.
Therefore, IP addresses are resolved to country codes in memory,
events are accumulated over 24 hours, and numbers are rounded up to
multiples of 4 or 8.
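The rounding and per-country rendering used by the lines below can be sketched as follows (helper names are illustrative, not tor's):

```python
def round_up(n, multiple):
    # Round a nonzero count up to the next multiple (8 for IP and
    # request counts, 4 for response counts).
    return ((n + multiple - 1) // multiple) * multiple if n else 0

def format_country_line(keyword, counts, multiple=8):
    # Render e.g. a "dirreq-v3-ips" line from {country_code: count};
    # alphabetical ordering here is purely for determinism.
    items = ",".join("%s=%d" % (cc, round_up(n, multiple))
                     for cc, n in sorted(counts.items()))
    return "%s %s" % (keyword, items)
```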
"dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
[At most once.]
YYYY-MM-DD HH:MM:SS defines the end of the included measurement
interval of length NSEC seconds (86400 seconds by default).
A "dirreq-stats-end" line, as well as any other "dirreq-*" line,
is only added when the relay has opened its Dir port and after 24
hours of measuring directory requests.
"dirreq-v2-ips" CC=N,CC=N,... NL
[At most once.]
"dirreq-v3-ips" CC=N,CC=N,... NL
[At most once.]
List of mappings from two-letter country codes to the number of
unique IP addresses that have connected from that country to
request a v2/v3 network status, rounded up to the nearest multiple
of 8. Only those IP addresses are counted that the directory can
answer with a 200 OK status code.
"dirreq-v2-reqs" CC=N,CC=N,... NL
[At most once.]
"dirreq-v3-reqs" CC=N,CC=N,... NL
[At most once.]
List of mappings from two-letter country codes to the number of
requests for v2/v3 network statuses from that country, rounded up
to the nearest multiple of 8. Only those requests are counted that
the directory can answer with a 200 OK status code.
"dirreq-v2-share" num% NL
[At most once.]
"dirreq-v3-share" num% NL
[At most once.]
The share of v2/v3 network status requests that the directory
expects to receive from clients based on its advertised bandwidth
compared to the overall network bandwidth capacity. Shares are
formatted in percent with two decimal places. Shares are
calculated as means over the whole 24-hour interval.
"dirreq-v2-resp" status=num,... NL
[At most once.]
"dirreq-v3-resp" status=nul,... NL
[At most once.]
List of mappings from response statuses to the number of requests
for v2/v3 network statuses that were answered with that response
status, rounded up to the nearest multiple of 4. Only response
statuses with at least 1 response are reported. New response
statuses can be added at any time. The current list of response
statuses is as follows:
"ok": a network status request is answered; this number
corresponds to the sum of all requests as reported in
"dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
rounding up.
"not-enough-sigs: a version 3 network status is not signed by a
sufficient number of requested authorities.
"unavailable": a requested network status object is unavailable.
"not-found": a requested network status is not found.
"not-modified": a network status has not been modified since the
If-Modified-Since time that is included in the request.
"busy": the directory is busy.
"dirreq-v2-direct-dl" key=val,... NL
[At most once.]
"dirreq-v3-direct-dl" key=val,... NL
[At most once.]
"dirreq-v2-tunneled-dl" key=val,... NL
[At most once.]
"dirreq-v3-tunneled-dl" key=val,... NL
[At most once.]
List of statistics about possible failures in the download process
of v2/v3 network statuses. Requests are either "direct"
HTTP-encoded requests over the relay's directory port, or
"tunneled" requests using a BEGIN_DIR cell over the relay's OR
port. The list of possible statistics can change, and statistics
can be left out from reporting. The current list of statistics is
as follows:
Successful downloads and failures:
"complete": a client has finished the download successfully.
"timeout": a download did not finish within 10 minutes after
starting to send the response.
"running": a download is still running at the end of the
measurement period for less than 10 minutes after starting to
send the response.
Download times:
"min", "max": smallest and largest measured bandwidth in B/s.
"d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
bandwidth in B/s. For a given decile i, i/10 of all downloads
had a smaller bandwidth than di, and (10-i)/10 of all downloads
had a larger bandwidth than di.
"q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
fourth of all downloads had a smaller bandwidth than q1, one
fourth of all downloads had a larger bandwidth than q3, and the
remaining half of all downloads had a bandwidth between q1 and
q3.
"md": median of measured bandwidth in B/s. Half of the downloads
had a smaller bandwidth than md, the other half had a larger
bandwidth than md.
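A sketch of computing the statistics keys above from a list of measured bandwidths, using a simple nearest-rank rule (the exact percentile rule tor uses may differ):

```python
def bw_percentiles(bandwidths):
    xs = sorted(bandwidths)
    n = len(xs)
    def rank(num, den):
        # Nearest-rank value for the fraction num/den of downloads.
        return xs[min(n - 1, num * n // den)]
    stats = {"min": xs[0], "max": xs[-1], "md": rank(1, 2),
             "q1": rank(1, 4), "q3": rank(3, 4)}
    for i in list(range(1, 5)) + list(range(6, 10)):
        stats["d%d" % i] = rank(i, 10)   # deciles d1-d4, d6-d9
    return stats
```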
Entry guard statistics:
Entry guard statistics include the number of clients per country and
per day that are connecting directly to an entry guard.
Entry guard statistics are important to learn more about the
distribution of clients to countries. In the future, this knowledge
can be useful to detect if there are or start to be any restrictions
for clients connecting from specific countries.
The information which client connects to a given entry guard is very
sensitive. This information must not be combined with the information
what contents are leaving the network at the exit nodes. Therefore,
entry guard statistics need to be aggregated to prevent them from
becoming useful for de-anonymization. Aggregation includes resolving
IP addresses to country codes, counting events over 24-hour intervals,
and rounding up numbers to the next multiple of 8.
"entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
[At most once.]
YYYY-MM-DD HH:MM:SS defines the end of the included measurement
interval of length NSEC seconds (86400 seconds by default).
An "entry-stats-end" line, as well as any other "entry-*"
line, is first added after the relay has been running for at least
24 hours.
"entry-ips" CC=N,CC=N,... NL
[At most once.]
List of mappings from two-letter country codes to the number of
unique IP addresses that have connected from that country to the
relay and that are not known to be other relays, rounded up to the
nearest multiple of 8.
Cell statistics:
The third type of statistics have to do with the time that cells spend
in circuit queues. In order to gather these statistics, the relay
memorizes when it puts a given cell in a circuit queue and when this
cell is flushed. The relay further notes the life time of the circuit.
These data are sufficient to determine the mean number of cells in a
queue over time and the mean time that cells spend in a queue.
Cell statistics are necessary to learn more about possible reasons for
the poor network performance of the Tor network, especially high
latencies. The same statistics are also useful to determine the
effects of design changes by comparing today's data with future data.
There are basically no privacy concerns from measuring cell
statistics, regardless of a node being an entry, middle, or exit node.
"cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
[At most once.]
YYYY-MM-DD HH:MM:SS defines the end of the included measurement
interval of length NSEC seconds (86400 seconds by default).
A "cell-stats-end" line, as well as any other "cell-*" line,
is first added after the relay has been running for at least 24
hours.
"cell-processed-cells" num,...,num NL
[At most once.]
Mean number of processed cells per circuit, subdivided into
deciles of circuits by the number of cells they have processed in
descending order from loudest to quietest circuits.
"cell-queued-cells" num,...,num NL
[At most once.]
Mean number of cells contained in queues by circuit decile. These
means are calculated by 1) determining the mean number of cells in
a single circuit between its creation and its termination and 2)
calculating the mean for all circuits in a given decile as
determined in "cell-processed-cells". Numbers have a precision of
two decimal places.
"cell-time-in-queue" num,...,num NL
[At most once.]
Mean time cells spend in circuit queues in milliseconds. Times are
calculated by 1) determining the mean time cells spend in the
queue of a single circuit and 2) calculating the mean for all
circuits in a given decile as determined in
"cell-processed-cells".
"cell-circuits-per-decile" num NL
[At most once.]
Mean number of circuits that are included in any of the deciles,
rounded up to the next integer.
Exit statistics:
The last type of statistics affects exit nodes counting the number of
bytes written and read and the number of streams opened per port and
per 24 hours. Exit port statistics can be measured from looking at
headers of BEGIN and DATA cells. A BEGIN cell contains the exit port
that is required for the exit node to open a new exit stream.
Subsequent DATA cells coming from the client or being sent back to the
client contain a length field stating how many bytes of application
data are contained in the cell.
Exit port statistics are important to measure in order to identify
possible load-balancing problems with respect to exit policies. Exit
nodes that permit more ports than others are very likely overloaded
with traffic for those ports plus traffic for other ports. Improving
load balancing in the Tor network improves the overall utilization of
bandwidth capacity.
Exit traffic is one of the most sensitive parts of network data in the
Tor network. Even though these statistics do not require looking at
traffic contents, statistics are aggregated so that they are not
useful for de-anonymizing users. Only those ports are reported that
have seen at least 0.1% of exiting or incoming bytes, numbers of bytes
are rounded up to full kibibytes (KiB), and stream numbers are rounded
up to the next multiple of 4.
"exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
[At most once.]
YYYY-MM-DD HH:MM:SS defines the end of the included measurement
interval of length NSEC seconds (86400 seconds by default).
An "exit-stats-end" line, as well as any other "exit-*" line, is
first added after the relay has been running for at least 24 hours
and only if the relay permits exiting (where exiting to a single
port and IP address is sufficient).
"exit-kibibytes-written" port=N,port=N,... NL
[At most once.]
"exit-kibibytes-read" port=N,port=N,... NL
[At most once.]
List of mappings from ports to the number of kibibytes that the
relay has written to or read from exit connections to that port,
rounded up to the next full kibibyte.
"exit-streams-opened" port=N,port=N,... NL
[At most once.]
List of mappings from ports to the number of opened exit streams
to that port, rounded up to the nearest multiple of 4.
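Rendering these two lines from raw per-port counters could look like the following sketch; for brevity the 0.1%-of-bytes reporting threshold is applied to written bytes only, which is an assumption of this example:

```python
def exit_port_lines(bytes_written, streams):
    total = sum(bytes_written.values())
    # Keep only ports with at least 0.1% of the written bytes.
    ports = sorted(p for p, b in bytes_written.items()
                   if total and b * 1000 >= total)
    # Bytes round up to full KiB; stream counts to a multiple of 4.
    kib = ",".join("%d=%d" % (p, (bytes_written[p] + 1023) // 1024)
                   for p in ports)
    opened = ",".join("%d=%d" % (p, ((streams.get(p, 0) + 3) // 4) * 4)
                      for p in ports)
    return ("exit-kibibytes-written %s" % kib,
            "exit-streams-opened %s" % opened)
```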
Implementation notes:
Right now, relays that are configured accordingly write similar
statistics to those described in this proposal to disk every 24 hours.
With this proposal being implemented, relays include the contents of
these files in extra-info documents.
The following steps are necessary to implement this proposal:
1. The current format of [dirreq|entry|buffer|exit]-stats files needs
to be adapted to the description in this proposal. This step
basically means renaming keywords.
2. The timing of writing the four *-stats files should be unified, so
that they are written exactly 24 hours after starting the
relay. Right now, the measurement intervals for dirreq, entry, and
exit stats starts with the first observed request, and files are
written when observing the first request that occurs more than 24
hours after the beginning of the measurement interval. With this
proposal, the measurement intervals should all start at the same
time, and files should be written exactly 24 hours later.
3. It is advantageous to cache statistics in local files in the data
directory until they are included in extra-info documents. The
reason is that the 24-hour measurement interval can be very
different from the 18-hour publication interval of extra-info
documents. When a relay crashes after finishing a measurement
interval, but before publishing the next extra-info document,
statistics would get lost. Therefore, statistics are written to
disk when finishing a measurement interval and read from disk when
generating an extra-info document. Only the statistics that were
appended to the *-stats files within the past 24 hours are included
in extra-info documents. Further, the contents of the *-stats files
need to be checked in the process of generating extra-info documents.
4. With the statistics patches being tested, the ./configure options
should be removed and the statistics code be compiled by default.
It is still required for relay operators to add configuration
options (DirReqStatistics, ExitPortStatistics, etc.) to enable
gathering statistics. However, in the near future, statistics shall
be gathered by all relays by default, and requiring a
./configure option would be a barrier for many relay operators.
Filename: 167-params-in-consensus.txt
Title: Vote on network parameters in consensus
Author: Roger Dingledine
Created: 18-Aug-2009
Status: Closed
Implemented-In: 0.2.2
0. History
1. Overview
Several of our new performance plans involve guessing how to tune
clients and relays, yet we won't be able to learn whether we guessed
the right tuning parameters until many people have upgraded. Instead,
we should have directory authorities vote on the parameters, and teach
Tors to read the currently recommended values out of the consensus.
2. Design
V3 votes should include a new "params" line after the known-flags
line. It contains key=value pairs, where value is an integer.
Consensus documents that are generated with a sufficiently new consensus
method (7?) then include a params line that includes every key listed
in any vote, and the median value for that key (in case of ties,
we use the median closer to zero).
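The median rule above, including the tie-break toward zero for an even number of votes, can be sketched as follows (illustrative code, not the directory authorities' implementation):

```python
def consensus_param(values):
    """Median of the voted values for one key; ties break toward zero."""
    vals = sorted(values)
    n = len(vals)
    if n % 2 == 1:
        return vals[n // 2]
    # Even number of votes: two middle candidates; pick the one
    # closer to zero, per the proposal.
    low, high = vals[n // 2 - 1], vals[n // 2]
    return low if abs(low) <= abs(high) else high
```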
2.1. Planned keys.
The first planned parameter is "circwindow=101", which is the initial
circuit packaging window that clients and relays should use. Putting
it in the consensus will let us perform experiments with different
values once enough Tors have upgraded -- see proposal 168.
Later parameters might include a weighting for how much to favor quiet
circuits over loud circuits in our round-robin algorithm; a weighting
for how much to prioritize relays over clients if we use an incentive
scheme like the gold-star design; and what fraction of circuits we
should throw out from proposal 151.
2.2. What about non-integers?
I'm not sure how we would do median on non-integer values. Further,
I don't have any non-integer values in mind yet. So I say we cross
that bridge when we get to it.
Filename: 168-reduce-circwindow.txt
Title: Reduce default circuit window
Author: Roger Dingledine
Created: 12-Aug-2009
Status: Open
Target: 0.2.2
0. History
1. Overview
We should reduce the starting circuit "package window" from 1000 to
101. The lower package window will mean that clients will only be able
to receive 101 cells (~50KB) on a circuit before they need to send a
'sendme' acknowledgement cell to request 100 more.
Starting with a lower package window on exit relays should save on
buffer sizes (and thus memory requirements for the exit relay), and
should save on queue sizes (and thus latency for users).
Lowering the package window will induce an extra round-trip for every
additional 50298 bytes of the circuit. This extra step is clearly a
slow-down for large streams, but ultimately we hope that a) clients
fetching smaller streams will see better response, and b) slowing
down the large streams in this way will produce lower e2e latencies,
so the round-trips won't be so bad.
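As a sanity check on the figures above, assuming the standard 498-byte usable payload of a relay cell:

```python
RELAY_PAYLOAD_SIZE = 498   # usable data bytes in each relay cell
CIRCWINDOW_START = 101     # proposed initial circuit package window
SENDME_INCREMENT = 100     # window refill granted per 'sendme' ack

# One full initial window carries 101 * 498 = 50298 bytes, which is
# the "additional 50298 bytes per extra round-trip" figure above.
bytes_per_window = CIRCWINDOW_START * RELAY_PAYLOAD_SIZE
```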
2. Motivation
Karsten's torperf graphs show that the median download time for a 50KB
file over Tor in mid 2009 is 7.7 seconds, whereas the median download
time for 1MB and 5MB are around 50s and 150s respectively. The 7.7
second figure is way too high, whereas the 50s and 150s figures are
surprisingly low.
The median round-trip latency appears to be around 2s, with 25% of
the data points taking more than 5s. That's a lot of variance.
We designed Tor originally with the original goal of maximizing
throughput. We figured that would also optimize other network properties
like round-trip latency. Looks like we were wrong.
3. Design
Wherever we initialize the circuit package window, initialize it to
101 rather than 1000. Reducing it should be safe even when interacting
with old Tors: the old Tors will receive the 101 cells and send back
a sendme ack cell. They'll still have much higher deliver windows,
but the rest of their deliver window will go unused.
You can find the patch at arma/circwindow. It seems to work.
3.1. Why not 100?
Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme
ack cell after 101 cells rather than the intended 100 cells.
Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But
hopefully we'll have moved to some datagram protocol long before
0.2.1.19 becomes obsolete.
3.2. What about stream packaging windows?
Right now the stream packaging windows start at 500. The goal was to
set the stream window to half the circuit window, to provide a crude
load balancing between streams on the same circuit. Once we lower
the circuit packaging window, the stream packaging window basically
becomes redundant.
We could leave it in -- it isn't hurting much in either case. Or we
could take it out -- people building other Tor clients would thank us
for that step. Alas, people building other Tor clients are going to
have to be compatible with current Tor clients, so in practice there's
no point taking out the stream packaging windows.
3.3. What about variable circuit windows?
Once upon a time we imagined adapting the circuit package window to
the network conditions. That is, we would start the window small,
and raise it based on the latency and throughput we see.
In theory that crude imitation of TCP's windowing system would allow
us to adapt to fill the network better. In practice, I think we want
to stick with the small window and never raise it. The low cap reduces
the total throughput you can get from Tor for a given circuit. But
that's a feature, not a bug.
4. Evaluation
How do we know this change is actually smart? It seems intuitive that
it's helpful, and some smart systems people have agreed that it's
a good idea (or said another way, they were shocked at how big the
default package window was before).
To get a more concrete sense of the benefit, though, Karsten has been
running torperf side-by-side on exit relays with the old package window
vs the new one. The results are mixed currently -- it is slightly faster
for fetching 40KB files, and slightly slower for fetching 50KB files.
I think it's going to be tough to get a clear conclusion that this is
a good design just by comparing one exit relay running the patch. The
trouble is that the other hops in the circuits are still getting bogged
down by other clients introducing too much traffic into the network.
Ultimately, we'll want to put the circwindow parameter into the
consensus so we can test a broader range of values once enough relays
have upgraded.
5. Transition and deployment
We should put the circwindow in the consensus (see proposal 167),
with an initial value of 101. Then as more exit relays upgrade,
clients should seamlessly get the better behavior.
Note that upgrading the exit relay will only affect the "download"
package window. An old client that's uploading lots of bytes will
continue to use the old package window at the client side, and we
can't throttle that window at the exit side without breaking protocol.
The real question then is what we should backport to 0.2.1. Assuming
this could be a big performance win, we can't afford to wait until
0.2.2.x comes out before starting to see the changes here. So we have
two options as I see them:
a) once clients in 0.2.2.x know how to read the value out of the
consensus, and it's been tested for a bit, backport that part to
0.2.1.x.
b) if it's too complex to backport, just pick a number, like 101, and
backport that number.
Clearly choice (a) is the better one if the consensus parsing part
isn't very complex. Let's shoot for that, and fall back to (b) if the
patch turns out to be so big that we reconsider.
Filename: 169-eliminating-renegotiation.txt
Title: Eliminate TLS renegotiation for the Tor connection handshake
Author: Nick Mathewson
Created: 27-Jan-2010
Status: Draft
Target: 0.2.2
1. Overview
I propose a backward-compatible change to the Tor connection
establishment protocol to avoid the use of TLS renegotiation.
Rather than doing a TLS renegotiation to exchange certificates
and authenticate the original handshake, this proposal takes an
approach similar to Steven Murdoch's proposal 124, and uses Tor
cells to finish authenticating the parties' identities once the
initial TLS handshake is finished.
Terminological note: I use "client" below to mean the Tor
instance (a client or a relay) that initiates a TLS connection,
and "server" to mean the Tor instance (a relay) that accepts it.
2. Motivation and history
In the original Tor TLS connection handshake protocol ("V1", or
"two-cert"), parties that wanted to authenticate provided a
two-cert chain of X.509 certificates during the handshake setup
phase. Every party that wanted to authenticate sent these
certificates.
In the current Tor TLS connection handshake protocol ("V2", or
"renegotiating"), the parties begin with a single certificate
sent from the server (responder) to the client (initiator), and
then renegotiate to a two-certs-from-each-authenticating party.
We made this change to make Tor's handshake look like a browser
speaking SSL to a webserver. (See proposal 130, and
tor-spec.txt.) To tell whether to use the V1 or V2 handshake,
servers look at the list of ciphers sent by the client. (This is
ugly, but there's not much else in the ClientHello that they can
look at.) If the list contains any cipher not used by the V1
protocol, the server sends back a single cert and expects a
renegotiation. If the client gets back a single cert, then it
withholds its own certificates until the TLS renegotiation phase.
In other words, initiator behavior now looks like this:
- Begin TLS negotiation with V2 cipher list; wait for
certificate(s).
- If we get a certificate chain:
- Then we are using the V1 handshake. Send our own
certificate chain as part of this initial TLS handshake
if we want to authenticate; otherwise, send no
certificates. When the handshake completes, check
certificates. We are now mutually authenticated.
Otherwise, if we get just a single certificate:
- Then we are using the V2 handshake. Do not send any
certificates during this handshake.
- When the handshake is done, immediately start a TLS
renegotiation. During the renegotiation, expect
a certificate chain from the server; send a certificate
chain of our own if we want to authenticate ourselves.
- After the renegotiation, check the certificates. Then
send (and expect) a VERSIONS cell from the other side to
establish the link protocol version.
And V2 responder behavior now looks like this:
- When we get a TLS ClientHello request, look at the cipher
list.
- If the cipher list contains only the V1 ciphersuites:
- Then we're doing a V1 handshake. Send a certificate
chain. Expect a possible client certificate chain in
response.
Otherwise, if we get other ciphersuites:
- We're using the V2 handshake. Send back a single
certificate and let the handshake complete.
- Do not accept any data until the client has renegotiated.
- When the client is renegotiating, send a certificate
chain, and expect (possibly multiple) certificates in
reply.
- Check the certificates when the renegotiation is done.
Then exchange VERSIONS cells.
Late in 2009, researchers found a flaw in most applications' use
of TLS renegotiation: Although TLS renegotiation does not
reauthenticate any information exchanged before the renegotiation
takes place, many applications were treating it as though it did,
and assuming that data sent _before_ the renegotiation was
authenticated with the credentials negotiated _during_ the
renegotiation. This problem was exacerbated by the fact that
most TLS libraries don't actually give you an obvious good way to
tell where the renegotiation occurred relative to the datastream.
Tor wasn't directly affected by this vulnerability, but its
aftermath hurts us in a few ways:
1) OpenSSL has disabled renegotiation by default, and created
a "yes we know what we're doing" option we need to set to
turn it back on. (Two options, actually: one for openssl
0.9.8l and one for 0.9.8m and later.)
2) Some vendors have removed all renegotiation support from
their versions of OpenSSL entirely, forcing us to tell
users to either replace their versions of OpenSSL or to
link Tor against a hand-built one.
3) Because of 1 and 2, I'd expect TLS renegotiation to become
rarer and rarer in the wild, making our own use stand out
more.
3. Design
3.1. The view in the large
Taking a cue from Steven Murdoch's proposal 124, I propose that
we move the work currently done by the TLS renegotiation step
(that is, authenticating the parties to one another) and do it
with Tor cells instead of with TLS.
Using _yet another_ variant response from the responder (server),
we allow the client to learn that it doesn't need to rehandshake
and can instead use a cell-based authentication system. Once the
TLS handshake is done, the client and server exchange VERSIONS
cells to determine link protocol version (including
handshake version). If they're using the handshake version
specified here, the client and server arrive at link protocol
version 3 (or higher), and use cells to exchange further
authentication information.
3.2. New TLS handshake variant
We already used the list of ciphers from the clienthello to
indicate whether the client can speak the V2 ("renegotiating")
handshake or later, so we can't encode more information there.
We can, however, change the DN in the certificate passed by the
server back to the client. Currently, all V2 certificates are
generated with CN values ending with ".net". I propose that we
have the ".net" commonName ending reserved to indicate the V2
protocol, and use commonName values ending with ".com" to
indicate the V3 ("minimal") handshake described herein.
Now, once the initial TLS handshake is done, the client can look
at the server's certificate(s). If there is a certificate chain,
the handshake is V1. If there is a single certificate whose
subject commonName ends in ".net", the handshake is V2 and the
client should try to renegotiate as it would currently.
Otherwise, the client should assume that the handshake is V3+.
[Servers should _only_ send ".com" addresses, to allow room for
more signaling in the future.]
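The version-detection rule described in this section can be sketched as a small classifier (hypothetical function; it assumes the client can see how many certificates the server sent and the subject commonName of the first one):

```python
def detect_handshake_version(num_certs, common_name):
    """Classify the connection after the initial TLS handshake."""
    if num_certs > 1:
        return "V1"    # full certificate chain: old two-cert handshake
    if common_name.endswith(".net"):
        return "V2"    # single ".net" cert: renegotiating handshake
    return "V3+"       # anything else: cell-based handshake
```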
3.3. Authenticating inside Tor
Once the TLS handshake is finished, if the client renegotiates,
then the server should go on as it does currently.
If the client implements this proposal, however, and the server
has shown it can understand the V3+ handshake protocol, the
client immediately sends a VERSIONS cell to the server
and waits to receive a VERSIONS cell in return. We negotiate
the Tor link protocol version _before_ we proceed with the
negotiation, in case we need to change the authentication
protocol in the future.
Once either party has seen the VERSIONS cell from the other, it
knows which version they will pick (that is, the highest version
shared by both parties' VERSIONS cells). All Tor instances using
the handshake protocol described in 3.2 MUST support at least
link protocol version 3 as described here.
On learning the link protocol, the server then sends the client a
CERT cell and a NETINFO cell. If the client wants to
authenticate to the server, it sends a CERT cell, an AUTHENTICATE
cell, and a NETINFO cell, or it may simply send a NETINFO cell if
it does not want to authenticate.
The CERT cell describes the keys that a Tor instance is claiming
to have. It is a variable-length cell. Its payload format is:
N: Number of certs in cell [1 octet]
N times:
CLEN [2 octets]
Certificate [CLEN octets]
Any extra octets at the end of a CERT cell MUST be ignored.
Each certificate has the form:
CertType [1 octet]
CertPurpose [1 octet]
PublicKeyLen [2 octets]
PublicKey [PublicKeyLen octets]
NotBefore [4 octets]
NotAfter [4 octets]
SignerID [HASH256_LEN octets]
SignatureLen [2 octets]
Signature [SignatureLen octets]
where CertType is 1 (meaning "RSA/SHA256")
CertPurpose is 1 (meaning "link certificate")
PublicKey is the DER encoding of the ASN.1 representation
of the RSA key of the subject of this certificate,
NotBefore is a time in HOURS since January 1, 1970, 00:00
UTC before which this certificate should not be
considered valid.
NotAfter is a time in HOURS since January 1, 1970, 00:00
UTC after which this certificate should not be
considered valid.
SignerID is the SHA-256 digest of the public key signing
this certificate
and Signature is the signature of the all other fields in
this certificate, using SHA256 as described in proposal
158.
While authenticating, a server need send only a self-signed
certificate for its identity key. (Its TLS certificate already
contains its link key signed by its identity key.) A client that
wants to authenticate MUST send two certificates: one containing
a public link key signed by its identity key, and one self-signed
cert for its identity.
Tor instances MUST ignore any certificate with an unrecognized
CertType or CertPurpose, and MUST ignore extra bytes in the cert.
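A hedged sketch of parsing the CERT cell framing defined above (only the outer N/CLEN framing and the fixed-width certificate header fields are decoded; names are illustrative, and extra trailing octets are ignored as the proposal requires):

```python
import struct

def parse_cert_cell(payload):
    """Decode the CERT cell payload: count, then CLEN-framed certs."""
    (n,) = struct.unpack_from("!B", payload, 0)
    offset = 1
    certs = []
    for _ in range(n):
        (clen,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        cert = payload[offset:offset + clen]
        offset += clen
        # CertType [1], CertPurpose [1], PublicKeyLen [2], then the key;
        # NotBefore/NotAfter/SignerID/Signature follow but are not
        # decoded in this sketch.
        cert_type, purpose, keylen = struct.unpack_from("!BBH", cert, 0)
        certs.append({"type": cert_type, "purpose": purpose,
                      "public_key": cert[4:4 + keylen]})
    return certs
```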
The AUTHENTICATE cell proves to the server that the client with
whom it completed the initial TLS handshake is the one possessing
the link public key in its certificate. It is a variable-length
cell. Its contents are:
SignatureType [2 octets]
SignatureLen [2 octets]
Signature [SignatureLen octets]
where SignatureType is 1 (meaning "RSA-SHA256") and Signature is
an RSA-SHA256 signature of the HMAC-SHA256, using the TLS master
secret key as its key, of the following elements:
- The SignatureType field (0x00 0x01)
- The NUL terminated ASCII string: "Tor certificate verification"
- client_random, as sent in the Client Hello
- server_random, as sent in the Server Hello
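A sketch of assembling the HMAC-SHA256 input listed above (the final RSA-SHA256 signing step, which needs the link private key and an RSA library, is omitted here; the function name is illustrative):

```python
import hashlib
import hmac

def authenticate_digest(master_secret, client_random, server_random):
    """HMAC-SHA256, keyed with the TLS master secret, over the
    elements enumerated in the proposal."""
    msg = (b"\x00\x01"                            # SignatureType = 1
           + b"Tor certificate verification\x00"  # NUL-terminated string
           + client_random                        # from the Client Hello
           + server_random)                       # from the Server Hello
    return hmac.new(master_secret, msg, hashlib.sha256).digest()
```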
Once the above handshake is complete, the client knows (from the
initial TLS handshake) that it has a secure connection to an
entity that controls a given link public key, and knows (from the
CERT cell) that the link public key is a valid public key for a
given Tor identity.
If the client authenticates, the server learns from the CERT cell
that a given Tor identity has a given current public link key.
From the AUTHENTICATE cell, it knows that an entity with that
link key knows the master secret for the TLS connection, and
hence must be the party with whom it's talking, if TLS works.
3.4. Security checks
If the TLS handshake indicates a V2 or V3+ connection, the server
MUST reject any connection from the client that does not begin
with either a renegotiation attempt or a VERSIONS cell containing
at least link protocol version "3". If the TLS handshake
indicates a V3+ connection, the client MUST reject any connection
where the server sends anything before the client has sent a
VERSIONS cell, and any connection where the VERSIONS cell does
not contain at least link protocol version "3".
If link protocol version 3 is chosen:
Clients and servers MUST check that all digests and signatures
on the certificates in CERT cells they are given are as
described above.
After the VERSIONS cell, clients and servers MUST close the
connection if anything besides a CERT or AUTH cell is sent
before the NETINFO cell.
CERT or AUTHENTICATE cells received anywhere after the first
NETINFO cell must be rejected.
... [write more here. What else?] ...
3.5. Summary
We now revisit the protocol outlines from section 2 to incorporate
our changes. New or modified steps are marked with a *.
The new initiator behavior now looks like this:
- Begin TLS negotiation with V2 cipher list; wait for
certificate(s).
- If we get a certificate chain:
- Then we are using the V1 handshake. Send our own
certificate chain as part of this initial TLS handshake
if we want to authenticate; otherwise, send no
certificates. When the handshake completes, check
certificates. We are now mutually authenticated.
Otherwise, if we get just a single certificate:
- Then we are using the V2 or the V3+ handshake. Do not
send any certificates during this handshake.
* When the handshake is done, look at the server's
certificate's subject commonName.
* If it ends with ".net", we're doing a V2 handshake:
- Immediately start a TLS renegotiation. During the
renegotiation, expect a certificate chain from the
server; send a certificate chain of our own if we
want to authenticate ourselves.
- After the renegotiation, check the certificates. Then
send (and expect) a VERSIONS cell from the other side
to establish the link protocol version.
* If it ends with anything else, assume a V3 or later
handshake:
* Send a VERSIONS cell, and wait for a VERSIONS cell
from the server.
* If we are authenticating, send CERT and AUTHENTICATE
cells.
* Send a NETINFO cell. Wait for a CERT and a NETINFO
cell from the server.
* If the CERT cell contains a valid self-identity cert,
and the identity key in the cert can be used to check
the signature on the x.509 certificate we got during
the TLS handshake, then we know we connected to the
server with that identity. If any of these checks
fail, or the identity key was not what we expected,
then we close the connection.
* Once the NETINFO cell arrives, continue as before.
And V3+ responder behavior now looks like this:
- When we get a TLS ClientHello request, look at the cipher
list.
- If the cipher list contains only the V1 ciphersuites:
- Then we're doing a V1 handshake. Send a certificate
chain. Expect a possible client certificate chain in
response.
Otherwise, if we get other ciphersuites:
- We're using the V2 handshake. Send back a single
certificate whose subject commonName ends with ".com",
and let the handshake complete.
* If the client does anything besides renegotiate or send a
VERSIONS cell, drop the connection.
- If the client renegotiates immediately, it's a V2
connection:
- When the client is renegotiating, send a certificate
chain, and expect (possibly multiple) certificates in
reply.
- Check the certificates when the renegotiation is done.
Then exchange VERSIONS cells.
* Otherwise we got a VERSIONS cell and it's a V3 handshake.
* Send a VERSIONS cell, a CERT cell, an AUTHENTICATE
cell, and a NETINFO cell.
* Wait for the client to send cells in reply. If the
client sends a CERT and an AUTHENTICATE and a NETINFO,
use them to authenticate the client. If the client
sends a NETINFO, it is unauthenticated. If it sends
anything else before its NETINFO, it's rejected.
4. Numbers to assign
We need a version number for this link protocol. I've been
calling it "3".
We need to reserve command numbers for CERT and AUTH cells. I
suggest that in link protocol 3 and higher, we reserve command
numbers 128..240 for variable-length cells. (241-256 we can hold
for future extensions.)
5. Efficiency
This protocol adds a round-trip step when the client sends a
VERSIONS cell to the server, and waits for the {VERSIONS, CERT,
NETINFO} response in turn. (The server then waits for the
client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply,
but it would have already been waiting for the client's NETINFO,
so that's not an additional wait.)
This is actually fewer round-trip steps than required before for
TLS renegotiation, so that's a win.
6. Open questions:
- Should we use X.509 certificates instead of the certificate-ish
things we describe here? They are more standard, but more ugly.
- May we cache which certificates we've already verified? It
might leak in timing whether we've connected with a given server
before, and how recently.
- Is there a better secret than the master secret to use in the
AUTHENTICATE cell? Say, a portable one? Can we get at it for
other libraries besides OpenSSL?
- Does using the client_random and server_random data in the
AUTHENTICATE message actually help us? How hard is it to pull
them out of the OpenSSL data structure?
- Can we give some way for clients to signal "I want to use the
V3 protocol if possible, but I can't renegotiate, so don't give
me the V2"? Clients currently have a fair idea of server
versions, so they could potentially do the V3+ handshake with
servers that support it, and fall back to V1 otherwise.
- What should servers that don't have TLS renegotiation do? For
now, I think they should just get it. Eventually we can
deprecate the V2 handshake as we did with the V1 handshake.
Title: Configuration options regarding circuit building
Filename: 170-user-path-config.txt
Author: Sebastian Hahn
Created: 01-March-2010
Status: Draft
Overview:
This document outlines how Tor handles the user configuration
options to influence the circuit building process.
Motivation:
Tor's treatment of the configuration *Nodes options was surprising
to many users, and quite a few conspiracy theories have crept up. We
should update our specification and code to better describe and
communicate what is going on during circuit building, and how we're
honoring configuration. So far, we've been tracking a bugreport
about this behaviour (
https://bugs.torproject.org/flyspray/index.php?do=details&id=1090 )
and Nick replied in a thread on or-talk (
http://archives.seul.org/or/talk/Feb-2010/msg00117.html ).
This proposal tries to document our intention for those configuration
options.
Design:
Five configuration options are available to users to influence Tor's
circuit building. EntryNodes and ExitNodes define a list of nodes
that are used for the Entry/Exit position in all circuits. ExcludeNodes
is a list of nodes that are used for no circuit, and
ExcludeExitNodes is a list of nodes that aren't used as the last
hop. StrictNodes defines Tor's behaviour in case of a conflict, for
example when a node that is excluded is the only available
introduction point. Setting StrictNodes to 1 breaks Tor's
functionality in that case, and it will refuse to build such a
circuit.
Neither Nick's email nor bug 1090 have clear suggestions how we
should behave in each case, so I tried to come up with something
that made sense to me.
Security implications:
Deviating from normal circuit building can break one's anonymity, so
the documentation of the above option should contain a warning to
make users aware of the pitfalls.
Specification:
It is proposed that the "User configuration" part of path-spec
(section 2.2.2) be replaced with this:
Users can alter the default behavior for path selection with
configuration options. In case of conflicts (excluding and requiring
the same node) the "StrictNodes" option is used to determine
behaviour. If a node is both excluded and required via a
configuration option, the exclusion takes preference.
- If "ExitNodes" is provided, then every request requires an exit
node on the ExitNodes list. If a request is supported by no nodes
on that list, and "StrictNodes" is false, then Tor treats that
request as if ExitNodes were not provided.
- "EntryNodes" behaves analogously.
- If "ExcludeNodes" is provided, then no circuit uses any of the
nodes listed. If a circuit requires an excluded node to be used,
and "StrictNodes" is false, then Tor uses the node in that
position while not using any other of the excluded nodes.
- If "ExcludeExitNodes" is provided, then Tor will not use the nodes
listed for the exit position in a circuit. If a circuit requires
an excluded node to be used in the exit position and "StrictNodes"
is false, then Tor builds that circuit as if ExcludeExitNodes were
not provided.
- If a user tries to connect to or resolve a hostname of the form
<target>.<servername>.exit and the "AllowDotExit" configuration
option is set to 1, the request is rewritten to a request for
<target>, and the request is only supported by the exit whose
nickname or fingerprint is <servername>. If "AllowDotExit" is set
to 0 (default), any request for <anything>.exit is denied.
- When any of the *Nodes settings are changed, all circuits are
expired immediately, to prevent a situation where a previously
built circuit is used even though some of its nodes are now
excluded.
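The conflict rules above might be sketched like this for the exit position (simplified set-valued options; this is not Tor's actual path-selection code, and the function name is made up):

```python
def pick_exit(candidates, exit_nodes, exclude_exit, strict):
    """Choose an exit under ExitNodes/ExcludeExitNodes/StrictNodes."""
    # Exclusion takes preference over requirement.
    allowed = [n for n in candidates if n not in exclude_exit]
    preferred = ([n for n in allowed if n in exit_nodes]
                 if exit_nodes else allowed)
    if preferred:
        return preferred[0]
    if strict:
        return None          # StrictNodes=1: refuse to build the circuit
    # StrictNodes=0: behave as if the restricting option were unset.
    if allowed:
        return allowed[0]
    return candidates[0] if candidates else None
```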
Compatibility:
The old Strict*Nodes options are deprecated, and the StrictNodes
option is new. Tor users may need to update their configuration file.
Filename: 172-circ-getinfo-option.txt
Title: GETINFO controller option for circuit information
Author: Damian Johnson
Created: 03-June-2010
Status: Accepted
Overview:
This details an additional GETINFO option that would provide information
concerning a relay's current circuits.
Motivation:
The original proposal was for connection-related information, but Jake made
the excellent point that any information retrieved from the control port
is...
1. completely ineffectual for auditing purposes since either (a) these
results can be fetched from netstat already or (b) the information would
only be provided via tor and can't be validated.
2. The more useful uses for connection information can be achieved with
much less (and safer) information.
Hence the proposal is now for circuit based rather than connection based
information. This would strip the most controversial and sensitive data
entirely (ip addresses, ports, and connection based bandwidth breakdowns)
while still being useful for the following purposes:
- Basic Relay Usage Questions
How is the bandwidth I'm contributing broken down? Is it being evenly
distributed or is someone hogging most of it? Do these circuits belong to
the hidden service I'm running or something else? Now that I'm using exit
policy X am I desirable as an exit, or are most people just using me as a
relay?
- Debugging
Say a relay has a restrictive firewall policy for outbound connections,
with the ORPort whitelisted but doesn't realize that tor needs random high
ports. Tor would report success ("your orport is reachable - excellent")
yet the relay would be nonfunctional. This proposed information would
reveal numerous RELAY -> YOU -> UNESTABLISHED circuits, giving a good
indicator of what's wrong.
- Visualization
A nice benefit of visualizing tor's behavior is that it becomes a helpful
tool in puzzling out how tor works. For instance, tor spawns numerous
client connections at startup (even if unused as a client). As a newcomer
to tor these asymmetric (outbound only) connections mystified me for quite
a while until Roger explained their use to me. The proposed
TYPE_FLAGS would let controllers clearly label them as being client
related, making their purpose a bit clearer.
At the moment connection data can only be retrieved via commands like
netstat, ss, and lsof. However, providing an alternative via the control
port provides several advantages:
- scrubbing for private data
Raw connection data has no notion of what's sensitive and what is
not. The relay's flags and cached consensus can be used to take
educated guesses concerning which connections could possibly belong
to client or exit traffic, but this is both difficult and inaccurate.
Anything provided via the control port can scrubbed to make sure we
aren't providing anything we think relay operators should not see.
- additional information
All connection querying commands strictly provide the ip address and
port of connections, and nothing else. However, for the uses listed
above the far more interesting attributes are the circuit's type,
bandwidth usage and uptime.
- improved performance
Querying connection data is an expensive activity, especially for
busy relays or low end processors (such as mobile devices). Tor
already internally knows its circuits, allowing for vastly quicker
lookups.
- cross platform capability
The connection querying utilities mentioned above not only aren't
available under Windows, but differ widely among different *nix
platforms. FreeBSD in particular takes an unusual approach,
dropping important options from netstat and assigning ss to a
spreadsheet application instead. A controller interface, however,
would provide a uniform means of retrieving this information.
Security Implications:
This is an open question. This proposal lacks the most controversial pieces
of information (IP addresses and ports), and insight into potential threats
this would pose would be very welcome!
Specification:
The following addition would be made to the control-spec's GETINFO section:
"rcirc/id/<Circuit identity>" -- Provides entry for the associated relay
circuit, formatted as:
CIRC_ID=<circuit ID> CREATED=<timestamp> UPDATED=<timestamp> TYPE=<flag>
READ=<bytes> WRITE=<bytes>
None of the parameters contain whitespace, and additional results must be
ignored to allow for future expansion. Parameters are defined as follows:
CIRC_ID - Unique numeric identifier for the circuit this belongs to.
CREATED - Unix timestamp (as seconds since the Epoch) for when the
circuit was created.
UPDATED - Unix timestamp for when this information was last updated.
TYPE - Single character flags indicating attributes in the circuit:
(E)ntry : has a connection that doesn't belong to a known Tor server,
indicating that this is either the first hop or bridged
E(X)it : has been used for at least one exit stream
(R)elay : has been extended
Rende(Z)vous : is being used for a rendezvous point
(I)ntroduction : is being used for a hidden service introduction
(N)one of the above: none of the above have happened yet.
READ - Total bytes transmitted toward the exit over the circuit.
WRITE - Total bytes transmitted toward the client over the circuit.
"rcirc/all" -- The 'rcirc/id/*' output for all current circuits, joined by
newlines.
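To make the reply format above concrete, here is a minimal, hypothetical parser for a single rcirc entry. The key names come from the proposal; the function name and dict representation are illustrative only, and unknown keys are skipped to honor the "additional results must be ignored" rule:

```python
def parse_rcirc_entry(line):
    """Parse one 'GETINFO rcirc/id/*' reply line, e.g.
    'CIRC_ID=5 CREATED=1288817425 ... TYPE=RX READ=402 WRITE=9153',
    into a dict. Unknown key=value pairs are ignored for future expansion."""
    known = {"CIRC_ID", "CREATED", "UPDATED", "TYPE", "READ", "WRITE"}
    entry = {}
    for field in line.split():
        key, _, value = field.partition("=")
        if key in known:
            entry[key] = value
    # All fields except TYPE are numeric.
    for key in ("CIRC_ID", "CREATED", "UPDATED", "READ", "WRITE"):
        if key in entry:
            entry[key] = int(entry[key])
    return entry
```

A controller would apply this line-by-line to the newline-joined `rcirc/all` output.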
The following would be included for circ info update events.
4.1.X. Relay circuit status changed
The syntax is:
"650" SP "RCIRC" SP CircID SP Notice [SP Created SP Updated SP Type SP
Read SP Write] CRLF
Notice =
"NEW" / ; first information being provided for this circuit
"UPDATE" / ; update for a previously reported circuit
"CLOSED" ; notice that the circuit no longer exists
Notice indicating that queryable information on a relay related circuit has
changed. If the Notice parameter is either "NEW" or "UPDATE" then this
provides the same fields that would be given by calling "GETINFO rcirc/id/"
with the CircID.
Filename: 173-getinfo-option-expansion.txt
Title: GETINFO Option Expansion
Author: Damian Johnson
Created: 02-June-2010
Status: Accepted
Overview:
Over the course of developing arm there have been numerous hacks and
workarounds to glean pieces of basic, desirable information about the tor
process. As per Roger's request I've compiled a list of these pain points
to try and improve the control protocol interface.
Motivation:
The purpose of this proposal is to expose additional process and relay
related information that is currently unavailable in a convenient,
dependable, and/or platform independent way. Examples of this are...
- The relay's total contributed bandwidth. This is a highly requested
piece of information and, based on the following patch from pipe, looks
trivial to include.
http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html
- The process ID of the tor process. There is a high degree of guess work
in obtaining this. Arm for instance uses pidof, netstat, and ps yet
still fails on some platforms, and Orbot recently got a ticket about
its own attempt to fetch it with ps:
https://trac.torproject.org/projects/tor/ticket/1388
This just includes the pieces of missing information I've noticed
(suggestions or questions of their usefulness are welcome!).
Security Implications:
None that I'm aware of. From a security standpoint this seems decently
innocuous.
Specification:
The following addition would be made to the control-spec's GETINFO section:
"relay/bw-limit" -- Effective relayed bandwidth limit.
"relay/burst-limit" -- Effective relayed burst limit.
"relay/read-total" -- Total bytes relayed (download).
"relay/write-total" -- Total bytes relayed (upload).
"relay/flags" -- Space separated listing of flags currently held by the
relay as reported by the currently cached consensus.
"process/user" -- Username under which the tor process is running,
providing an empty string if none exists.
"process/pid" -- Process id belonging to the main tor process, -1 if none
exists for the platform.
"process/uptime" -- Total uptime of the tor process (in seconds).
"process/uptime-reset" -- Time since last reset (startup, sighup, or RELOAD
signal, in seconds).
"process/descriptors-used" -- Count of file descriptors used.
"process/descriptor-limit" -- File descriptor limit (getrlimit results).
"ns/authority" -- Router status info (v2 directory style) for all
recognized directory authorities, joined by newlines.
"state/names" -- A space-separated list of all the keys supported by this
version of Tor's state.
"state/val/<key>" -- Provides the current state value belonging to the
given key. If undefined, this provides the key's default value.
"status/ports-seen" -- A summary of which ports we've seen exiting
circuits connect to recently, formatted the same as the EXITS_SEEN status
event described in Section 4.1.XX. This GETINFO option is currently
available only for exit relays.
4.1.XX. Per-port exit stats
The syntax is:
"650" SP "EXITS_SEEN" SP TimeStarted SP PortSummary CRLF
We just generated a new summary of which ports we've seen exiting circuits
connecting to recently. The controller could display this for the user, e.g.
in their "relay" configuration window, to give them a sense of how they're
being used (popularity of the various ports they exit to). Currently only
exit relays will receive this event.
TimeStarted is a quoted string indicating when the reported summary
started counting (in GMT).
The PortSummary keyword has as its argument a comma-separated, possibly
empty set of "port=count" pairs. For example (without linebreak),
650-EXITS_SEEN TimeStarted="2008-12-25 23:50:43"
PortSummary=80=16,443=8
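A controller consuming this event needs to handle the possibly-empty, comma-separated "port=count" list. A small sketch (the helper name is mine, not part of the spec):

```python
def parse_port_summary(summary):
    """Parse a PortSummary argument such as '80=16,443=8' into a
    {port: count} dict. An empty argument yields an empty dict."""
    if not summary:
        return {}
    return {int(port): int(count)
            for port, count in (pair.split("=")
                                for pair in summary.split(","))}
```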
Filename: 174-optimistic-data-server.txt
Title: Optimistic Data for Tor: Server Side
Author: Ian Goldberg
Created: 2-Aug-2010
Status: Open
Overview:
When a SOCKS client opens a TCP connection through Tor (for an HTTP
request, for example), the query latency is about 1.5x higher than it
needs to be. Simply, the problem is that the sequence of data flows
is this:
1. The SOCKS client opens a TCP connection to the OP
2. The SOCKS client sends a SOCKS CONNECT command
3. The OP sends a BEGIN cell to the Exit
4. The Exit opens a TCP connection to the Server
5. The Exit returns a CONNECTED cell to the OP
6. The OP returns a SOCKS CONNECTED notification to the SOCKS client
7. The SOCKS client sends some data (the GET request, for example)
8. The OP sends a DATA cell to the Exit
9. The Exit sends the GET to the server
10. The Server returns the HTTP result to the Exit
11. The Exit sends the DATA cells to the OP
12. The OP returns the HTTP result to the SOCKS client
Note that the Exit node knows that the connection to the Server was
successful at the end of step 4, but is unable to send the HTTP query to
the server until step 9.
This proposal (as well as its upcoming sibling concerning the client
side) aims to reduce the latency by allowing:
1. SOCKS clients to optimistically send data before they are notified
that the SOCKS connection has completed successfully
2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
state
3. Exit nodes to accept and queue DATA cells while in the
EXIT_CONN_STATE_CONNECTING state
This particular proposal deals with #3.
In this way, the flow would be as follows:
1. The SOCKS client opens a TCP connection to the OP
2. The SOCKS client sends a SOCKS CONNECT command, followed immediately
by data (such as the GET request)
3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA
cells
4. The Exit opens a TCP connection to the Server
5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET
request to the Server
6. The OP returns a SOCKS CONNECTED notification to the SOCKS client,
and the Server returns the HTTP result to the Exit
7. The Exit sends the DATA cells to the OP
8. The OP returns the HTTP result to the SOCKS client
Motivation:
This change will save one OP<->Exit round trip (down to one from two).
There are still two SOCKS Client<->OP round trips (negligible time) and
two Exit<->Server round trips. Depending on the ratio of the
Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
decrease the latency by 25 to 50 percent. Experiments validate these
predictions. [Goldberg, PETS 2010 rump session; see
https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]
Design:
The current code actually correctly handles queued data at the Exit; if
there is queued data in a EXIT_CONN_STATE_CONNECTING stream, that data
will be immediately sent when the connection succeeds. If the
connection fails, the data will be correctly ignored and freed. The
problem with the current server code is that the server currently
drops DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state.
Also, if you try to queue data in the EXIT_CONN_STATE_RESOLVING state,
bad things happen because streams in that state don't yet have
conn->write_event set, and so some existing sanity checks (any stream
with queued data is at least potentially writable) are no longer sound.
The solution is to simply not drop received DATA cells while in the
EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME cells in this
state, so that the OP cannot send more than one window's worth of data
to be queued at the Exit. Finally, patch the sanity checks so that
streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data
can pass.
If no clients ever send such optimistic data, the new code will never be
executed, and the behaviour of Tor will not change. When clients begin
to send optimistic data, the performance of those clients' streams will
improve.
After discussion with nickm, it seems best to just have the server
version number be the indicator of whether a particular Exit supports
optimistic data. (If a client sends optimistic data to an Exit which
does not support it, the data will be dropped, and the client's request
will fail to complete.) What do version numbers for hypothetical future
protocol-compatible implementations look like, though?
Security implications:
Servers (for sure the Exit, and possibly others, by watching the
pattern of packets) will be able to tell that a particular client
is using optimistic data. This will be discussed more in the sibling
proposal.
On the Exit side, servers will be queueing a little bit extra data, but
no more than one window. Clients today can cause Exits to queue that
much data anyway, simply by establishing a Tor connection to a slow
machine, and sending one window of data.
Specification:
tor-spec section 6.2 currently says:
The OP waits for a RELAY_CONNECTED cell before sending any data.
Once a connection has been established, the OP and exit node
package stream data in RELAY_DATA cells, and upon receiving such
cells, echo their contents to the corresponding TCP stream.
RELAY_DATA cells sent to unrecognized streams are dropped.
It is not clear exactly what an "unrecognized" stream is, but this last
sentence would be changed to say that RELAY_DATA cells received on a
stream that has processed a RELAY_BEGIN cell and has not yet issued a
RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed
immediately after a RELAY_CONNECTED cell is issued for the stream, or
freed after a RELAY_END cell is issued for the stream.
The earlier part of this section will be addressed in the sibling
proposal.
Compatibility:
There are compatibility issues, as mentioned above. OPs MUST NOT send
optimistic data to Exit nodes whose version numbers predate (something).
OPs MAY send optimistic data to Exit nodes whose version numbers match
or follow that value. (But see the question about independent server
reimplementations, above.)
Implementation:
Here is a simple patch. It seems to work with both regular streams and
hidden services, but there may be other corner cases I'm not aware of.
(Do streams used for directory fetches, hidden services, etc. take a
different code path?)
diff --git a/src/or/connection.c b/src/or/connection.c
index 7b1493b..f80cd6e 100644
--- a/src/or/connection.c
+++ b/src/or/connection.c
@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len,
return;
}
- connection_start_writing(conn);
+ /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING
+ * state, we don't want to try to write it right away, since
+ * conn->write_event won't be set yet. Otherwise, write data from
+ * this conn as the socket is available. */
+ if (conn->state != EXIT_CONN_STATE_RESOLVING) {
+ connection_start_writing(conn);
+ }
if (zlib) {
conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen;
} else {
@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now)
tor_assert(conn->s < 0);
if (conn->outbuf_flushlen > 0) {
- tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw ||
+ /* With optimistic data, we may have queued data in
+ * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing.
+ * */
+ tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING ||
+ connection_is_writing(conn) || conn->write_blocked_on_bw ||
(CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ));
}
diff --git a/src/or/relay.c b/src/or/relay.c
index fab2d88..e45ff70 100644
--- a/src/or/relay.c
+++ b/src/or/relay.c
@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
relay_header_t rh;
unsigned domain = layer_hint?LD_APP:LD_EXIT;
int reason;
+ int optimistic_data = 0; /* Set to 1 if we receive data on a stream
+ that's in the EXIT_CONN_STATE_RESOLVING
+ or EXIT_CONN_STATE_CONNECTING states.*/
tor_assert(cell);
tor_assert(circ);
@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
/* either conn is NULL, in which case we've got a control cell, or else
* conn points to the recognized stream. */
- if (conn && !connection_state_is_open(TO_CONN(conn)))
- return connection_edge_process_relay_cell_not_open(
- &rh, cell, circ, conn, layer_hint);
+ if (conn && !connection_state_is_open(TO_CONN(conn))) {
+ if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING ||
+ conn->_base.state == EXIT_CONN_STATE_RESOLVING) &&
+ rh.command == RELAY_COMMAND_DATA) {
+ /* We're going to allow DATA cells to be delivered to an exit
+ * node in state EXIT_CONN_STATE_CONNECTING or
+ * EXIT_CONN_STATE_RESOLVING. This speeds up HTTP, for example. */
+ log_warn(domain, "Optimistic data received.");
+ optimistic_data = 1;
+ } else {
+ return connection_edge_process_relay_cell_not_open(
+ &rh, cell, circ, conn, layer_hint);
+ }
+ }
switch (rh.command) {
case RELAY_COMMAND_DROP:
@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
log_debug(domain,"circ deliver_window now %d.", layer_hint ?
layer_hint->deliver_window : circ->deliver_window);
- circuit_consider_sending_sendme(circ, layer_hint);
+ if (!optimistic_data) {
+ circuit_consider_sending_sendme(circ, layer_hint);
+ }
if (!conn) {
log_info(domain,"data cell dropped, unknown stream (streamid %d).",
@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
stats_n_data_bytes_received += rh.length;
connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE,
rh.length, TO_CONN(conn));
- connection_edge_consider_sending_sendme(conn);
+ if (!optimistic_data) {
+ connection_edge_consider_sending_sendme(conn);
+ }
return 0;
case RELAY_COMMAND_END:
reason = rh.length > 0 ?
Performance and scalability notes:
There may be more RAM used at Exit nodes, as mentioned above, but it is
transient.
Notes on an auto updater:
steve wants a "latest" symlink so he can always just fetch that.
roger worries that this will exacerbate the "what version are you
using?" "latest." problem.
weasel suggests putting the latest recommended version in dns. then
we don't have to hit the website. it's got caching, it's lightweight,
it scales. just put it in a TXT record or something.
but, no dnssec.
roger suggests a file on the https website that lists the latest
recommended version (or filename or url or something like that).
(steve seems to already be doing this with xerobank. he additionally
suggests a little blurb that can be displayed to the user to describe
what's new.)
how to verify you're getting the right file?
a) it's https.
b) ship with a signing key, and use some openssl functions to verify.
c) both
andrew reminds us that we have a "recommended versions" line in the
consensus directory already.
if only we had some way to point out the "latest stable recommendation"
from this list. we could list it first, or something.
the recommended versions line also doesn't take into account which
packages are available -- e.g. on Windows one version might be the best
available, and on OS X it might be a different one.
aren't there existing solutions to this? surely there is a beautiful,
efficient, crypto-correct auto updater lib out there. even for windows.
How to hand out bridges.
Divide bridges into 'strategies' as they come in. Do this uniformly
at random for now.
For each strategy, we'll hand out bridges in a different way to
clients. This document describes two strategies: email-based and
IP-based.
0. Notation:
HMAC(k,v) : an HMAC of v using the key k.
A|B: The string A concatenated with the string B.
1. Email-based.
Goal: bootstrap based on one or more popular email service's sybil
prevention algorithms.
Parameters:
HMAC -- an HMAC function
P -- a time period
K -- the number of bridges to send in a period.
Setup: Generate two nonces, N and M.
As bridges arrive, put them into a ring according to HMAC(N,ID)
where ID is the bridge's identity digest.
Divide time into divisions of length P.
When we get an email:
If it's not from a supported email service, reject it.
If we already sent a response to that email address (normalized)
in this period, send _exactly_ the same response.
If it is from a supported service, generate X = HMAC(M,PS|E) where E
is the lowercased normalized email address for the user, and
where PS is the start of the current period. Send
the first K bridges in the ring after point X.
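As a sketch of the ring lookup just described (assuming HMAC-SHA256 as the HMAC function; the helper names and data shapes are illustrative, not from this document):

```python
import bisect
import hashlib
import hmac

def hmac_point(key, value):
    """Map a value onto the ring as a big integer via HMAC-SHA256."""
    return int.from_bytes(hmac.new(key, value, hashlib.sha256).digest(), "big")

def bridges_for_email(bridges, n_key, m_key, email, period_start, k):
    """Bridges sit on the ring at HMAC(N, ID); answer an email with the
    first K bridges after X = HMAC(M, PS|E), E lowercased as specified."""
    ring = sorted(bridges, key=lambda bid: hmac_point(n_key, bid))
    points = [hmac_point(n_key, bid) for bid in ring]
    e = email.lower().encode("ascii")
    x = hmac_point(m_key, period_start + b"|" + e)
    i = bisect.bisect_right(points, x)          # first point after X
    return [ring[(i + j) % len(ring)] for j in range(k)]
```

Because X depends only on M, PS, and E, a repeated query within the same period deterministically yields the same answer, as long as the ring itself is unchanged (the caveat the bracketed discussion below wrestles with).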
[If we want to make sure that repeat queries are given exactly the
same results, then we can't let the ring change during the
time period. For a long time period like a month, that's quite a
hassle. How about instead just keeping a replay cache of addresses
that have been answered, and sending them a "sorry, you already got
your addresses for the time period; perhaps you should try these
other fine distribution strategies while you wait?" response? This
approach would also resolve the "Make sure you can't construct a
distinct address to match an existing one" note below. -RD]
[I think, if we get a replay, we need to send back the same
answer as we did the first time, not say "try again."
Otherwise we need to worry that an attacker can keep people
from getting bridges by preemptively asking for them,
or that an attacker may force them to prove they haven't
gotten any bridges by asking. -NM]
[While we're at it, if we do the replay cache thing and don't need
repeatable answers, we could just pick K random answers from the
pool. Is it beneficial that a bridge user who knows about a clump of
nodes will be sharing them with other users who know about a similar
(overlapping) clump? One good aspect is against an adversary who
learns about a clump this way and watches those bridges to learn
other users and discover *their* bridges: he doesn't learn about
as many new bridges as he might if they were randomly distributed.
A drawback is against an adversary who happens to pick two email
addresses in P that include overlapping answers: he can measure
the difference in clumps and estimate how quickly the bridge pool
is growing. -RD]
[Random is one more darn thing to implement; rings are already
there. -NM]
[If we make the period P be mailbox-specific, and make it a random
value around some mean, then we make it harder for an attacker to
know when to try using his small army of gmail addresses to gather
another harvest. But we also make it harder for users to know when
they can try again. -RD]
[Letting the users know about when they can try again seems
worthwhile. Otherwise users and attackers will all probe and
probe and probe until they get an answer. No additional
security will be achieved, but bandwidth will be lost. -NM]
To normalize an email address:
Start with the RFC822 address. Consider only the mailbox {???}
portion of the address (username@domain). Put this into lowercase
ascii.
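A minimal sketch of that normalization (hypothetical helper; it deliberately does nothing cleverer than lowercasing, since the notes below warn that treating distinct addresses as equivalent is unsafe):

```python
def normalize_email(addr):
    """Lowercase an RFC822 username@domain mailbox. No dot-collapsing or
    plus-tag stripping: collapsing distinct addresses into one would let
    an attacker register a look-alike and learn another user's bridges."""
    local, _, domain = addr.strip().partition("@")
    return local.lower() + "@" + domain.lower()
```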
Questions:
What to do with weird character encodings? Look up the RFC.
Notes:
Make sure that you can't force a single email address to appear
in lots of different ways. IOW, if nickm@freehaven.net and
NICKM@freehaven.net aren't treated the same, then I can get lots
more bridges than I should.
Make sure you can't construct a distinct address to match an
existing one. IOW, if we treat nickm@X and nickm@Y as the same
user, then anybody can register nickm@Z and use it to tell which
bridges nickm@X got (or would get).
Make sure that we actually check headers so we can't be trivially
used to spam people.
2. IP-based.
Goal: avoid handing out all the bridges to users in a similar IP
space and time.
Parameters:
T_Flush -- how long it should take a user on a single network to
see a whole cluster of bridges.
N_C -- the number of clusters.
K -- the number of bridges we hand out in response to a single
request.
Setup: using an AS map or a geoip map or some other flawed input
source, divide IP space into "areas" such that surveying a large
collection of "areas" is hard. For v0, use /24 address blocks.
Group areas into N_C clusters.
Generate secrets L, M, N.
Set the period P such that P*(bridges-per-cluster/K) = T_Flush.
Don't set P to greater than a week, or less than three hours.
When we get a bridge:
Based on HMAC(L,ID), assign the bridge to a cluster. Within each
cluster, keep the bridges in a ring based on HMAC(M,ID).
[Should we re-sort the rings for each new time period, so the ring
for a given cluster is based on HMAC(M,PS|ID)? -RD]
When we get a connection:
If it's http, redirect it to https.
Let area be the incoming IP network. Let PS be the current
period. Compute X = HMAC(N, PS|area). Return the next K bridges
in the ring after X.
[Don't we want to compute C = HMAC(key, area) to learn what cluster
to answer from, and then X = HMAC(key, PS|area) to pick a point in
that ring? -RD]
Need to clarify that some HMACs are for rings, and some are for
partitions. How rings scale is clear. How do we grow the number of
partitions? Looking at successive bits from the HMAC output is one way.
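One way to read the bracketed RD note above as code (assuming HMAC-SHA256, and assuming the HMAC(L,ID) cluster assignment has already produced a list of per-cluster bridge lists; all names here are illustrative):

```python
import bisect
import hashlib
import hmac

def hmac_int(key, value):
    """Interpret an HMAC-SHA256 digest as an integer ring position."""
    return int.from_bytes(hmac.new(key, value, hashlib.sha256).digest(), "big")

def bridges_for_area(clusters, l_key, m_key, n_key, area, period_start, k):
    """Pick a cluster from the area alone (stable across periods), then
    take the next K bridges after X = HMAC(N, PS|area) in that cluster's
    HMAC(M, ID)-ordered ring."""
    cluster = clusters[hmac_int(l_key, area) % len(clusters)]
    ring = sorted(cluster, key=lambda bid: hmac_int(m_key, bid))
    points = [hmac_int(m_key, bid) for bid in ring]
    x = hmac_int(n_key, period_start + b"|" + area)
    i = bisect.bisect_right(points, x)
    return [ring[(i + j) % len(ring)] for j in range(k)]
```

Keeping the period PS out of the cluster choice means an area only ever sees one cluster, so T_Flush bounds how fast a single network can enumerate it.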
3. Open issues
Denial of service attacks
A good view of network topology
at some point we should learn some reliability stats on our bridges. when
we say above 'give out k bridges', we might give out 2 reliable ones and
k-2 others. we count around the ring the same way we do now, to find them.
# The following two algorithms
# Algorithm 1
# TODO: Burst and Relay/Regular differentiation
BwRate = Bandwidth Rate in Bytes Per Second
GlobalWriteBucket = 0
GlobalReadBucket = 0
Epoch = Token Fill Rate in seconds: suggest 50ms=.050
SecondCounter = 0
MinWriteBytes = Minimum amount bytes per write
Every Epoch Seconds:
UseMinWriteBytes = MinWriteBytes
WriteCnt = 0
ReadCnt = 0
BytesRead = 0
For Each Open OR Conn with pending write data:
WriteCnt++
For Each Open OR Conn:
ReadCnt++
BytesToRead = (BwRate*Epoch + GlobalReadBucket)/ReadCnt
BytesToWrite = (BwRate*Epoch + GlobalWriteBucket)/WriteCnt
if BwRate/WriteCnt < MinWriteBytes:
# If we aren't likely to accumulate enough bytes in a second to
# send a whole cell for our connections, send partials
Log(NOTICE, "Too many ORCons to write full blocks. Sending short packets.")
UseMinWriteBytes = 1
# Other option: We could switch to plan 2 here
# Service each writable ORConn. If there are any partial writes,
# return remaining bytes from this epoch to the global pool
For Each Open OR Conn with pending write data:
ORConn->write_bucket += BytesToWrite
if ORConn->write_bucket > UseMinWriteBytes:
w = write(ORConn, MIN(len(ORConn->write_data), ORConn->write_bucket))
# possible that w < ORConn->write_data here due to TCP pushback.
# We should restore the rest of the write_bucket to the global
# buffer
GlobalWriteBucket += (ORConn->write_bucket - w)
ORConn->write_bucket = 0
For Each Open OR Conn:
r = read_nonblock(ORConn, BytesToRead)
BytesRead += r
SecondCounter += Epoch
if SecondCounter < 1:
# Save unused bytes from this epoch to be used later in the second
GlobalReadBucket += (BwRate*Epoch - BytesRead)
else:
SecondCounter = 0
GlobalReadBucket = 0
GlobalWriteBucket = 0
For Each ORConn:
ORConn->write_bucket = 0
# Alternate plan for Writing fairly. Reads would still be covered
# by plan 1 as there is no additional network overhead for short reads,
# so we don't need to try to avoid them.
#
# I think this is actually pretty similar to what we do now, but
# with the addition that the bytes accumulate up to the second mark
# and we try to keep track of our position in the write list here
# (unless libevent is doing that for us already and I just don't see it)
#
# TODO: Burst and Relay/Regular differentiation
# XXX: The inability to send single cells will cause us to block
# on EXTEND cells for low-bandwidth node pairs..
BwRate = Bandwidth Rate in Bytes Per Second
WriteBytes = Bytes per write
Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s)
SecondCounter = 0
GlobalWriteBucket = 0
# New connections are inserted at Head-1 (the 'tail' of this circular list)
# This is not 100% fifo for all node data, but it is the best we can do
# without insane amounts of additional queueing complexity.
WriteConnList = List of Open OR Conns with pending write data > WriteBytes
WriteConnHead = 0
Every Epoch Seconds:
GlobalWriteBucket += BwRate*Epoch
WriteListEnd = WriteConnHead
do
ORCONN = WriteConnList[WriteConnHead]
w = write(ORConn, WriteBytes)
GlobalWriteBucket -= w
WriteConnHead += 1
while GlobalWriteBucket > 0 and WriteConnHead != WriteListEnd
SecondCounter += Epoch
if SecondCounter >= 1:
SecondCounter = 0
GlobalWriteBucket = 0
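As a rough, runnable illustration of Algorithm 1's per-epoch write division (the function name and data shapes are mine, not part of the pseudocode above): each conn with pending data gets an even share of this epoch's budget, and any shortfall from a short write is refunded to the global bucket.

```python
def epoch_write_allocations(bw_rate, epoch, global_bucket, pending):
    """pending maps conn -> queued bytes. Returns ({conn: bytes_written},
    refund_to_global_bucket). Models the even split plus the
    'TCP pushback' refund from Algorithm 1, ignoring burst handling."""
    if not pending:
        return {}, global_bucket
    share = int(bw_rate * epoch + global_bucket) // len(pending)
    writes, refund = {}, 0
    for conn, queued in pending.items():
        w = min(share, queued)      # short write: less queued than our share
        writes[conn] = w
        refund += share - w         # return unused bytes to the pool
    return writes, refund
```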