## Deploying FuzzyTags Securely

The privacy properties provided by this system depend heavily on the choice of false positive rate _p_. The following
sections cover a number of considerations you should take into account when integrating fuzzytags into a larger
privacy-preserving application.

### How bad is it to let people select their own false-positive rates?

The short answer is "it depends".

The longer answer:

When different parties have different false positive rates, the server can calculate the skew between a party's ideal
false positive rate and its observed false positive rate.

That skew leaks information, especially under certain message distributions. Specifically, it exposes parties
who receive a larger proportion of system messages than their ideal false positive rate.

That is, for a receiver with a low false positive rate and a high message volume, an adversarial server
can calculate a skew that leaks the recipient of individual messages, breaking privacy for that receiver.

It *also* removes those messages from the pool of messages that an adversarial server needs to consider for other
receivers, effectively reducing the anonymity set for everyone else.

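
To make the skew concrete, here is a minimal Python sketch. It is not part of fuzzytags. It assumes a simplified
model in which the server knows a party's ideal rate _p_ and can count how many of the system's tags match that party.
The function names and the binomial baseline are illustrative assumptions.

```python
import math


def expected_matches(p: float, true_msgs: int, total_msgs: int) -> float:
    """Expected number of tags that match a party: every message truly
    addressed to them, plus a fraction p of all other messages."""
    return true_msgs + p * (total_msgs - true_msgs)


def skew(p: float, true_msgs: int, total_msgs: int) -> float:
    """Expected gap between the observed match rate and the ideal rate p."""
    return expected_matches(p, true_msgs, total_msgs) / total_msgs - p


def skew_z_score(p: float, true_msgs: int, total_msgs: int) -> float:
    """Expected excess matches, measured in standard deviations of a
    baseline where the party receives nothing and every match is a false
    positive (binomial with parameters total_msgs and p)."""
    baseline_std = math.sqrt(total_msgs * p * (1.0 - p))
    excess = expected_matches(p, true_msgs, total_msgs) - p * total_msgs
    return excess / baseline_std


if __name__ == "__main__":
    # Hypothetical receiver who truly gets 200 of 10,000 messages.
    for p in (2 ** -8, 2 ** -4, 0.5):
        print(p, skew(p, 200, 10_000), skew_z_score(p, 200, 10_000))
```

In this model the hypothetical receiver sits roughly 32 standard deviations above the baseline at _p_ = 2^-8, but only
2 standard deviations above it at _p_ = 1/2. That is consistent with the observation that low false positive rates
combined with high message volume are the dangerous case.
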
### Should senders use an anonymous communication network?

If statistical and differential attacks are likely (e.g. few parties download everything, multiple messages are
expected to originate from a sender to a receiver, or there is other information that might otherwise link a set of
messages to a sender or receiver), then you may want to consider how to remove that context.

One potential way of removing context is to have senders send their messages to the server through some kind of
anonymous communication network, e.g. a mixnet or Tor.

Be warned: this may not eliminate all the context!

### How bad is it to select a poor choice of _p_?

Consider a _Pareto distribution_ where most users receive only a few messages and a small subset of users
receive a large number of messages. In that setting, it seems that increasing the number of parties is
generally more important to the overall anonymity of the system than any individual selection of _p_.

Below a certain number of parties, trivial breaks (i.e. tags that match only a single party) are a bigger concern.

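
As a rough illustration of why the number of parties matters for trivial breaks, the sketch below estimates how often
a tag matches only its intended recipient. It assumes every other party uses the same false positive rate _p_ and that
matches are independent. Both helpers are hypothetical and are not fuzzytags functions.

```python
def trivial_break_probability(p: float, num_parties: int) -> float:
    """Probability that a tag matches none of the other num_parties - 1
    parties, leaving the intended recipient as the only match."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if num_parties < 1:
        raise ValueError("num_parties must be at least 1")
    return (1.0 - p) ** (num_parties - 1)


def expected_trivial_breaks(p: float, num_parties: int, num_messages: int) -> float:
    """Expected number of trivially broken tags among num_messages tags."""
    return num_messages * trivial_break_probability(p, num_parties)


if __name__ == "__main__":
    # Same rate, growing population: the probability falls off quickly.
    for n in (10, 100, 1000):
        print(n, trivial_break_probability(2 ** -3, n))
```

Under these assumptions, with _p_ = 2^-3 the probability of a trivial break is about 0.3 for 10 parties and about
2 in a million for 100 parties, which is why small deployments are the ones most exposed to this failure.
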
Assuming we have a large number of parties (_N_), the following heuristic emerges:

* Parties who only expect to receive a small number of messages can safely choose smaller false positive rates, down
to a threshold _θ_, where _θ > 2^-N_. The lower the value of _θ_, the greater the possibility of random trivial breaks
for the party.
* Parties who expect a large number of messages should choose to receive **all** messages for 2 reasons:
    1) Even high false positive rates for power users result in information leaks to the server (due to the large
       skew), i.e. a server can trivially learn which users are power users.
    2) By choosing to receive all messages, power users don't sacrifice much in terms of bandwidth, but will provide
       cover for parties who receive a small number of messages and who want a lower false-positive rate.

(We consider a Pareto distribution here because we expect many applications to have parties that can be
modelled as such, especially over short time horizons.)

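
The bandwidth argument for power users can be sketched numerically. The snippet below assumes a party truly receives
a fraction `share` of all messages and matches every other message with probability _p_. Both helper names are
hypothetical, and the model ignores everything except message counts.

```python
def download_fraction(p: float, share: float) -> float:
    """Expected fraction of all messages a party downloads: its true share
    plus false positives on the remaining traffic."""
    return share + p * (1.0 - share)


def cost_of_receiving_everything(p: float, share: float) -> float:
    """Factor by which bandwidth grows when the party switches from
    rate p to downloading every message."""
    current = download_fraction(p, share)
    if current <= 0.0:
        raise ValueError("party downloads nothing at this rate")
    return 1.0 / current


if __name__ == "__main__":
    # Hypothetical power user vs. hypothetical light user.
    print(cost_of_receiving_everything(0.5, 0.4))
    print(cost_of_receiving_everything(2 ** -7, 0.001))
```

In this model a power user who receives 40% of traffic at _p_ = 1/2 already downloads about 70% of all messages, so
receiving everything costs a factor of roughly 1.4. A light user with a 0.1% share at _p_ = 2^-7 would pay a factor of
more than 100 for the same change.
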