Updating Attacks

Sarah Jamie Lewis 2021-02-19 10:30:42 -08:00
parent dca0792928
commit c5a96de77d
5 changed files with 50 additions and 29 deletions


@ -2,6 +2,9 @@
- [Introduction](introduction.md)
- [Terminology](./terminology.md)
- [Risk Model](./risk-model.md)
- [Intersection Attacks](./intersection-attacks.md)
- [Statistical Attacks](./statistical-attacks.md)
- [Deploying Fuzzytags Securely](./deploying-fuzzytags.md)
- [Entangled Tags](./entangled.md)
- [Entangling Strategies](./entangling-strategies.md)


@ -22,35 +22,6 @@ i.e. for low false positive rates and high message volume for a specific receive
It *also* removes those messages from the pool of messages that an adversarial server needs to consider for other receivers,
effectively reducing the anonymity set for everyone else.
Which brings us onto:
### Differential Attacks
Any kind of differential attack breaks this scheme, even for a small number of messages: if you learn (through
any means) that a specific set of messages are all likely for one party, you can diff them against all other parties' keys and
very quickly isolate the intended recipient. In simulations of 100-1000 parties it can take as few as 3 messages, even
with everyone selecting fairly high false positive rates.
The corollary is that under differential attacks your anonymity set is basically the number of users
who download all messages, since you can't diff them. This has an interesting side effect: the more parties who
download everything, the more the system can safely tolerate parties with small false-positive rates.
To what extent you can actually account for this in your application is an open question.
### Statistical Attacks
Using some basic binomial probability we can use the false positive rate of each receiver to calculate
the probability of that receiver matching at least X tags purely by chance. Using this we can find statistically
unlikely matches, e.g. a low-false-positive key matching many tags in a given period.
This can be used to find receivers who likely received messages in a given period.
If it is possible to group tags by sender then we can perform a slightly better attack and ultimately learn the
underlying social graph with fairly low false positive rates (in simulations we can learn 5-10% of the underlying
connections with false positive rates between 5% and 12%).
For more information on statistical attacks please check out our [fuzzytags simulator](https://git.openprivacy.ca/openprivacy/fuzzytags-sim).
### Should Senders use an anonymous communication network?
If statistical & differential attacks are likely e.g. few parties download everything and


@ -0,0 +1,14 @@
# Intersection Attacks
One of the most basic attack primitives for any kind of false-positive-based scheme is an intersection attack, where
the set of peers that match one tag is intersected with the set of peers that match a different tag.
Any kind of intersection attack breaks this scheme, even for a small number of messages: if you learn (through
any means) that a specific set of messages are all likely for a single party, you can diff them against all other parties' keys and
very quickly isolate the intended recipient. In simulations of 100-1000 parties it can take as few as 3 messages, even
with everyone selecting fairly high false positive rates.
The corollary is that under intersection attacks your anonymity set is the number of users
who download all messages. This has an interesting side effect: the more parties who download everything,
the more the system can safely tolerate parties with small false-positive rates.
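
As a rough illustration of why so few messages are needed, the following is a minimal sketch and not the fuzzytags simulator itself. It assumes every party uses the same false positive rate `p`, that no party downloads everything, and that the server already knows which tags belong together. The function name `intersection_attack` and its parameters are hypothetical.

```python
import random


def intersection_attack(n_parties, p, n_messages, seed=0):
    """Return the parties whose keys match every tag addressed to party 0.

    Assumption: a non-recipient key matches any given tag independently
    with probability p, while the true recipient always matches.
    """
    rng = random.Random(seed)
    target = 0
    candidates = set(range(n_parties))
    for _ in range(n_messages):
        # Set of keys that the server sees matching this tag.
        matched = {
            party
            for party in range(n_parties)
            if party == target or rng.random() < p
        }
        # Intersect with the candidates that survived earlier tags.
        candidates &= matched
    return candidates


remaining = intersection_attack(n_parties=1000, p=0.125, n_messages=3)
print(len(remaining), "candidate(s) left after 3 messages")
```

Under these assumptions each additional tag shrinks the expected number of false candidates by a factor of `p`, so roughly `(n_parties - 1) * p ** n_messages` non-recipients are expected to survive.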

src/risk-model.md Normal file

@ -0,0 +1,11 @@
# Risk Model
In this section we will document the risk model and attacks that will likely impact practical deployments
of fuzzytags.
We assume a set of $n$ parties sending tags and messages through an untrusted server.
For some of the analysis in this notebook we will assume that the server has knowledge of the sender of each tag.
While this is not desirable in real-world deployments, being able to match tags to senders does allow us to derive
bounds on worst-case metadata leakage.


@ -0,0 +1,22 @@
# Statistical Attacks
Using some basic binomial probability we can use the false positive rate of each receiver to calculate
the probability of that receiver matching at least X tags purely by chance. Using this we can find statistically
unlikely matches, e.g. a low-false-positive key matching many tags in a given period.
This can be used to find receivers who likely received messages in a given period. This attack works regardless
of how many parties are downloading everything, and depends entirely on the receiver choosing a $p$ that is
suboptimal for the number of messages they receive.
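
The binomial calculation described above can be sketched as follows. This is an illustrative helper, not code from the fuzzytags simulator; the name `prob_at_least` is hypothetical, and it assumes each tag matches a non-recipient key independently with probability `p`.

```python
from math import comb


def prob_at_least(x, n, p):
    """P[Binomial(n, p) >= x].

    The chance that a key with false positive rate p matches at least
    x out of n tags when none of them were addressed to it.
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


# A key with p = 2^-10 matching 10 of 1000 tags in a period is very
# unlikely to be explained by false positives alone.
chance = prob_at_least(10, 1000, 2**-10)
print(chance)
```

A small value here is what makes a match "statistically unlikely": the server can reasonably conclude that at least some of those tags were genuinely addressed to that key.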
## Deriving the Social Graph
If it is possible to group tags by sender then we can perform a slightly better attack and ultimately learn the
underlying social graph with fairly low false positive rates (in [simulations](./simulations.md) we can learn 5-10% of
the underlying connections with false positive rates between 5% and 12%).
The method is the same as above, but we look at the probability that a party would have matched at least X tags _from a
specific sender_ given their false positive rate.
Notably, this latter attack reveals something important about choosing parameters for fuzzytags: _p_ must be chosen
to take into account not just the total number of messages, but the total number of messages from a given source (or it
must be assumed that the server will never be able to isolate tags from a given sender).
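
The per-sender variant can be sketched in the same way. This is a minimal illustration under assumptions that are not taken from the simulator: the server has already grouped tags by sender, it knows each receiver's false positive rate, and it flags a sender/receiver pair when the binomial tail probability falls below an arbitrary `threshold`. The names `likely_edges`, `tail`, `tags_per_sender`, `match_counts` and `fp_rates` are all hypothetical.

```python
from math import comb


def tail(x, n, p):
    """P[Binomial(n, p) >= x]."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def likely_edges(tags_per_sender, match_counts, fp_rates, threshold=0.001):
    """Guess sender -> receiver connections from per-sender match counts.

    tags_per_sender: {sender: number of tags that sender submitted}
    match_counts:    {(sender, receiver): how many of the sender's tags
                      the receiver's key matched}
    fp_rates:        {receiver: false positive rate p}
    """
    edges = []
    for (sender, receiver), matched in match_counts.items():
        n = tags_per_sender[sender]
        # Flag the pair if this many matches is implausible by chance.
        if tail(matched, n, fp_rates[receiver]) < threshold:
            edges.append((sender, receiver))
    return edges


edges = likely_edges(
    tags_per_sender={"alice": 20},
    match_counts={("alice", "bob"): 10, ("alice", "carol"): 3},
    fp_rates={"bob": 0.125, "carol": 0.125},
)
print(edges)
```

In this toy input, 3 matches out of 20 at `p = 0.125` is consistent with chance, while 10 out of 20 is not, so only the second pair is reported. The choice of `threshold` trades the fraction of connections recovered against the rate of wrongly inferred connections.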