fuzzytags-book/src/statistical-attacks.md

# Statistical Attacks

Using some basic binomial probability we can use the false positive rate of reach receiver tag to calculate
the probability of matching on at least X tags given the false positive rate. Using this we can find statistically
unlikely matches e.g. a low-false positive key matching many tags in a given period.

This can be used to find receivers who likely received messages in a given period. This attack works regardless
of how many parties are downloading everything, and is entirely dependent on the receiver choosing $p$ that is
suboptimal for the number of messages they receive.

## Deriving the Social Graph

If it is possible to group tags by sender then we can perform a slightly better attack and ultimately learn the
underlying social graph with fairly low false positive rates (in [simulations](./simulations.md) we can learn 5-10% of
the underlying connections with between 5-12% false positive rates.)

The method is the same as above, but look at the probability that a party would have matched at least X tags _from a
specific sender_ given their false positive rate.

Notably, this latter attack reveals something important about choosing parameters for fuzzytags - _p_ must be chosen
to take into account not just total number of messages, but total number of messages from a given source (or it
must be assumed that the server will never be able to isolate tags from a given sender)