
This section documents simulations performed on the Email EU Core dataset (details below). In particular,
we assess the worst-case scenario of a server with access to a sender oracle (i.e. one able to attribute tags to a
particular sender) to understand how much information fuzzytags leaks without [appropriate deployment mitigations](./deploying-fuzzytags.md).

# Email EU Core Dataset Simulations

    Nodes:              1004
    Temporal Edges:     332334
    Time span:          803 days

Citation: Ashwin Paranjape, Austin R. Benson, and Jure Leskovec. "Motifs in Temporal Networks." In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017.


![](./simulations/eu-actual.jpeg)

## Scenario 1

Setup: 1 month of email events between 1004 nodes, 20k events (5148 links). False positive rates: \[0.007812, 0.5\]. No entangling.

Result: An adversarial server can identify ~7% of the original graph (393 links) with a 6% false positive rate. Threshold: 0.0001.
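
The following is a minimal sketch of how a server with a sender oracle might derive links from observed matches. It is not the code used for the simulation: it assumes the threshold acts as a p-value cut-off on a one-sided binomial test, and the names `binomial_tail` and `derive_links` are hypothetical.

```python
import random
from math import comb


def binomial_tail(k, n, p):
    """Return P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def derive_links(events, fp_rate, threshold=0.0001):
    """Guess (sender, receiver) links from tag matches.

    events:  list of (sender, receiver) pairs, one per tagged message.
    fp_rate: dict mapping each node to its false positive rate.
    Assumption: a pair is reported when the number of matches is too
    unlikely to be explained by false positives alone.
    """
    sent = {}     # sender -> number of tags attributed to it by the oracle
    matches = {}  # (sender, node) -> number of those tags that matched node
    for sender, receiver in events:
        sent[sender] = sent.get(sender, 0) + 1
        for node, p in fp_rate.items():
            # The true receiver always matches; every other node matches
            # with probability equal to its own false positive rate.
            if node == receiver or random.random() < p:
                matches[(sender, node)] = matches.get((sender, node), 0) + 1
    return {
        (sender, node)
        for (sender, node), k in matches.items()
        if binomial_tail(k, sent[sender], fp_rate[node]) < threshold
    }


# Toy example: "a" sends 10 messages to "b".
events = [("a", "b")] * 10
fp_rate = {"a": 0.5, "b": 2**-7, "c": 0.5}
derived = derive_links(events, fp_rate)
```

In this toy example only `("a", "b")` can be reported: with 10 messages, a node using a false positive rate of 0.5 can never fall below the 0.0001 cut-off, because even 10 matches out of 10 have probability 0.5^10 (about 0.001) under chance alone.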

![](./simulations/eu-derived.jpeg)

## Scenario 2

Setup: The same month of emails, 20k events (5148 links). Same false positive rates. Every tag is entangled with 1 random node.
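
As a rough illustration of what entangling changes from the server's point of view, the sketch below models which nodes a single tag matches. It assumes an entangled tag matches the intended receiver and each randomly chosen extra node with certainty, and that the extra node is drawn uniformly from all other nodes. The function name `matching_nodes` and its parameters are hypothetical and not part of the fuzzytags API.

```python
import random


def matching_nodes(receiver, fp_rate, entangle_with=1, rng=random):
    """Return the set of nodes whose detection key matches one tag.

    fp_rate maps every node to its false positive rate.
    Assumption: entangled nodes match with certainty, like the receiver.
    """
    others = [node for node in fp_rate if node != receiver]
    # Pick the nodes the tag is deliberately entangled with.
    decoys = set(rng.sample(others, min(entangle_with, len(others))))
    matched = {receiver} | decoys
    # Every remaining node still matches by chance at its own rate.
    for node in others:
        if node not in decoys and rng.random() < fp_rate[node]:
            matched.add(node)
    return matched
```

Under these assumptions every tag carries at least one guaranteed extra match, so a server counting matches per sender sees additional candidate receivers that are indistinguishable from the real one for that tag.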

Result: An adversarial server can identify ~6.6% of the original graph with a 6.8% false positive rate.
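
Both scenarios report two figures: the share of the original graph that the server recovers, and a false positive rate. A small sketch of one way to compute them from sets of links follows. It assumes the false positive rate is the fraction of derived links that do not exist in the original graph, which the text does not state explicitly; the function name `score` is hypothetical.

```python
def score(true_links, derived_links):
    """Compare a derived graph against the original graph.

    Both arguments are sets of (sender, receiver) pairs.
    Returns (recovered, false_positive_rate), where
      recovered           = correct derived links / links in the original
      false_positive_rate = incorrect derived links / all derived links
    The second definition is an assumption made for this sketch.
    """
    if not true_links or not derived_links:
        return 0.0, 0.0
    correct = derived_links & true_links
    recovered = len(correct) / len(true_links)
    false_positive_rate = len(derived_links - true_links) / len(derived_links)
    return recovered, false_positive_rate


# Toy example: 4 real links, 2 derived links, one of them wrong.
original = {(1, 2), (2, 3), (3, 4), (4, 5)}
derived = {(1, 2), (9, 9)}
recovered, false_positive_rate = score(original, derived)  # (0.25, 0.5)
```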

![](./simulations/eu-derived-entangled.jpeg)

# Discussion

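Entanglement seems to have some impact on the server's ability to relearn the social graph. In particular,
it increases the false positive rate of the derived graph (6.8% versus 6%) and slightly reduces the share of the
graph recovered (~6.6% versus ~7%). However, the impact is small in the observed simulation and not enough, on its
own, to prevent the server from relearning part of the graph.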