This section documents simulations performed on the College Msg Core dataset (details below). In particular, we assess the worst-case scenario of a server with access to a sender oracle (i.e. one able to attribute tags to a particular sender) to understand how much information fuzzytags leak without [appropriate deployment mitigations](./deploying-fuzzytags.md).

# College IM Dataset Simulations

- Nodes: 1899
- Temporal edges: 59835
- Time span: 193 days

Pietro Panzarasa, Tore Opsahl, and Kathleen M. Carley. "Patterns and dynamics of users' behavior and interaction: Network analysis of an online community." Journal of the American Society for Information Science and Technology 60.5 (2009): 911-932.

![](./simulations/college-actual.jpeg)

## Scenario 1

Setup: 20k events (7330 links). False positive rates: [0.007812, 0.5]. No entangling.

Result: The server can identify ~4.3% of the original graph (313 links) with a 12% false positive rate at threshold 0.0001.

![](./simulations/college-derived.jpeg)

## Scenario 2

Setup: 20k events (7330 links). False positive rates: [0.007812, 0.5]. Every tag entangled to one random node (as before).

Result: The server can identify ~3.95% of the original graph (290 links) with a ~15% false positive rate.

![](./simulations/college-derived-entangled.jpeg)

# Discussion

This result is very similar to our observations on the EU Core email dataset. Entangled tags increase the false positive rate of the derived graph, but only slightly: it would take non-naive entangling strategies to push that rate to the point where the derived graph is no longer useful to an adversary.
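As a rough check on the figures reported in the two scenarios, the sketch below recomputes the recovered share of the graph from the link counts given above and shows one way a derived graph could be scored against the actual one. The function `score_derived_graph` and the toy link sets are hypothetical and are not part of the simulation code. The sketch also assumes that links are directed (sender, receiver) pairs and that "false positive rate" in the results means the share of derived links that do not exist in the original graph; neither assumption is confirmed by the text.

```python
def score_derived_graph(true_links, derived_links):
    """Score a server-derived link set against the ground-truth link set.

    Hypothetical helper. Links are assumed to be (sender, receiver) tuples.
    Returns (recovered, false_share):
      recovered   = share of true links that the server derived
      false_share = share of derived links that are not true links
                    (an assumed reading of "false positive rate")
    """
    true_links = set(true_links)
    derived_links = set(derived_links)
    correct = derived_links & true_links
    recovered = len(correct) / len(true_links) if true_links else 0.0
    false_share = (
        len(derived_links - true_links) / len(derived_links)
        if derived_links
        else 0.0
    )
    return recovered, false_share


# Link counts taken from the scenario descriptions.
TOTAL_LINKS = 7330
scenario_1_recovered = 313 / TOTAL_LINKS  # no entangling
scenario_2_recovered = 290 / TOTAL_LINKS  # every tag entangled to one random node

print(f"Scenario 1 recovered share: {scenario_1_recovered:.2%}")
print(f"Scenario 2 recovered share: {scenario_2_recovered:.2%}")

# Toy example (made-up links, purely illustrative).
toy_true = {(1, 2), (2, 3), (3, 4), (4, 5)}
toy_derived = {(1, 2), (2, 3), (9, 9)}
toy_recovered, toy_false_share = score_derived_graph(toy_true, toy_derived)
print(f"Toy recovered share: {toy_recovered:.2%}, false share: {toy_false_share:.2%}")
```

Under this reading, entangling lowers the recovered share only slightly (313 versus 290 of 7330 links), which matches the observation in the discussion that naive entangling alone does little to degrade the derived graph.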