In this section we document simulations performed on the College Msg Core dataset (details below). In particular, we assess the worst-case scenario of a server with access to a sender oracle (i.e., one able to attribute tags to a particular sender) to understand how much information fuzzytags leak without [appropriate deployment mitigations](./deploying-fuzzytags.md).
# College IM Dataset Simulations
| Statistic | Value |
|---|---|
| Nodes | 1899 |
| Temporal edges | 59835 |
| Time span | 193 days |

Pietro Panzarasa, Tore Opsahl, and Kathleen M. Carley. "Patterns and dynamics of users' behavior and interaction: Network analysis of an online community." Journal of the American Society for Information Science and Technology 60.5 (2009): 911-932.
![](./simulations/college-actual.jpeg)
## Scenario 1
Setup: 20k events (7330 links). False positive rates: [0.007812, 0.5]. No entangling.

Result: The server can identify ~4.3% of the original graph (313 links) with a 12% false positive rate at a threshold of 0.0001.
![](./simulations/college-derived.jpeg)
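
The text does not define the threshold used in this result. One plausible reading, offered here only as an assumption, is that the server flags a sender-to-receiver link when the number of matching tags would be very unlikely to arise from false positives alone. The minimal sketch below illustrates that idea with a binomial tail test; the function names, the example counts, and the interpretation of the threshold are all hypothetical and are not taken from the simulation code.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """P[X >= k] for X ~ Binomial(n, p).

    Here n is the number of tags attributed to one sender, k is how many of
    them matched a given receiver, and p is that receiver's false positive rate.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def looks_like_real_link(tags_sent: int, matches: int, fp_rate: float,
                         threshold: float = 0.0001) -> bool:
    """Flag the pair when chance alone is an unlikely explanation.

    ASSUMPTION: the threshold is treated as a cutoff on the tail probability.
    """
    return binomial_tail(tags_sent, matches, fp_rate) < threshold


# Hypothetical sender with 40 attributed tags.
# Low false positive rate (0.007812): a handful of matches already stands out.
print(looks_like_real_link(40, 1, 0.007812))
print(looks_like_real_link(40, 5, 0.007812))
# High false positive rate (0.5): even 30 matches out of 40 can be explained by chance.
print(looks_like_real_link(40, 30, 0.5))
```

Under this reading, receivers with higher false positive rates are harder to link, because many more matches are needed before chance becomes an implausible explanation.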
## Scenario 2
Setup: 20k events (7330 links). False positive rates: [0.007812, 0.5]. Every tag entangled to one random node (as before).

Result: The server can identify ~3.95% of the original graph (290 links) with a ~15% false positive rate.
![](./simulations/college-derived-entangled.jpeg)
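
As a quick check on the reported percentages, the fraction of the original graph recovered in each scenario can be recomputed from the link counts given in the setup and result lines. The sketch below uses only those figures; the variable and function names are illustrative.

```python
TOTAL_LINKS = 7330  # links in the 20k-event sample, from the setup lines

# (links identified by the server, reported false positive rate)
scenarios = {
    "Scenario 1 (no entangling)": (313, 0.12),
    "Scenario 2 (entangled)": (290, 0.15),
}


def recovered_fraction(identified: int, total: int = TOTAL_LINKS) -> float:
    """Share of the original graph's links that the server identified."""
    return identified / total


for name, (identified, fp_rate) in scenarios.items():
    print(f"{name}: {recovered_fraction(identified):.2%} recovered, "
          f"reported false positive rate {fp_rate:.0%}")
```

Entangling lowers the recovered share only slightly (313 versus 290 links out of 7330), which is consistent with the approximate figures quoted for the two scenarios.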
# Discussion
This result is very similar to our observations on the EU Core email dataset: entangled tags increase the false positive
rate, but non-naive entangling strategies are required to push the false positive rate of the derived graph
to the point where the graph is no longer useful to an adversary.