diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 07c3664..453da0a 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -4,6 +4,7 @@ - [Terminology](./terminology.md) - [Deploying Fuzzytags Securely](./deploying-fuzzytags.md) - [Entangled Tags](./entangled.md) + - [Entangling Strategies](./entangling-strategies.md) - [Simulating fuzzytags](./simulations.md) - [Email EU Core Dataset Simulations](./simulation-eu-core-email.md) - [College IM Dataset Simulations](./simulation-college-im.md) \ No newline at end of file diff --git a/src/entangled.md b/src/entangled.md index de047e5..d99642b 100644 --- a/src/entangled.md +++ b/src/entangled.md @@ -7,6 +7,9 @@ This is done through $\texttt{FlagEntangled}$ - a function that takes in a vect $\texttt{Flag}$ function, as documented in the original paper for each of them (with the same $r and z$) until it finds a $z$ that will generate the same tag for every tagging key. +Here we briefly outline a number of properties of entangled tags. In the next section we will consider how to +apply these strategies in real-world deployments. + ## Multiparty Broadcast Alice wants to send a message to Bob and Carol. She constructs a single tag that will validate against detection keys generated by both of them. diff --git a/src/entangling-strategies.md b/src/entangling-strategies.md new file mode 100644 index 0000000..b021a98 --- /dev/null +++ b/src/entangling-strategies.md @@ -0,0 +1,54 @@ +# Entangling Strategies + +In this section we will document a number of different strategies that applications might consider +when implementing an entangling scheme. Where possible we will provide analysis and simulations to assess privacy +impact of each one. + +## No Entangling + +For completeness, let's consider no entangling. In this case each tag has a single recipient. We have shown through +simulations that fuzzytags, in aggregate, leak information to the server. 
Being able to attribute a fuzzytag to a +receiver with confidence reduces the false positive rate on the derived graph. + +On the other hand, all entangling strategies require access to a global database of tagging keys in order to be +effective - this raises concrete privacy and security concerns in and of itself. + +## Random Entangling + +One of the simplest schemes involves every sender selecting a single random entangled recipient. We have simulated +this strategy against real-world datasets and found that it does little to prevent the server from deriving the underlying +social graph - only marginally increasing the false positive rate of connections on the derived graph. + +To understand why, consider that entangling tags in this way doesn't create a signal intended to corrupt the +derived graph, but instead adds noise to the number of received tags for each detection key. As the number of legitimate +tags received increases, they create a signal that stands above the noise floor of the generated tags. + +## Deniable Entangling + +As outlined in the last section, one of the major properties of entangled tags is deniable sending, i.e. they make it +possible to legitimately send a message to multiple parties. + +One possible deployment approach is to have one or more known popular services using fuzzytags to receive messages, +and for all parties to entangle their messages to both the intended recipient *and* a popular service. + +This approach provides three distinct benefits: + +1. Only the tagging keys of popular services need to be known - these can be public and heavily popularized, discounting most +security and privacy concerns with maintaining a database of keys to entangle to. +2. Every tag in this scheme is individually deniable, increasing the anonymity set of all received tags. +3. 
Such services provide a bootstrapping mechanism for a network based on fuzzytags. + +The one major drawback of this approach centres around the addition of a number of select service providers into the +anonymity model resulting in a pseudo-non-collusion assumption, in addition to raising the spectre of compulsion attacks. + +Both of these drawbacks can be mitigated through diversification of the entanglement pool - and it must be pointed out +that unlike many non-collusion assumptions there are no direct systemic relationships between the adversarial server +providing fuzzytags and services built on top of such a service, i.e. anyone can start providing a service and anyone +can start entangling to a service without these being core features to the anonymity system. + +It must be stated that such an entangling scheme does not remove the risk that aggregate tags will reveal the +underlying social network (as the signal is still there); it simply invokes a deniability argument for the relationship. +This may be sufficient, but is not always going to be so. + +As such, any simulation of this approach will likely be identical to one for not entangling at all - with the false +positive result replaced with a deniability disclaimer. diff --git a/src/simulation-college-im.md b/src/simulation-college-im.md index 8ddeba4..fca7798 100644 --- a/src/simulation-college-im.md +++ b/src/simulation-college-im.md @@ -10,18 +10,24 @@ particular sender) to understand how much information is leaked by fuzzytags wit Pietro Panzarasa, Tore Opsahl, and Kathleen M. Carley. "Patterns and dynamics of users' behavior and interaction: Network analysis of an online community." Journal of the American Society for Information Science and Technology 60.5 (2009): 911-932. +![](./simulations/college-actual.jpeg) + ## Scenario 1 Setup: 20k events (7330 links). False positive rates: [0.007812, 0.5]. No entangling. 
Result: Server can identify ~4.3% of original graph (313 links) with a 12% false positive rate at threshold: 0.0001. +![](./simulations/college-derived.jpeg) + ## Scenario 2 Setup: 20k events (7330 links). False positive rates: [0.007812, 0.5]. Every tag entangled to one random node (as before). Result: Server can identify ~3.95% of original graph (290 links) with a ~15% false positive rate. +![](./simulations/college-derived-entangled.jpeg) + # Discussion A very similar result to our observations on the EU Core email dataset, entangled tags increase the false positive diff --git a/src/simulation-eu-core-email.md b/src/simulation-eu-core-email.md index d89b404..2a28e9f 100644 --- a/src/simulation-eu-core-email.md +++ b/src/simulation-eu-core-email.md @@ -10,18 +10,25 @@ particular sender) to understand how much information is leaked by fuzzytags wit Citation: Ashwin Paranjape, Austin R. Benson, and Jure Leskovec. "Motifs in Temporal Networks." In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017. + +![](./simulations/eu-actual.jpeg) + ## Scenario 1 Setup: 1 month of email events between 1004 nodes, 20k events (5148 links). False positive rates: \[0.007812, 0.5\]. No entangling. Result: An adversarial server can identify ~7% of original graph (393 links) with a 6% false positive rate. Threshold: 0.0001 +![](./simulations/eu-derived.jpeg) + ## Scenario 2 Setup: The same month of emails, 20k events (5148 links) Same false positive rates. Every tag is entangled with 1 random node. Result: Server can identify ~6.6% of original graph with a 6.8% false positive rate. 
+![](./simulations/eu-derived-entangled.jpeg) + # Discussion Entanglement seems to have some impact on the servers ability to relearn the social graph, in particular diff --git a/src/simulations/college-actual.jpeg b/src/simulations/college-actual.jpeg new file mode 100644 index 0000000..649a170 Binary files /dev/null and b/src/simulations/college-actual.jpeg differ diff --git a/src/simulations/college-derived-entangled.jpeg b/src/simulations/college-derived-entangled.jpeg new file mode 100644 index 0000000..67cba33 Binary files /dev/null and b/src/simulations/college-derived-entangled.jpeg differ diff --git a/src/simulations/college-derived.jpeg b/src/simulations/college-derived.jpeg new file mode 100644 index 0000000..7312289 Binary files /dev/null and b/src/simulations/college-derived.jpeg differ diff --git a/src/simulations/eu-actual.jpeg b/src/simulations/eu-actual.jpeg new file mode 100644 index 0000000..ae7fa66 Binary files /dev/null and b/src/simulations/eu-actual.jpeg differ diff --git a/src/simulations/eu-derived-entangled.jpeg b/src/simulations/eu-derived-entangled.jpeg new file mode 100644 index 0000000..6063854 Binary files /dev/null and b/src/simulations/eu-derived-entangled.jpeg differ diff --git a/src/simulations/eu-derived.jpeg b/src/simulations/eu-derived.jpeg new file mode 100644 index 0000000..6586e68 Binary files /dev/null and b/src/simulations/eu-derived.jpeg differ
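
The noise-floor argument in the Random Entangling section added by this patch can be illustrated with a small toy model. The Python sketch below is not the simulator that produced the results quoted in this diff. It assumes a single sender who always writes to the same recipient, entangles every tag to one uniformly random other receiver, and ignores fuzzytag false positives entirely. The names `count_matches` and `summarize`, the receiver count of 50 and the recipient index 7 are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def count_matches(num_receivers, true_recipient, num_messages, seed=0):
    """Count, per receiver, how many of one sender's tags match that receiver.

    Toy model (an assumption, not the simulator behind the quoted results):
    every tag matches its true recipient plus one uniformly random decoy,
    and fuzzytag false positives are ignored.
    """
    rng = random.Random(seed)
    decoys = [r for r in range(num_receivers) if r != true_recipient]
    counts = Counter()
    for _ in range(num_messages):
        counts[true_recipient] += 1      # the legitimate recipient always matches
        counts[rng.choice(decoys)] += 1  # the randomly entangled recipient
    return counts


def summarize(num_messages, num_receivers=50, true_recipient=7):
    """Return (signal, noise_floor) for one sender after num_messages tags."""
    counts = count_matches(num_receivers, true_recipient, num_messages)
    signal = counts[true_recipient]
    # The noise floor is the largest count accumulated by any single decoy.
    noise_floor = max(c for r, c in counts.items() if r != true_recipient)
    return signal, noise_floor


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        signal, noise = summarize(n)
        print(f"messages={n:5d} signal={signal:5d} largest decoy count={noise}")
```

In this model the true recipient's count grows linearly with the number of messages, while the decoy matches are spread across every other receiver, so repeated legitimate contact separates from the noise as traffic accumulates. A fuller model would also include per-key false positive rates and multiple senders.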