Screenshots and Initial Entangling Strategies Draft

2021-02-17 15:23:06 -08:00 · 4de674c984
parent 538c608a97
commit 4de674c984
11 changed files with 71 additions and 0 deletions
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -4,6 +4,7 @@
 - [Terminology](./terminology.md)
 - [Deploying Fuzzytags Securely](./deploying-fuzzytags.md)
 - [Entangled Tags](./entangled.md)
+    - [Entangling Strategies](./entangling-strategies.md)
 - [Simulating fuzzytags](./simulations.md)
    - [Email EU Core Dataset Simulations](./simulation-eu-core-email.md)
    - [College IM Dataset Simulations](./simulation-college-im.md)
--- a/src/entangled.md
+++ b/src/entangled.md
@@ -7,6 +7,9 @@ This is done through  $\texttt{FlagEntangled}$ - a function that takes in a vect
 $\texttt{Flag}$ function, as documented in the original paper for each of them (with the same $r$ and $z$) until it
 finds a $z$ that will generate the same tag for every tagging key.

+Here we briefly outline a number of properties of entangled tags. In the next section we will consider how to
+apply these properties in real-world deployments.
+
 ## Multiparty Broadcast

 Alice wants to send a message to Bob and Carol. She constructs a single tag that will validate against detection keys generated by both of them.
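The search described above, retrying $z$ until every tagging key yields the same tag, can be illustrated with a toy model. This is only a sketch: `toy_tag_bits` is a hypothetical SHA-256 stand-in for the per-key output of the real $\texttt{Flag}$ computation, and the names `toy_flag_entangled`, `gamma` and `max_tries` are assumptions made for illustration, not part of the fuzzytags API.

```python
import hashlib


def toy_tag_bits(tagging_key: bytes, r: int, z: int, gamma: int) -> tuple:
    """Hash-based stand-in for the gamma bits a tag encodes for one key.

    ASSUMPTION: the real scheme derives these bits from group operations;
    a hash is used here purely to make the search loop runnable.
    """
    material = tagging_key + r.to_bytes(8, "big") + z.to_bytes(8, "big")
    digest = hashlib.sha256(material).digest()
    return tuple((digest[i // 8] >> (i % 8)) & 1 for i in range(gamma))


def toy_flag_entangled(tagging_keys, r: int, gamma: int, max_tries: int = 1 << 20):
    """Try successive z values (same r for every key) until all keys agree."""
    for z in range(max_tries):
        candidates = {toy_tag_bits(key, r, z, gamma) for key in tagging_keys}
        if len(candidates) == 1:
            # Every tagging key produced the same bits: the tag is entangled.
            return z, candidates.pop()
    raise RuntimeError("no entangling z found within max_tries")


# Alice entangles one tag to both Bob and Carol (placeholder key bytes).
z, bits = toy_flag_entangled([b"bob-key", b"carol-key"], r=42, gamma=8)
print(z, bits)
```

In this toy model each candidate $z$ succeeds with probability $2^{-\gamma}$ for two keys, so the search cost grows quickly with the number of bits and the number of entangled keys.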
--- a/src/entangling-strategies.md
+++ b/src/entangling-strategies.md
@@ -0,0 +1,54 @@
+# Entangling Strategies
+
+In this section we will document a number of different strategies that applications might consider
+when implementing an entangling scheme. Where possible we will provide analysis and simulations to assess the privacy
+impact of each one.
+
+## No Entangling
+
+For completeness, let's consider no entangling. In this case each tag has a single recipient. We have shown through
+simulations that fuzzytags, in aggregate, leak information to the server. Being able to attribute a fuzzytag to a 
+receiver with confidence reduces the false positive rate on the derived graph. 
+
+On the other hand, all entangling strategies require access to a global database of tagging keys in order to be
+effective - this raises concrete privacy and security concerns in and of itself.
+
+## Random Entangling
+
+One of the simplest schemes involves every sender selecting a single random entangled recipient. We have simulated
+this strategy against real-world datasets and found that it does little to prevent the server from deriving the underlying
+social graph - only marginally increasing the false positive rate of connections on the derived graph.
+
+To understand why, consider that entangling tags in this way doesn't create a signal intended to corrupt the 
+derived graph, but instead adds noise to the number of received tags for each detection key. As the number of legitimate
+tags received increases, they create a signal that stands above the noise floor of the generated tags.
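The noise-floor argument above can be sketched with a small simulation. This is a minimal illustration under stated assumptions, not the simulation used for the results in this book: one sender always writes to receiver 0, each tag is entangled to one uniformly random other receiver, and every remaining detection key matches with a single hypothetical false positive rate `fp_rate`. The function names are invented for this example.

```python
import random
from collections import Counter


def simulate_counts(n_receivers, n_messages, fp_rate, seed=0):
    """Count how many of one sender's tags match each detection key.

    ASSUMPTIONS: receiver 0 is the real recipient of every message, the
    entangled decoy is chosen uniformly at random, and all other keys share
    one false positive rate.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_messages):
        decoy = rng.randrange(1, n_receivers)
        for receiver in range(n_receivers):
            if receiver == 0 or receiver == decoy or rng.random() < fp_rate:
                counts[receiver] += 1
    return counts


def signal_and_noise(counts):
    """Return (matches for the real recipient, highest count among the rest)."""
    noise = max(count for receiver, count in counts.items() if receiver != 0)
    return counts[0], noise


for n_messages in (1, 10, 100, 1000):
    signal, noise = signal_and_noise(simulate_counts(50, n_messages, 0.05))
    print(n_messages, signal, noise)
```

With a single message the real recipient and the decoy are indistinguishable; as the message count grows, the real recipient's count rises linearly while the random decoys are spread across the whole population.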
+
+## Deniable Entangling
+
+As outlined in the last section, one of the major properties of entangled tags is deniable sending, i.e. they make it
+possible to legitimately send a message to multiple parties.
+
+One possible deployment approach is to have one or more well-known, popular services using fuzzytags to receive messages,
+and for all parties to entangle their messages to both the intended recipient *and* a popular service.
+
+This approach provides three distinct benefits:
+
+1. Only the tagging keys of popular services need to be known - these can be public and heavily popularized, discounting most
+security and privacy concerns with maintaining a database of keys to entangle to.
+2. Every tag in this scheme is individually deniable, increasing the anonymity set of all received tags.
+3. Such services provide a bootstrapping mechanism for a network based on fuzzytags.
+
+The one major drawback of this approach centres around the addition of a number of select service providers into the
+anonymity model, resulting in a pseudo-non-collusion assumption, in addition to raising the spectre of compulsion attacks.
+
+Both of these drawbacks can be mitigated through diversification of the entanglement pool - and it must be pointed out
+that, unlike many non-collusion assumptions, there is no direct systemic relationship between the adversarial server
+providing fuzzytags and services built on top of such a service, i.e. anyone can start providing a service and anyone
+can start entangling to a service without these being core features of the anonymity system.
+
+It must be stated that such an entangling scheme does not remove the risk that aggregate tags will reveal the
+underlying social network (as the signal is still there); it simply invokes a deniability argument for the relationship.
+This may be sufficient, but is not always going to be so.
+
+As such, any simulation of this approach will likely be identical to one with no entangling at all - with the false
+positive result replaced with a deniability disclaimer.
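The Deniable Entangling approach above amounts to a simple selection rule on the sender's side, sketched below. The function name `entangle_targets` and the string placeholders for tagging keys are hypothetical; a real deployment would pass actual tagging keys to its entangling routine.

```python
import random


def entangle_targets(recipient_key, service_keys, rng):
    """Pick the keys for one tag: the intended recipient plus one popular service.

    ASSUMPTION: the service is chosen uniformly from a public pool; a larger,
    more diverse pool spreads tags across more service providers.
    """
    if not service_keys:
        raise ValueError("at least one popular service key is required")
    return (recipient_key, rng.choice(service_keys))


rng = random.Random(1)
targets = entangle_targets("bob-tagging-key", ["service-a", "service-b", "service-c"], rng)
print(targets)
```

Only the public service keys need to be distributed widely; the sender still needs the intended recipient's tagging key, as with any tag.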
--- a/src/simulation-college-im.md
+++ b/src/simulation-college-im.md
@@ -10,18 +10,24 @@ particular sender) to understand how much information is leaked by fuzzytags wit

 Pietro Panzarasa, Tore Opsahl, and Kathleen M. Carley. "Patterns and dynamics of users' behavior and interaction: Network analysis of an online community." Journal of the American Society for Information Science and Technology 60.5 (2009): 911-932.

+![](./simulations/college-actual.jpeg)
+
 ## Scenario 1

 Setup:  20k events (7330 links). False positive rates: [0.007812, 0.5]. No entangling.

 Result: Server can identify ~4.3% of original graph (313 links) with a 12% false positive rate at threshold: 0.0001.

+![](./simulations/college-derived.jpeg)
+
 ## Scenario 2

 Setup:  20k events (7330 links). False positive rates: [0.007812, 0.5]. Every tag entangled to one random node (as before).

 Result: Server can identify ~3.95% of original graph (290 links) with a ~15% false positive rate.
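As a quick check on the arithmetic in the two scenarios above, the fraction of the original graph identified is simply the number of recovered links over the total number of links. The helper name `fraction_identified` is made up for this sketch.

```python
def fraction_identified(identified_links: int, total_links: int) -> float:
    """Share of the original graph's links that the server recovered."""
    return identified_links / total_links


# Figures taken from Scenario 1 and Scenario 2 above (7330 links in total).
print(f"{fraction_identified(313, 7330):.2%}")  # 4.27%
print(f"{fraction_identified(290, 7330):.2%}")  # 3.96%
```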

+![](./simulations/college-derived-entangled.jpeg)
+
 # Discussion

 A very similar result to our observations on the EU Core email dataset, entangled tags increase the false positive
--- a/src/simulation-eu-core-email.md
+++ b/src/simulation-eu-core-email.md
@@ -10,18 +10,25 @@ particular sender) to understand how much information is leaked by fuzzytags wit

 Citation: Ashwin Paranjape, Austin R. Benson, and Jure Leskovec. "Motifs in Temporal Networks." In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017.

+
+![](./simulations/eu-actual.jpeg)
+
 ## Scenario 1

 Setup: 1 month of email events between 1004 nodes, 20k events (5148 links). False positive rates: \[0.007812, 0.5\]. No entangling.

 Result: An adversarial server can identify ~7% of original graph (393 links) with a 6% false positive rate. Threshold: 0.0001

+![](./simulations/eu-derived.jpeg)
+
 ## Scenario 2

 Setup: The same month of emails, 20k events (5148 links). Same false positive rates. Every tag is entangled with 1 random node.

 Result: Server can identify ~6.6% of original graph with a 6.8% false positive rate.

+![](./simulations/eu-derived-entangled.jpeg)
+
 # Discussion 

 Entanglement seems to have some impact on the server's ability to relearn the social graph, in particular
--- a/src/simulations/college-actual.jpeg
+++ b/src/simulations/college-actual.jpeg
--- a/src/simulations/college-derived-entangled.jpeg
+++ b/src/simulations/college-derived-entangled.jpeg
--- a/src/simulations/college-derived.jpeg
+++ b/src/simulations/college-derived.jpeg
--- a/src/simulations/eu-actual.jpeg
+++ b/src/simulations/eu-actual.jpeg
--- a/src/simulations/eu-derived-entangled.jpeg
+++ b/src/simulations/eu-derived-entangled.jpeg
--- a/src/simulations/eu-derived.jpeg
+++ b/src/simulations/eu-derived.jpeg