Screenshots and Initial Entangling Strategies Draft

Sarah Jamie Lewis 2021-02-17 15:23:06 -08:00
parent 538c608a97
commit 4de674c984
11 changed files with 71 additions and 0 deletions


@ -4,6 +4,7 @@
- [Terminology](./terminology.md)
- [Deploying Fuzzytags Securely](./deploying-fuzzytags.md)
- [Entangled Tags](./entangled.md)
- [Entangling Strategies](./entangling-strategies.md)
- [Simulating fuzzytags](./simulations.md)
- [Email EU Core Dataset Simulations](./simulation-eu-core-email.md)
- [College IM Dataset Simulations](./simulation-college-im.md)


@ -7,6 +7,9 @@ This is done through $\texttt{FlagEntangled}$ - a function that takes in a vect
$\texttt{Flag}$ function, as documented in the original paper for each of them (with the same $r$ and $z$) until it
finds a $z$ that will generate the same tag for every tagging key.
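The search described above can be illustrated with a toy brute-force loop. This is only a sketch under loud assumptions: `toy_bits` is a hypothetical SHA-256 stand-in for the per-key output of $\texttt{Flag}$, the byte-string keys are placeholders, and nothing here is the fuzzytags construction or its API. It shows only the shape of the loop: try successive values of $z$ until every tagging key yields the same result.

```python
import hashlib
from itertools import count


def toy_bits(tagging_key: bytes, z: int, gamma: int) -> tuple:
    """Toy stand-in for the bits a tag would carry for one key (NOT the real scheme)."""
    bits = []
    for i in range(gamma):
        digest = hashlib.sha256(tagging_key + z.to_bytes(8, "big") + bytes([i])).digest()
        bits.append(digest[0] & 1)  # keep one pseudo-random bit per position
    return tuple(bits)


def find_entangling_z(tagging_keys, gamma):
    """Try z = 0, 1, 2, ... until all keys produce identical toy bits."""
    for z in count():
        if len({toy_bits(key, z, gamma) for key in tagging_keys}) == 1:
            return z


# Hypothetical placeholder keys for two recipients.
keys = [b"bob-tagging-key", b"carol-tagging-key"]
z = find_entangling_z(keys, gamma=4)
print("found z =", z)
```

In this toy model each extra key or extra bit multiplies the expected number of attempts, which gives some intuition for why the search cost grows with the number of entangled parties.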
Here we briefly outline a number of properties of entangled tags. In the next section we will consider how to
apply these strategies in real-world deployments.
## Multiparty Broadcast
Alice wants to send a message to Bob and Carol. She constructs a single tag that will validate against detection keys generated by both of them.


@ -0,0 +1,54 @@
# Entangling Strategies
In this section we will document a number of different strategies that applications might consider
when implementing an entangling scheme. Where possible we will provide analysis and simulations to assess the privacy
impact of each one.
## No Entangling
For completeness, let's consider no entangling. In this case each tag has a single recipient. We have shown through
simulations that fuzzytags, in aggregate, leak information to the server. Being able to attribute a fuzzytag to a
receiver with confidence reduces the false positive rate on the derived graph.
On the other hand, all entangling strategies require access to a global database of tagging keys in order to be
effective - this raises concrete privacy and security concerns in and of itself.
## Random Entangling
One of the simplest schemes involves every sender selecting a single random entangled recipient. We have simulated
this strategy against real-world datasets and found that it does little to prevent the server from deriving the underlying
social graph - only marginally increasing the false positive rate of connections on the derived graph.
To understand why, consider that entangling tags in this way doesn't create a signal intended to corrupt the
derived graph, but instead adds noise to the number of received tags for each detection key. As the number of legitimate
tags received increases, they create a signal that stands above the noise floor of the generated tags.
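The noise-floor argument above can be made concrete with a back-of-the-envelope calculation. The sketch below is a simplified model, not the simulation code used for these results: `expected_matches` is a hypothetical helper, and it assumes the random entangled recipient is drawn uniformly from `n_parties` candidates and that false positives are independent.

```python
def expected_matches(sent, to_receiver, fp_rate, n_parties, entangled):
    """Expected number of one sender's tags that match one receiver's detection key."""
    others = sent - to_receiver  # tags not addressed to this receiver
    chance = fp_rate             # such a tag matches only as a false positive...
    if entangled:
        pick = 1.0 / n_parties   # ...or because this receiver was the random pick
        chance = pick + (1 - pick) * fp_rate
    return to_receiver + others * chance


# Assumed example numbers: 100 tags sent, 1000 parties, the lower false positive rate.
noise_plain = expected_matches(100, 0, 0.007812, 1000, entangled=False)
noise_entangled = expected_matches(100, 0, 0.007812, 1000, entangled=True)
signal_entangled = expected_matches(100, 20, 0.007812, 1000, entangled=True)

print(noise_plain, noise_entangled, signal_entangled)
```

Under these assumptions random entangling nudges the expected count for an unrelated receiver up only slightly, while a receiver who really gets 20 of the 100 messages still matches more than 20 tags.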
## Deniable Entangling
As outlined in the last section, one of the major properties of entangled tags is deniable sending, i.e. they make it
possible to legitimately send a message to multiple parties.
One possible deployment approach is to have one or more well-known, popular services use fuzzytags to receive messages,
and for all parties to entangle their messages to both the intended recipient *and* a popular service.
This approach provides three distinct benefits:
1. Only the tagging keys of popular services need to be known - these can be public and heavily popularized, discounting most
security and privacy concerns with maintaining a database of keys to entangle to.
2. Every tag in this scheme is individually deniable, increasing the anonymity set of all received tags.
3. Such services provide a bootstrapping mechanism for a network based on fuzzytags.
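As a minimal sketch of the sender-side choice in this scheme: the tag is entangled to the intended recipient plus one service drawn from a small public list. The names `SERVICE_KEYS` and `entangle_targets` and the string keys are hypothetical placeholders, and the actual entangled tag generation is left out.

```python
import random

# Hypothetical, publicly advertised tagging keys of popular services.
SERVICE_KEYS = ["service-a-tagging-key", "service-b-tagging-key"]


def entangle_targets(recipient_key, service_keys=SERVICE_KEYS, rng=random):
    """Return the keys a tag is entangled to: the real recipient plus one service."""
    return [recipient_key, rng.choice(service_keys)]


targets = entangle_targets("bob-tagging-key")
print(targets)
```

Unlike random entangling, the sender here needs only the short public list of service keys rather than a global database of tagging keys.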
The one major drawback of this approach centres around the addition of a number of select service providers into the
anonymity model, resulting in a pseudo-non-collusion assumption, in addition to raising the spectre of compulsion attacks.
Both of these drawbacks can be mitigated through diversification of the entanglement pool - and it must be pointed out
that, unlike many non-collusion assumptions, there are no direct systemic relationships between the adversarial server
providing fuzzytags and services built on top of such a service, i.e. anyone can start providing a service and anyone
can start entangling to a service without these being core features of the anonymity system.
It must be stated that such an entangling scheme does not remove the risk that aggregate tags will reveal the
underlying social network (as the signal is still there); it simply invokes a deniability argument for the relationship.
This may be sufficient, but is not always going to be so.
As such, any simulation of this approach will likely be identical to one with no entangling at all - with the false
positive result replaced by a deniability disclaimer.


@ -10,18 +10,24 @@ particular sender) to understand how much information is leaked by fuzzytags wit
Pietro Panzarasa, Tore Opsahl, and Kathleen M. Carley. "Patterns and dynamics of users' behavior and interaction: Network analysis of an online community." Journal of the American Society for Information Science and Technology 60.5 (2009): 911-932.
![](./simulations/college-actual.jpeg)
## Scenario 1
Setup: 20k events (7330 links). False positive rates: [0.007812, 0.5]. No entangling.
Result: Server can identify ~4.3% of original graph (313 links) with a 12% false positive rate at threshold: 0.0001.
![](./simulations/college-derived.jpeg)
## Scenario 2
Setup: 20k events (7330 links). False positive rates: [0.007812, 0.5]. Every tag entangled to one random node (as before).
Result: Server can identify ~3.95% of original graph (290 links) with a ~15% false positive rate.
![](./simulations/college-derived-entangled.jpeg)
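The percentages in the two scenarios follow directly from the link counts. A short check, using only the figures quoted above (the helper name `recovered_fraction` is illustrative):

```python
TOTAL_LINKS = 7330  # links in the original College IM graph, as stated in the setup


def recovered_fraction(identified, total=TOTAL_LINKS):
    """Share of the original graph's links that the server identified."""
    return identified / total


scenario_1 = recovered_fraction(313)  # no entangling
scenario_2 = recovered_fraction(290)  # every tag entangled to one random node

print(f"Scenario 1: {scenario_1:.2%}")
print(f"Scenario 2: {scenario_2:.2%}")
print(f"Links no longer identified: {313 - 290}")
```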
# Discussion
A very similar result to our observations on the EU Core email dataset, entangled tags increase the false positive


@ -10,18 +10,25 @@ particular sender) to understand how much information is leaked by fuzzytags wit
Citation: Ashwin Paranjape, Austin R. Benson, and Jure Leskovec. "Motifs in Temporal Networks." In Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017.
![](./simulations/eu-actual.jpeg)
## Scenario 1
Setup: 1 month of email events between 1004 nodes, 20k events (5148 links). False positive rates: \[0.007812, 0.5\]. No entangling.
Result: An adversarial server can identify ~7% of original graph (393 links) with a 6% false positive rate. Threshold: 0.0001
![](./simulations/eu-derived.jpeg)
## Scenario 2
Setup: The same month of emails, 20k events (5148 links). Same false positive rates. Every tag is entangled with 1 random node.
Result: Server can identify ~6.6% of original graph with a 6.8% false positive rate.
![](./simulations/eu-derived-entangled.jpeg)
# Discussion
Entanglement seems to have some impact on the server's ability to relearn the social graph, in particular

Binary file not shown. Size: 52 KiB

Binary file not shown. Size: 24 KiB

Binary file not shown. Size: 24 KiB

Binary file not shown. Size: 56 KiB

Binary file not shown. Size: 16 KiB

Binary file not shown. Size: 17 KiB