P2P Privacy

This commit is contained in:
Sarah Jamie Lewis 2018-07-11 12:40:20 -07:00
parent e7f5eb6f01
commit 2ae0f9ac8a
2 changed files with 54 additions and 4 deletions


@@ -5,13 +5,11 @@ title = "Federation is the Worst of all Worlds"
The threat model and economics of federated systems devolve to concentrating trust in the hands of a few, while missing out on the scale advantages of purely centralized solutions.
-Federation results in the data of users being subject to the whims of the owner of the federated instance. Administrators can see plaintext correspondence and derive social graphs trivially. They are also in a position to selectively censor inter-instance communication. All of this while gaining none of the benefits of scale and discovery that centralized systems provide.
+Federation results in the data of users being subject to the whims of the owner of the federated instance. Administrators can see correspondence and derive social graphs trivially. They are also in a position to selectively censor inter-instance communication. All of this while gaining none of the benefits of scale and discovery that centralized systems provide.
All the privacy issues, none of the scale advantages.
-Is that inherent to federated systems? I would argue yes, at least in the context of how we think about federation today.
-To achieve some level of usability, to present the user with a known interface we expect them to trust the federated server.
+Is that inherent to federated systems? I would argue yes, at least in the context of how we think about federation today. To achieve some level of usability, to present the user with a known interface, we expect them to trust the federated server.
Without building consent and resistance into the protocol and infrastructure, we're just forcing most users to pick a new dictator for their data without any real basis for that choice.


@@ -0,0 +1,52 @@
+++
date = "2018-07-12T12:21:00-07:00"
title = "Privacy and P2P Protocols"
+++
In a modern P2P protocol white paper, under a section titled *Network Privacy*, there is a line that reads: "There is an inherent tradeoff in peer to peer systems of source discovery vs. user privacy." I disagree with that statement, and with the impact the resulting design decisions have on privacy.
The system defines source discovery as finding the IP:Port pairing of a peer that has access to the data you want.
(I really wish we wouldn't build P2P networks directly on top of IP addresses. We have better overlay tech...)
To clarify some terms, Content Discovery and Source Discovery are two different (but related) problems. Content Discovery is the problem of finding out that some content exists. Source Discovery is the problem of knowing where to get it. To use an example from the old web, Content Discovery is often solved by Google. Source Discovery is simply a direct connection to a URI.
In the P2P world, if I have a document identifier, the problem becomes how to translate that into the actual document.
(In old web terms: I have a URI, I do DNS resolution to get an IP address, I initiate an HTTP connection and the server sends me the content.)
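The contrast above can be sketched as a toy model. This is purely illustrative, not any real protocol: the `dns` table, the `swarm` mapping, and the peer addresses are all made up for the example. The point is that an old-web name resolves to exactly one authoritative source, while a P2P document identifier resolves to a dynamic set of peers.

```python
import hashlib

# Old web: a URI's host names exactly one authoritative source.
# Resolution is a single lookup (DNS in practice; a dict here).
dns = {"example.com": "93.184.216.34"}  # hypothetical record

def resolve_uri(host: str, table: dict) -> str:
    """Source discovery on the old web: one name -> one server."""
    return table[host]

# P2P: a document identifier (here, a content hash) names the *data*,
# not a location. Many different, dynamic peers may hold a copy.
def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

swarm = {}  # content id -> set of "IP:Port" peers holding the data

def publish(data: bytes, peer: str) -> None:
    swarm.setdefault(content_id(data), set()).add(peer)

def discover_sources(cid: str) -> set:
    """Source discovery in P2P: one identifier -> many possible peers."""
    return swarm.get(cid, set())

doc = b"hello p2p"
publish(doc, "10.0.0.1:4001")
publish(doc, "10.0.0.2:4001")
print(resolve_uri("example.com", dns))            # a single fixed source
print(sorted(discover_sources(content_id(doc))))  # a set of possible sources
```

The asymmetry is the whole problem: the old-web lookup leaks your interest to one party, while the naive P2P lookup has to leak it to whichever peers you consult.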
P2P tech complicates the solution because, often, a document identifier no longer belongs to a single server, it has been published to the network and could be living anywhere, in multiple places, hosted by many different, dynamic peers.
And thus, many P2P systems state that this constraint makes privacy hard. The reasoning goes: a peer must ask other peers whether they know where to get a document, and the more peers it asks, the more peers know that it has asked for that document. This widening metadata problem is, to many, intractable.
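That widening metadata leak is easy to demonstrate with a toy simulation. The sketch below (my own illustration, with made-up parameters, not drawn from any real system) floods a query over a random peer graph and counts how many peers learn that the origin asked for a document: each extra hop of the flood widens the audience.

```python
import random

def make_graph(n: int, degree: int, seed: int = 0) -> dict:
    """Toy directed peer graph: each peer knows `degree` random other peers."""
    rng = random.Random(seed)
    return {p: rng.sample([q for q in range(n) if q != p], degree)
            for p in range(n)}

def flood_query(graph: dict, origin: int, ttl: int) -> set:
    """Breadth-first flood with a hop limit.

    Returns the set of peers (excluding the querier) that observed the
    query, i.e. everyone who now holds the metadata 'origin wanted X'.
    """
    seen = {origin}
    frontier = [origin]
    for _ in range(ttl):
        nxt = []
        for p in frontier:
            for q in graph[p]:
                if q not in seen:
                    seen.add(q)
                    nxt.append(q)
        frontier = nxt
    return seen - {origin}

g = make_graph(n=200, degree=4, seed=42)
for ttl in (1, 2, 3, 4):
    observers = flood_query(g, origin=0, ttl=ttl)
    print(f"ttl={ttl}: {len(observers)} peers learned of the query")
```

With a hop limit of 1 only the origin's direct neighbours see the query; each additional hop multiplies the audience, which is exactly the "more peers asked, more peers who know" problem described above.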
Of course, we know that Freenet presented solutions providing pretty strong guarantees for reader anonymity and publisher anonymity nearly two decades ago. We've known that we can do private source discovery in P2P networks for literally decades.
Further, we now have networks like Tor and i2p which present really neat peer addressing solutions that anonymize IP endpoints and protect publishers and readers almost out of the box (with [some caveats](https://onionscan.org)).
We *know* that metadata analysis is *the* thing that drives mass surveillance systems. Why in 2018 are we building new P2P networks that don't offer any reasonable privacy guarantees against mass surveillance capable adversaries?!
A large number of people are talking about "the new web", but it seems clear to me that these technologies and protocols have learned none of the security lessons we've been taught in the last few decades.
The point of P2P tech is to distribute trust. You can't distribute trust without consent. You can't consent without meaningful privacy. **Privacy should be a foundational element in any P2P stack**, not a challenge, or a footnote, or a "maybe we will get to this in the future".
When I originally wrote this I was picking on a particular protocol. I've redacted that protocol from this copy because the leaders in that community responded really positively to critiques about the privacy issues with their protocol. I also made it clear in the original version that this isn't a problem with just a single protocol, it's a problem endemic to the new generation of P2P systems.
I don't know why. Maybe developers think that anonymizing networks are slow and have limited bandwidth.
(This is partially true, but it is mostly a resource issue, not a fundamental tech issue.)
Maybe it's a knowledge problem. [Surveillance and privacy are marginalized issues that impact different communities unevenly](https://leanpub.com/queerprivacy). Solutions to these problems exist but may not be considered by those building the systems as high priority.
Maybe I'm being overly demanding of a bunch of community-led, open source projects, expecting them to consider use cases that they don't have the bandwidth to address.
I want to live in a world where we have a diverse set of P2P systems. I want these projects to thrive. Fundamentally I think that is the only way we can hope to achieve decentralization of trust and a free and open internet ecosystem that resists censorship and surveillance.
But we have to build privacy into these systems from the ground up, at the first platform layer, not the application layer - application privacy **does not work**.
I'm not sure if the current generations of systems can have privacy built into them. My experience and intuition says *probably not*. Privacy is really hard to layer onto a system after design.
I'll leave this with the following thought:
Privacy is not an optional design element. When you refuse to build privacy into a system you are further marginalizing populations, enforcing censorship and encouraging surveillance.
When you refuse to build privacy into a system you are stating that you believe that only certain types of people should be able to use your system, and only for certain things. You might not intend that, but that is fundamentally the result.