fuzzyhash/README.md

88 lines
6.3 KiB
Markdown
Raw Normal View History

2021-08-14 17:57:08 +00:00
## fuzzyhash - A toy (s-t)Detectable Hash Function
This package contains a toy implementation of an (s-t)Detectable Hash Function as described in [The Apple PSI System](https://www.apple.com/child-safety/pdf/Apple_PSI_System_Security_Protocol_and_Analysis.pdf) by Abhishek Bhowmick, Dan Boneh, Steve Myers, Kunal Talwa, and Karl Tarbe.
2021-08-14 18:07:58 +00:00
**WARNING: This is a toy implementation. Do not use this as anything other than a toy.**
2021-08-14 17:57:08 +00:00
### How it Works
2021-08-15 22:21:46 +00:00
A client creates new `HashKey` which instantiates the secret key behind the scenes.
2021-08-14 17:57:08 +00:00
2021-08-15 22:21:46 +00:00
A client can then use this `HashKey` to produce two kinds of `Hash`.
2021-08-14 17:57:08 +00:00
The first kind of hash (`Hash(x)`) "a true hash" multiplies the given `x` by the secret key (treating the secret key as a `s x t` matrix of polynomials (`p_1..p_s`)). `DHF(k,x):=(x,p_1(x),...,p_s(x))`
The second kind of hash is a random hash of the same size as a true hash, but all the elements are selected randomly
from the domain.
#### The Detection Algorithm
Given a set of hashes, a `Solver` is able to distinguish true hashes from random hashes as long as there are
at least `t+1` true hashes and a maximum of `s` random hashes.
The solver does this by computing the kernel basis vectors of a Matrix containing an **extended** version of the
hashes which transforms the upper portion of the matrix into a Vandermonde matrix, and the lower portion contains
the evaluated polynomials from the hashes themselves.
Our solver does this by first appending the identity matrix to the lower portion of the matrix, then transposing,
converting the matrix to row-echelon form, and transposing again - at which point the kernel basis vectors, if there
are any, are found in the lower portion of matrix.
(If there are no basis vectors then the algorithm returns an error as there are not enough true hashes
in the set to compute a solution)
We then check the kernel basis for rows where all values are zero, and eliminate these as random rows. What is left
is the solution to the detection algorithm - the indexes of the original hashes which are actually true hashes.
### Sample Run
Hash 0: Hash([PrimeOrderDomain { val: 387 }, PrimeOrderDomain { val: 62 }, PrimeOrderDomain { val: 515 }, PrimeOrderDomain { val: 10 }, PrimeOrderDomain { val: 338 }])
Hash 1: Hash([PrimeOrderDomain { val: 297 }, PrimeOrderDomain { val: 662 }, PrimeOrderDomain { val: 463 }, PrimeOrderDomain { val: 164 }, PrimeOrderDomain { val: 576 }])
Hash 2: Hash([PrimeOrderDomain { val: 330 }, PrimeOrderDomain { val: 364 }, PrimeOrderDomain { val: 765 }, PrimeOrderDomain { val: 139 }, PrimeOrderDomain { val: 96 }])
Hash 3: Hash([PrimeOrderDomain { val: 192 }, PrimeOrderDomain { val: 491 }, PrimeOrderDomain { val: 602 }, PrimeOrderDomain { val: 537 }, PrimeOrderDomain { val: 588 }])
Hash 4: Hash([PrimeOrderDomain { val: 535 }, PrimeOrderDomain { val: 873 }, PrimeOrderDomain { val: 84 }, PrimeOrderDomain { val: 570 }, PrimeOrderDomain { val: 639 }])
Hash 5: Hash([PrimeOrderDomain { val: 731 }, PrimeOrderDomain { val: 440 }, PrimeOrderDomain { val: 694 }, PrimeOrderDomain { val: 235 }, PrimeOrderDomain { val: 753 }])
Hash 6: Hash([PrimeOrderDomain { val: 484 }, PrimeOrderDomain { val: 186 }, PrimeOrderDomain { val: 658 }, PrimeOrderDomain { val: 30 }, PrimeOrderDomain { val: 127 }])
Hash 7: Hash([PrimeOrderDomain { val: 545 }, PrimeOrderDomain { val: 881 }, PrimeOrderDomain { val: 36 }, PrimeOrderDomain { val: 371 }, PrimeOrderDomain { val: 654 }])
Hash 8: Hash([PrimeOrderDomain { val: 370 }, PrimeOrderDomain { val: 883 }, PrimeOrderDomain { val: 24 }, PrimeOrderDomain { val: 543 }, PrimeOrderDomain { val: 436 }])
Hash 9: Hash([PrimeOrderDomain { val: 405 }, PrimeOrderDomain { val: 145 }, PrimeOrderDomain { val: 345 }, PrimeOrderDomain { val: 879 }, PrimeOrderDomain { val: 428 }])
1 387 753 475 216 62 515 10 338 1 0 0 0 0 0 0 0 0 0 ;
1 297 396 528 704 662 463 164 576 0 1 0 0 0 0 0 0 0 0 ;
1 330 686 195 486 364 765 139 96 0 0 1 0 0 0 0 0 0 0 ;
1 192 497 515 423 491 602 537 588 0 0 0 1 0 0 0 0 0 0 ;
1 535 611 469 781 873 84 570 639 0 0 0 0 1 0 0 0 0 0 ;
1 731 387 831 753 440 694 235 753 0 0 0 0 0 1 0 0 0 0 ;
1 484 88 16 648 186 658 30 127 0 0 0 0 0 0 1 0 0 0 ;
1 545 767 238 208 881 36 371 654 0 0 0 0 0 0 0 1 0 0 ;
1 370 302 865 730 883 24 543 436 0 0 0 0 0 0 0 0 1 0 ;
1 405 817 34 465 145 345 879 428 0 0 0 0 0 0 0 0 0 1 ;
1 0 0 0 0 0 0 0 0 0 ;
387 870 0 0 0 0 0 0 0 0 ;
753 436 630 0 0 0 0 0 0 0 ;
475 390 285 802 0 0 0 0 0 0 ;
216 514 428 220 710 0 0 0 0 0 ;
62 821 274 518 43 643 0 0 0 0 ;
515 396 458 151 823 777 532 0 0 0 ;
10 533 442 716 677 721 673 760 0 0 ;
338 98 559 41 463 754 718 832 0 0 ;
----------------------------------------
1 886 311 350 312 413 649 120 867 0 ;
0 0 0 0 0 0 0 0 1 303 ;
0 0 0 0 0 0 1 851 0 0 ;
0 0 0 1 510 173 749 875 641 860 ;
0 0 0 0 0 0 0 0 0 1 ;
0 0 0 0 0 1 304 795 0 0 ;
0 0 0 0 1 7 97 332 71 51 ;
0 0 0 0 0 0 0 1 527 156 ;
0 1 575 42 58 838 716 213 554 403 ;
0 0 1 494 6 342 145 361 0 0 ;
Found 2 Kernel Basis Vectors...
Solution: [0, 1, 3, 4, 6, 7, 8]