fuzzyhash/README.md

## fuzzyhash - A toy (s-t)Detectable Hash Function

This package contains a toy implementation of an (s-t)Detectable Hash Function as described in [The Apple PSI System](https://www.apple.com/child-safety/pdf/Apple_PSI_System_Security_Protocol_and_Analysis.pdf) by Abhishek Bhowmick, Dan Boneh, Steve Myers, Kunal Talwa, and Karl Tarbe.

**WARNING: This is a toy implementation. Do not use this as anything other than a toy.**


### How it Works

A client creates new `HashKey` which instantiates the secret key behind the scenes.

A client can then use this `HashKey` to produce two kinds of `Hash`. 

The first kind of hash (`Hash(x)`) "a true hash" multiplies the given `x` by the secret key (treating the secret key as a `s x t` matrix of polynomials (`p_1..p_s`)). `DHF(k,x):=(x,p_1(x),...,p_s(x))`

The second kind of hash is a random hash of the same size as a true hash, but all the elements are selected randomly
from the domain.

#### The Detection Algorithm

Given a set of hashes, a `Solver` is able to distinguish true hashes from random hashes as long as there are
at least `t+1` true hashes and a maximum of `s` random hashes.

The solver does this by computing the kernel basis vectors of a Matrix containing an **extended** version of the 
hashes which transforms the upper portion of the matrix into a Vandermonde matrix, and the lower portion contains
the evaluated polynomials from the hashes themselves.

Our solver does this by first appending the identity matrix to the lower portion of the matrix, then transposing,
converting the matrix to row-echelon form, and transposing again - at which point the kernel basis vectors, if there
are any, are found in the lower portion of matrix. 

(If there are no basis vectors then the algorithm returns an error as there are not enough true hashes
in the set to compute a solution)

We then check the kernel basis for rows where all values are zero, and eliminate these as random rows. What is left
is the solution to the detection algorithm - the indexes of the original hashes which are actually true hashes.

### Sample Run


    Hash 0: Hash([PrimeOrderDomain { val: 387 }, PrimeOrderDomain { val: 62 }, PrimeOrderDomain { val: 515 }, PrimeOrderDomain { val: 10 }, PrimeOrderDomain { val: 338 }])
    Hash 1: Hash([PrimeOrderDomain { val: 297 }, PrimeOrderDomain { val: 662 }, PrimeOrderDomain { val: 463 }, PrimeOrderDomain { val: 164 }, PrimeOrderDomain { val: 576 }])
    Hash 2: Hash([PrimeOrderDomain { val: 330 }, PrimeOrderDomain { val: 364 }, PrimeOrderDomain { val: 765 }, PrimeOrderDomain { val: 139 }, PrimeOrderDomain { val: 96 }])
    Hash 3: Hash([PrimeOrderDomain { val: 192 }, PrimeOrderDomain { val: 491 }, PrimeOrderDomain { val: 602 }, PrimeOrderDomain { val: 537 }, PrimeOrderDomain { val: 588 }])
    Hash 4: Hash([PrimeOrderDomain { val: 535 }, PrimeOrderDomain { val: 873 }, PrimeOrderDomain { val: 84 }, PrimeOrderDomain { val: 570 }, PrimeOrderDomain { val: 639 }])
    Hash 5: Hash([PrimeOrderDomain { val: 731 }, PrimeOrderDomain { val: 440 }, PrimeOrderDomain { val: 694 }, PrimeOrderDomain { val: 235 }, PrimeOrderDomain { val: 753 }])
    Hash 6: Hash([PrimeOrderDomain { val: 484 }, PrimeOrderDomain { val: 186 }, PrimeOrderDomain { val: 658 }, PrimeOrderDomain { val: 30 }, PrimeOrderDomain { val: 127 }])
    Hash 7: Hash([PrimeOrderDomain { val: 545 }, PrimeOrderDomain { val: 881 }, PrimeOrderDomain { val: 36 }, PrimeOrderDomain { val: 371 }, PrimeOrderDomain { val: 654 }])
    Hash 8: Hash([PrimeOrderDomain { val: 370 }, PrimeOrderDomain { val: 883 }, PrimeOrderDomain { val: 24 }, PrimeOrderDomain { val: 543 }, PrimeOrderDomain { val: 436 }])
    Hash 9: Hash([PrimeOrderDomain { val: 405 }, PrimeOrderDomain { val: 145 }, PrimeOrderDomain { val: 345 }, PrimeOrderDomain { val: 879 }, PrimeOrderDomain { val: 428 }])
    
    
        1   387   753   475   216    62   515    10   338     1     0     0     0     0     0     0     0     0     0 ;
        1   297   396   528   704   662   463   164   576     0     1     0     0     0     0     0     0     0     0 ;
        1   330   686   195   486   364   765   139    96     0     0     1     0     0     0     0     0     0     0 ;
        1   192   497   515   423   491   602   537   588     0     0     0     1     0     0     0     0     0     0 ;
        1   535   611   469   781   873    84   570   639     0     0     0     0     1     0     0     0     0     0 ;
        1   731   387   831   753   440   694   235   753     0     0     0     0     0     1     0     0     0     0 ;
        1   484    88    16   648   186   658    30   127     0     0     0     0     0     0     1     0     0     0 ;
        1   545   767   238   208   881    36   371   654     0     0     0     0     0     0     0     1     0     0 ;
        1   370   302   865   730   883    24   543   436     0     0     0     0     0     0     0     0     1     0 ;
        1   405   817    34   465   145   345   879   428     0     0     0     0     0     0     0     0     0     1 ;
    
    
        1     0     0     0     0     0     0     0     0     0 ;
    387   870     0     0     0     0     0     0     0     0 ;
    753   436   630     0     0     0     0     0     0     0 ;
    475   390   285   802     0     0     0     0     0     0 ;
    216   514   428   220   710     0     0     0     0     0 ;
    62   821   274   518    43   643     0     0     0     0 ;
    515   396   458   151   823   777   532     0     0     0 ;
    10   533   442   716   677   721   673   760     0     0 ;
    338    98   559    41   463   754   718   832     0     0 ;
    ----------------------------------------
        1   886   311   350   312   413   649   120   867     0 ;
        0     0     0     0     0     0     0     0     1   303 ;
        0     0     0     0     0     0     1   851     0     0 ;
        0     0     0     1   510   173   749   875   641   860 ;
        0     0     0     0     0     0     0     0     0     1 ;
        0     0     0     0     0     1   304   795     0     0 ;
        0     0     0     0     1     7    97   332    71    51 ;
        0     0     0     0     0     0     0     1   527   156 ;
        0     1   575    42    58   838   716   213   554   403 ;
        0     0     1   494     6   342   145   361     0     0 ;
    Found 2 Kernel Basis Vectors...
    Solution: [0, 1, 3, 4, 6, 7, 8]
Initial Commit 2021-08-14 17:57:08 +00:00			`## fuzzyhash - A toy (s-t)Detectable Hash Function`

			`This package contains a toy implementation of an (s-t)Detectable Hash Function as described in [The Apple PSI System](https://www.apple.com/child-safety/pdf/Apple_PSI_System_Security_Protocol_and_Analysis.pdf) by Abhishek Bhowmick, Dan Boneh, Steve Myers, Kunal Talwa, and Karl Tarbe.`

Some more comments 2021-08-14 18:07:58 +00:00			`WARNING: This is a toy implementation. Do not use this as anything other than a toy.`
Initial Commit 2021-08-14 17:57:08 +00:00

			`### How it Works`

Use an actual FF lib 2021-08-15 22:21:46 +00:00			A client creates new `HashKey` which instantiates the secret key behind the scenes.
Initial Commit 2021-08-14 17:57:08 +00:00
Use an actual FF lib 2021-08-15 22:21:46 +00:00			A client can then use this `HashKey` to produce two kinds of `Hash`.
Initial Commit 2021-08-14 17:57:08 +00:00
			The first kind of hash (`Hash(x)`) "a true hash" multiplies the given `x` by the secret key (treating the secret key as a `s x t` matrix of polynomials (`p_1..p_s`)). `DHF(k,x):=(x,p_1(x),...,p_s(x))`

			`The second kind of hash is a random hash of the same size as a true hash, but all the elements are selected randomly`
			`from the domain.`

			`#### The Detection Algorithm`

			Given a set of hashes, a `Solver` is able to distinguish true hashes from random hashes as long as there are
			at least `t+1` true hashes and a maximum of `s` random hashes.

			`The solver does this by computing the kernel basis vectors of a Matrix containing an extended version of the`
			`hashes which transforms the upper portion of the matrix into a Vandermonde matrix, and the lower portion contains`
			`the evaluated polynomials from the hashes themselves.`

			`Our solver does this by first appending the identity matrix to the lower portion of the matrix, then transposing,`
			`converting the matrix to row-echelon form, and transposing again - at which point the kernel basis vectors, if there`
			`are any, are found in the lower portion of matrix.`

			`(If there are no basis vectors then the algorithm returns an error as there are not enough true hashes`
			`in the set to compute a solution)`

			`We then check the kernel basis for rows where all values are zero, and eliminate these as random rows. What is left`
			`is the solution to the detection algorithm - the indexes of the original hashes which are actually true hashes.`

			`### Sample Run`


			`Hash 0: Hash([PrimeOrderDomain { val: 387 }, PrimeOrderDomain { val: 62 }, PrimeOrderDomain { val: 515 }, PrimeOrderDomain { val: 10 }, PrimeOrderDomain { val: 338 }])`
			`Hash 1: Hash([PrimeOrderDomain { val: 297 }, PrimeOrderDomain { val: 662 }, PrimeOrderDomain { val: 463 }, PrimeOrderDomain { val: 164 }, PrimeOrderDomain { val: 576 }])`
			`Hash 2: Hash([PrimeOrderDomain { val: 330 }, PrimeOrderDomain { val: 364 }, PrimeOrderDomain { val: 765 }, PrimeOrderDomain { val: 139 }, PrimeOrderDomain { val: 96 }])`
			`Hash 3: Hash([PrimeOrderDomain { val: 192 }, PrimeOrderDomain { val: 491 }, PrimeOrderDomain { val: 602 }, PrimeOrderDomain { val: 537 }, PrimeOrderDomain { val: 588 }])`
			`Hash 4: Hash([PrimeOrderDomain { val: 535 }, PrimeOrderDomain { val: 873 }, PrimeOrderDomain { val: 84 }, PrimeOrderDomain { val: 570 }, PrimeOrderDomain { val: 639 }])`
			`Hash 5: Hash([PrimeOrderDomain { val: 731 }, PrimeOrderDomain { val: 440 }, PrimeOrderDomain { val: 694 }, PrimeOrderDomain { val: 235 }, PrimeOrderDomain { val: 753 }])`
			`Hash 6: Hash([PrimeOrderDomain { val: 484 }, PrimeOrderDomain { val: 186 }, PrimeOrderDomain { val: 658 }, PrimeOrderDomain { val: 30 }, PrimeOrderDomain { val: 127 }])`
			`Hash 7: Hash([PrimeOrderDomain { val: 545 }, PrimeOrderDomain { val: 881 }, PrimeOrderDomain { val: 36 }, PrimeOrderDomain { val: 371 }, PrimeOrderDomain { val: 654 }])`
			`Hash 8: Hash([PrimeOrderDomain { val: 370 }, PrimeOrderDomain { val: 883 }, PrimeOrderDomain { val: 24 }, PrimeOrderDomain { val: 543 }, PrimeOrderDomain { val: 436 }])`
			`Hash 9: Hash([PrimeOrderDomain { val: 405 }, PrimeOrderDomain { val: 145 }, PrimeOrderDomain { val: 345 }, PrimeOrderDomain { val: 879 }, PrimeOrderDomain { val: 428 }])`


			`1 387 753 475 216 62 515 10 338 1 0 0 0 0 0 0 0 0 0 ;`
			`1 297 396 528 704 662 463 164 576 0 1 0 0 0 0 0 0 0 0 ;`
			`1 330 686 195 486 364 765 139 96 0 0 1 0 0 0 0 0 0 0 ;`
			`1 192 497 515 423 491 602 537 588 0 0 0 1 0 0 0 0 0 0 ;`
			`1 535 611 469 781 873 84 570 639 0 0 0 0 1 0 0 0 0 0 ;`
			`1 731 387 831 753 440 694 235 753 0 0 0 0 0 1 0 0 0 0 ;`
			`1 484 88 16 648 186 658 30 127 0 0 0 0 0 0 1 0 0 0 ;`
			`1 545 767 238 208 881 36 371 654 0 0 0 0 0 0 0 1 0 0 ;`
			`1 370 302 865 730 883 24 543 436 0 0 0 0 0 0 0 0 1 0 ;`
			`1 405 817 34 465 145 345 879 428 0 0 0 0 0 0 0 0 0 1 ;`



			`1 0 0 0 0 0 0 0 0 0 ;`
			`387 870 0 0 0 0 0 0 0 0 ;`
			`753 436 630 0 0 0 0 0 0 0 ;`
			`475 390 285 802 0 0 0 0 0 0 ;`
			`216 514 428 220 710 0 0 0 0 0 ;`
			`62 821 274 518 43 643 0 0 0 0 ;`
			`515 396 458 151 823 777 532 0 0 0 ;`
			`10 533 442 716 677 721 673 760 0 0 ;`
			`338 98 559 41 463 754 718 832 0 0 ;`
			`----------------------------------------`
			`1 886 311 350 312 413 649 120 867 0 ;`
			`0 0 0 0 0 0 0 0 1 303 ;`
			`0 0 0 0 0 0 1 851 0 0 ;`
			`0 0 0 1 510 173 749 875 641 860 ;`
			`0 0 0 0 0 0 0 0 0 1 ;`
			`0 0 0 0 0 1 304 795 0 0 ;`
			`0 0 0 0 1 7 97 332 71 51 ;`
			`0 0 0 0 0 0 0 1 527 156 ;`
			`0 1 575 42 58 838 716 213 554 403 ;`
			`0 0 1 494 6 342 145 361 0 0 ;`
			`Found 2 Kernel Basis Vectors...`
			`Solution: [0, 1, 3, 4, 6, 7, 8]`