Concrete: A fully homomorphic encryption compiler (zama.ai)

108 points by zacchj 3 years ago | 22 comments

blintz 3 years ago

Concrete is really impressive and permissively licensed. The ML library has an FHE version of (a subset of) scikit-learn, which I honestly didn't expect to see for another 5+ years. Just look at this example:

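    # Setup assumed here so the excerpt runs on its own (synthetic data, not part of the quoted snippet)
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from concrete.ml.sklearn import LogisticRegression

    X, y = make_classification(n_samples=100, class_sep=2, n_features=30, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Now we train in the clear and quantize the weights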
    model = LogisticRegression(n_bits=8)
    model.fit(X_train, y_train)

    # We can simulate the predictions in the clear
    y_pred_clear = model.predict(X_test)

    # We then compile on a representative set 
    model.compile(X_train)

    # Finally we run the inference on encrypted inputs !
    y_pred_fhe = model.predict(X_test, fhe="execute")

    print("In clear  :", y_pred_clear)
    print("In FHE    :", y_pred_fhe)
    print(f"Similarity: {int((y_pred_fhe == y_pred_clear).mean()*100)}%")

There’s still some way to go on performance, but the ergonomics of using FHE are already pretty good!
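
For anyone wondering what "quantize the weights" with `n_bits=8` amounts to, here is a minimal sketch, assuming a plain uniform affine scheme. Concrete ML's actual quantizer may well differ, and `quantize` / `dequantize` are made-up helper names, not library functions:

```python
def quantize(values, n_bits=8):
    """Map floats to integers in [0, 2**n_bits - 1] with a uniform affine scheme."""
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    # Guard against a constant vector, where hi == lo would give a zero scale.
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo


def dequantize(q, scale, zero):
    """Approximate inverse of quantize()."""
    return [qi * scale + zero for qi in q]


weights = [-1.5, -0.25, 0.0, 0.75, 2.5]  # made-up example weights
q, scale, zero = quantize(weights, n_bits=8)
restored = dequantize(q, scale, zero)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q)
print(max_error)
```

The point is that the encrypted circuit only ever sees small integers, and the rounding error per value is bounded by half a quantization step.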

binoua 3 years ago

Thank you! The plain Python API is quite clear as well. Also from the README:

```
from concrete import fhe

def add(x, y):
    return x + y

compiler = fhe.Compiler(add, {"x": "encrypted", "y": "encrypted"})
inputset = [(2, 3), (0, 0), (1, 6), (7, 7), (7, 1), (3, 2), (6, 1), (1, 7), (4, 5), (5, 4)]

print(f"Compiling...")
circuit = compiler.compile(inputset)

print(f"Generating keys...")
circuit.keygen()

examples = [(3, 4), (1, 2), (7, 7), (0, 0)]
for example in examples:
    encrypted_example = circuit.encrypt(*example)
    encrypted_result = circuit.run(encrypted_example)
    result = circuit.decrypt(encrypted_result)
    print(f"Evaluation of {' + '.join(map(str, example))} homomorphically = {result}")
```

This one is more for non-ML computations.
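
On the `inputset` in that snippet: as far as I understand, the compiler evaluates the function on those samples to learn the range of values it has to support, and sizes the encrypted integers from that. A rough sketch of the idea in plain Python, assuming non-negative integer outputs (`required_bits` is a hypothetical helper, not part of Concrete):

```python
def add(x, y):
    return x + y


inputset = [(2, 3), (0, 0), (1, 6), (7, 7), (7, 1), (3, 2), (6, 1), (1, 7), (4, 5), (5, 4)]


def required_bits(func, inputset):
    """Smallest unsigned bit width that holds every output seen on the inputset."""
    largest = max(func(*sample) for sample in inputset)
    return max(1, largest.bit_length())


bits = required_bits(add, inputset)
print(bits)  # 7 + 7 = 14 is the largest output, which needs 4 bits
```

This is why the inputset should be representative: a value outside the observed range would not fit the width chosen at compile time.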

sigmoid10 3 years ago

Isn't this basically just unnecessary overhead if you can do a forward pass with encrypted weights? Like, what are you still protecting with encryption at that point?

eyegor 3 years ago

You're protecting your inputs and outputs. If you have a model that's designed to run on sensitive data but you don't have the compute power to run it locally, what do you do? Putting the model on a cloud provider means their system would see your sensitive data, which may be unacceptable for contractual or legal reasons. This lets you send the inputs encrypted, receive the outputs encrypted, and then decrypt the outputs in your weak but trusted environment.
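
To make that flow concrete, here is a toy sketch of encrypt on the client, compute on the server, decrypt on the client. Big caveats: it uses textbook Paillier, which is only additively homomorphic (not the scheme Concrete uses, and not fully homomorphic), the primes are far too small to be secure, and it assumes the server holds integer weights in the clear. All function names are made up for illustration:

```python
import math
import random

P, Q = 1009, 1013  # toy primes, far too small for real security


def keygen():
    n = P * Q
    phi = (P - 1) * (Q - 1)
    mu = pow(phi, -1, n)  # modular inverse, needs Python 3.8+
    return n, (phi, mu)


def encrypt(n, m):
    """Client side: Paillier encryption with generator g = n + 1."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Client side: only the holder of the private key can do this."""
    phi, mu = priv
    u = pow(c, phi, n * n)
    return (u - 1) // n * mu % n


def server_score(n, weights, enc_inputs):
    """Untrusted side: sees only ciphertexts and its own plaintext weights."""
    n2 = n * n
    acc = 1  # a valid encryption of 0
    for w, c in zip(weights, enc_inputs):
        # Raising a ciphertext to w multiplies the hidden value by w;
        # multiplying ciphertexts adds the hidden values.
        acc = acc * pow(c, w, n2) % n2
    return acc


n, priv = keygen()
x = [3, 1, 4]          # sensitive inputs, never sent in the clear
weights = [2, 5, 7]    # model parameters known to the server
enc_x = [encrypt(n, v) for v in x]
enc_score = server_score(n, weights, enc_x)
score = decrypt(n, priv, enc_score)
print(score)  # 2*3 + 5*1 + 7*4 = 39
```

The server computes the weighted sum without ever holding the private key, which is the property being described above; a real FHE scheme extends this beyond additions and scalar multiplications.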

fd0r 3 years ago

In the example above the parameters are in the clear and only inputs and outputs are encrypted!

That being said, you could probably do the reverse and encrypt the parameters of the model rather than the inputs/outputs if you are deploying the model directly to the client.

fire 3 years ago

Not OP, but I think I'm too dumb on this topic to understand what you mean. Could you explain further? (To me it sounds like you're suggesting encrypted weights while they're suggesting encrypted inputs, which solve two different use cases.)

dmos62 3 years ago

Aren't you running the second prediction on unencrypted data, contrary to what's said in the comment?

bczm 3 years ago

Actually, the input is encrypted inside the ‘predict’ function here. There are also functions to encrypt, run, and decrypt separately.

mmastrac 3 years ago

I started working on a CPU that was designed for FHE about 10 years ago, inspired by the ShapeCPU paper around that time [1] [2]. I've been waiting for someone to make a better gate-to-FHE compiler for some time.

[1] https://github.com/mmastrac/oblivious-cpu

[2] https://hcrypt.com/shape-cpu/

FHE becomes a lot more interesting when you can hide the structure of your computation behind a VM.
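
A rough illustration of the gate-level idea, not taken from either linked project: the program is written only in terms of boolean gates plus a branch-free select, so its control flow never depends on the data. Here plain Python ints stand in for encrypted bits; in a gate-level FHE setting each gate function would be swapped for its homomorphic counterpart. All names are hypothetical:

```python
def xor(a, b):
    return a ^ b


def and_(a, b):
    return a & b


def or_(a, b):
    return a | b


def mux(sel, a, b):
    """Branch-free select: a if sel == 1 else b, using only gates."""
    return xor(b, and_(sel, xor(a, b)))


def full_adder(a, b, carry):
    s = xor(xor(a, b), carry)
    carry_out = or_(and_(a, b), and_(carry, xor(a, b)))
    return s, carry_out


def to_bits(value, width):
    """Little-endian bit list."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


def add_bits(a_bits, b_bits):
    """Ripple-carry adder; always runs the same gates whatever the inputs are."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]


total = from_bits(add_bits(to_bits(5, 4), to_bits(11, 4)))
print(total)  # 5 + 11 = 16
```

A VM built this way executes every instruction path through `mux` instead of branching, which is what lets the evaluator run it without learning which path was "really" taken.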

xrd 3 years ago

Compare to Google's here:

https://jeremykun.com/2023/02/13/googles-fully-homomorphic-e...

It's a really fun write-up. I prefer the syntax of Google's. But Zama is doing great work.

I really enjoyed the podcast with them here. It clarified a lot for me about the intersection of FHE and ZK.

https://zeroknowledge.fm/248-2/

binoua 3 years ago

Would you mind elaborating on what you prefer in Google's syntax, please?

royjacobs 3 years ago

This reminds me of one of those software protection libraries. I think it was by Syncrosoft, the company that used to protect software like Cubase before it got acquired by the manufacturer of Cubase, Steinberg.

Basically you'd write your algorithms in C++ but instead of using the built-in types like int or float you'd use custom types that had all of their operators overloaded. Your code would look pretty similar to what you'd have before (modulo the type definitions) but when compiled your algorithm would turn into an incredibly inscrutable state machine where some parts of the state machine would come from some kind of protection dongle. Pretty effective.
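
The operator-overloading trick can be sketched in a few lines. This is a Python analogue rather than the C++ described above, and it has nothing to do with how the actual product worked: the hypothetical `Traced` type just records what the algorithm does, where a protection library would emit something obfuscated instead.

```python
class Traced:
    """Wrapper whose operators record an expression alongside the value."""

    def __init__(self, value, expr):
        self.value = value
        self.expr = expr

    @staticmethod
    def lift(other):
        # Wrap plain numbers so both operands have the same type.
        return other if isinstance(other, Traced) else Traced(other, repr(other))

    def __add__(self, other):
        other = Traced.lift(other)
        return Traced(self.value + other.value, f"({self.expr} + {other.expr})")

    def __radd__(self, other):
        return Traced.lift(other) + self

    def __mul__(self, other):
        other = Traced.lift(other)
        return Traced(self.value * other.value, f"({self.expr} * {other.expr})")

    def __rmul__(self, other):
        return Traced.lift(other) * self


def poly(x):
    # The algorithm itself is written as if x were an ordinary number.
    return 3 * x * x + 2 * x + 1


result = poly(Traced(4, "x"))
print(result.value)  # 3*16 + 2*4 + 1 = 57
print(result.expr)
```

The same `poly` runs unchanged on plain ints, so the algorithm code stays readable while the custom type decides what actually happens under the hood.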

eachro 3 years ago

Does anyone know of a good reference to get up to speed on FHE in ML?

detrites 3 years ago

If you just want to dive right in, this example from Concrete ML's repository is very clear:

https://github.com/zama-ai/concrete-ml#a-simple-concrete-ml-...

eachro 3 years ago

Ah, I should have been a bit clearer. I'm interested in how FHE actually works and the steps needed to transform a general computation into its FHE equivalent.
