Data scientists explore and preprocess data, and feed it to LLMs, without ever seeing the raw data directly. Only high-quality synthetic data and differentially private statistics can leave the clean room. Data scientists keep using their usual AI and GenAI tools, wrapped in the Sarus Python SDK.
Differential Privacy guarantees can be built into the LLM fine-tuning process itself via a single fit parameter. Thanks to automated Differentially-Private Stochastic Gradient Descent (DP-SGD), no personal data is memorized by the fine-tuned model.
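To make the mechanism concrete, here is a minimal numpy sketch of one DP-SGD step for logistic regression: each example's gradient is clipped to a fixed L2 norm, calibrated Gaussian noise is added to the sum, and only the noisy average updates the weights. This is an illustration of the general DP-SGD technique, not the Sarus SDK's API; the function name and parameters are hypothetical.

```python
import numpy as np

def dp_sgd_step(weights, X, y, lr=0.1, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One DP-SGD step for logistic regression (illustrative sketch).

    Per-example gradients are clipped to `clip_norm`, summed, perturbed
    with Gaussian noise scaled by `noise_multiplier * clip_norm`, then
    averaged. Clipping bounds each individual's influence; the noise
    provides the differential privacy guarantee.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Per-example gradients of the logistic loss, shape (n, d).
    preds = 1.0 / (1.0 + np.exp(-X @ weights))
    per_example_grads = (preds - y)[:, None] * X
    # Clip each example's gradient to L2 norm at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)
    # Sum, add calibrated Gaussian noise, and average over the batch.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=weights.shape)
    noisy_mean_grad = (clipped.sum(axis=0) + noise) / len(X)
    return weights - lr * noisy_mean_grad

# Toy usage: fit a separable problem with noisy, clipped updates.
rng = np.random.default_rng(42)
X = rng.normal(size=(32, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)
w = np.zeros(3)
for _ in range(50):
    w = dp_sgd_step(w, X, y, rng=rng)
```

In a production system the privacy budget (epsilon, delta) consumed by the accumulated noise is tracked by a privacy accountant; this sketch only shows the clipping-plus-noise core of the algorithm.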
Check out our demo notebook!
Build a synthetic patient records generator without compromising patient privacy