The future-proof way of leveraging sensitive data

The Sarus vault allows secure access

Sarus is the only solution that combines computation on the original data and generation of synthetic data — powered by Differential Privacy.

Get started

How to do analytics & AI on sensitive data with Sarus

Select data source

Select the data source to list on Sarus. Data types include numerical, categorical, series of events, images, and text. Common storage systems and data formats are supported.

Define user access policies

Define the rules governing each data practitioner's access. Create rule templates to allow for compliance best practices.

Leverage private data

Data practitioners connect to the Gateway from their favorite environment and interact seamlessly with the original data remotely.


Privacy-first data access

"Data Cannot be Fully Anonymized and Remain Useful" (Cynthia Dwork, Gödel Prize winner and inventor of Differential Privacy).

From there, the most efficient way to achieve both high utility and strong privacy is to compute on non-anonymized data and protect the computation output. Data practitioners benefit from the full data utility without compromising on privacy.

The Sarus Gateway is the fruit of this vision.
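
As an illustration of protecting a computation's output rather than the data itself, here is a minimal sketch of the classic Laplace mechanism applied to a count. It is not the Sarus implementation; the function names (`laplace_noise`, `private_count`) and the sample data are hypothetical, and the only assumption is that one individual contributes at most one record.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a count with Laplace noise calibrated to epsilon.

    Adding or removing one record changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical ages; the analyst only ever sees the noisy result.
    ages = [23, 35, 41, 58, 62, 29, 47, 51]
    rng = random.Random()
    print(private_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng))
```

A smaller `epsilon` means more noise and stronger protection; a larger one means a more accurate answer and weaker protection.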

Deployed anywhere

Sarus is deployed easily through containerization and scales smoothly by running on Kubernetes, both on-premises and in public clouds.

Secure by design

When deployed, Sarus inherits all security properties of the original infrastructure and eliminates the need for moving data externally. All interactions with sensitive data go through the Sarus Gateway.

Full fidelity

Output is always provably safe, regardless of the level of sensitivity of the input data. Practitioners can leverage the full fidelity of their data assets instead of using truncated, redacted, or synthetic versions.


Finest control over data access & full auditing capabilities


Next-gen access control for sensitive data
Manage, with unprecedented precision, who can access which dataset and what they can do with it. Define privacy policies that can be deployed universally irrespective of data sensitivity, user trust, or learning objectives.

Scaling policies with mathematical privacy
Privacy policies should not be guesswork. Instead, use the mathematical framework of differential privacy to have a quantitative and replicable approach to risk management.
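
For reference, the standard definition behind this quantitative approach can be stated as follows. This is the textbook formulation of differential privacy, not a description of how Sarus configures it; the symbols M, D, D', S and ε are the usual notation and do not come from this page.

```latex
% A randomized mechanism M is \varepsilon-differentially private if,
% for all datasets D and D' differing in one individual's record
% and for every set of outputs S:
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```

The parameter ε bounds how much any single individual's data can change the distribution of results, which is what makes the risk measurable and comparable across policies.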

Full logging and auditing trail
Each access and each query goes through a gatekeeper that enforces all privacy settings. Every interaction with sensitive information is logged and available for reporting and auditing.
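
The gatekeeper idea can be sketched as a small class that checks each query against a per-user privacy budget and records every attempt. This is a hedged illustration only: the `Gatekeeper` class, its fields, and the additive budget rule (basic sequential composition of epsilons) are assumptions made for the example, not the Sarus API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


class BudgetExceeded(Exception):
    """Raised when a query would exceed the user's remaining privacy budget."""


@dataclass
class Gatekeeper:
    # Hypothetical policy: total epsilon each user may spend on a dataset.
    epsilon_limits: dict
    spent: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def authorize(self, user: str, query: str, epsilon: float) -> None:
        """Log the request, then accept it only if the budget allows."""
        limit = self.epsilon_limits.get(user, 0.0)  # unknown users get no budget
        used = self.spent.get(user, 0.0)
        allowed = epsilon > 0 and used + epsilon <= limit
        # Every attempt is logged, whether it is accepted or refused.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "query": query,
            "epsilon": epsilon,
            "allowed": allowed,
        })
        if not allowed:
            raise BudgetExceeded(f"{user} cannot spend epsilon={epsilon}")
        self.spent[user] = used + epsilon


if __name__ == "__main__":
    gate = Gatekeeper(epsilon_limits={"alice": 1.0})
    gate.authorize("alice", "SELECT COUNT(*) FROM patients", epsilon=0.4)
    for entry in gate.audit_log:
        print(entry)
```

Logging before the decision is enforced means refused requests also appear in the audit trail.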



Next-generation access control for sensitive data
Data access used to be granted on an all-or-nothing basis. Some users would get full access while the rest had no access at all. With Sarus, it's easy to find the right level for all users and situations based on objective privacy goals.

Replace guesswork with mathematical privacy
Rely on the mathematical framework of differential privacy for a quantitative and replicable approach to data governance, instead of guessing whether a policy protects privacy well enough.



High-fidelity synthetic samples

Synthetic CT scans

Synthetic vs original data
Maximum accuracy is only achievable using the original data. When this is not an option, synthetic data is an efficient alternative. Data practitioners use it to explore row-level information, prepare analyses, and design or debug ML models, and can even export it for use in external applications. Sarus's high-utility synthetic data makes this seamless.

Available by default, private by design
The Sarus Gateway automatically provides synthetic data for all datasets. This synthetic data is generated using differential privacy so that it can be shared with practitioners.

High quality all the time
Synthetic data generated via Sarus is superior to the state of the art in data generation while adapting to any data structure (tabular, text, images, series of transactions). For more on the architecture that supports our synthetic data modelling, check out our paper.


Built for all data science workflows

Snowflake, Google Cloud Platform, Redshift

Use any data source
Connect any data source to the Sarus Gateway and it will be immediately accessible for analytics and AI applications. Sarus is compatible with tabular data, relational data, time-series, images, text, and more in most common formats.

Compatible with all main data environments and libraries
Sarus supports most data science use cases. It leverages existing execution engines (Spark clusters, BigQuery, Synapse SQL, Redshift...) or provides its own. The engines can be used from the most common data science environments and from ML and BI libraries. The built-in Sarus SDK makes it easy to integrate remote data into your existing workflows.

Ready to put all of your data to work?

Get in touch and you'll be up and running in no time.
Get started


32, rue Alexandre Dumas
75011 Paris — France
©2022 Sarus Technologies.
All rights reserved.