Desilo Data Clean Room (DCR) is a data analytics platform that provides a secure space for fast and easy collaboration on distributed sensitive data. With the underlying technology of homomorphic encryption, Desilo DCR inherently eliminates data leakage risk and retains data quality, while enabling effective data analytics through an easy-to-use graphical user interface.
Homomorphic encryption is a state-of-the-art form of encryption that allows various computations, from statistical analytics to basic machine learning algorithms, to be performed directly on encrypted data. Decrypting the result of the ciphertext computation gives the same result as running the computation on the plaintext, hence the term ‘homomorphic.’
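To make the idea concrete, here is a minimal sketch of an additively homomorphic scheme. It uses a toy version of the Paillier cryptosystem with tiny hard-coded primes, chosen only because it fits in a few lines. This article does not say which scheme Desilo DCR uses, so treat this purely as an illustration of the property and not as Desilo's implementation. The function names (keygen, encrypt, add_encrypted, decrypt) are made up for this example, and the key size is far too small to be secure.

```python
import math
import random

def keygen(p=293, q=433):
    """Build a toy Paillier key pair from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid shortcut because the generator is g = n + 1
    return n, (lam, mu)   # public key n, secret key (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under the public key n."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:  # the random mask must be invertible mod n
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add_encrypted(n, c1, c2):
    """Multiplying two ciphertexts adds the plaintexts hidden inside them."""
    return (c1 * c2) % (n * n)

def decrypt(n, secret, c):
    """Recover the plaintext; only the holder of the secret key can do this."""
    lam, mu = secret
    u = pow(c, lam, n * n)
    return (((u - 1) // n) * mu) % n

public, secret = keygen()
c_total = add_encrypted(public, encrypt(public, 1200), encrypt(public, 3400))
print(decrypt(public, secret, c_total))  # 4600
```

The party calling add_encrypted only ever sees ciphertexts and the public key, yet the decrypted output equals 1200 + 3400. Schemes that support richer analytics also allow multiplication on ciphertexts, which this toy example does not.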
Existing cloud services such as Amazon AWS, Google Cloud Platform, and Microsoft Azure store data in encrypted format, but when they have to process the data, they typically decrypt it to its raw, original form.
In contrast, with Desilo DCR, the data stays encrypted even while it is being processed. This means that no raw data ever leaves its original location: only the encrypted, unintelligible ciphertext travels through the analytics journey, and it stays encrypted the whole time. Any possibility of the raw data falling into the wrong hands is eliminated by design.
Desilo DCR can serve various demands in various domains. For example, take a case where multiple financial institutions try to collaborate for a better credit rating system, so they need to bring their datasets together.
One immediate problem is that these financial datasets are subject to very strong oversight and regulatory compliance. Another problem is that, even if sharing were legally and technically possible, these datasets are each institution’s assets and are often central to that institution’s business value, so each institution would be unwilling to hand over its data, and rightfully so.
With Desilo DCR, however, no one needs to share any raw data: remember, no raw data ever leaves its original location. The collaborators perform the joint analytics and draw on the power of the combined datasets, while the privacy of the customers stays protected and the proprietary data assets are never exposed.
In Desilo DCR, there are two types of users — data providers and data analysts — and the architecture largely consists of three elements — the data provider environment, the computing server, and the analytics interface.
Each data provider has its own data provider environment, where it can manage its datasets. Each data analyst has an account on the analytics interface, where they can create and execute “workflows” on the provided datasets and see the results.
Simplified Architecture of Desilo Data Clean Room
The data provider environment is hosted within the boundaries of the data provider’s control, where the original data lives in its raw and sensitive state. This is also the only place where the provider’s secret key is stored, so the key is never shared with anyone else. The provider environment handles the key management, data pre-processing, data encryption/decryption, and communication with the computing server.
The most important thing is that, in any case, absolutely no raw data ever leaves the data provider environment. It has only two kinds of interactions with the outside — sending out a ciphertext, which can never be decrypted outside this environment, and decrypting a computation result, which has now been rendered insensitive through analytics.
The computing server may be hosted on a public cloud, in a private cloud environment, or on-premises. It has three main functions:
- Communicate with the data provider environment
- Serve the analytics interface to the data analyst users
- Execute the “workflow” (workflow = analytics queries to be computed on ciphertexts)
Obviously, the core is the computation on ciphertext, made possible with homomorphic encryption. There are, however, additional policies and features to complete the picture. For instance, workflows are required to include a statistical function that results in an aggregate value — the scope of the feature offerings makes it impossible to query individual values from the dataset.
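The aggregate-only policy described above can be pictured as a simple check that runs before any workflow is executed. The sketch below is hypothetical: the step names, the validate_workflow function, and the rule that the final step must be an aggregate are assumptions made for illustration, not Desilo DCR's actual API or policy engine.

```python
# Hypothetical step names, loosely based on the kinds of functions a clean room offers.
AGGREGATE_FUNCTIONS = {
    "average", "count", "covariance", "standard_deviation",
    "standard_error", "pearson", "sum", "variance",
}
ROW_LEVEL_OPERATIONS = {"join", "filter", "group_by", "formula"}

def validate_workflow(steps):
    """Reject workflows that could return individual rows instead of an aggregate."""
    if not steps:
        raise ValueError("workflow is empty")
    known = AGGREGATE_FUNCTIONS | ROW_LEVEL_OPERATIONS
    for step in steps:
        if step not in known:
            raise ValueError(f"unknown step: {step}")
    # Assumed rule: the last step must collapse the rows into an aggregate value.
    if steps[-1] not in AGGREGATE_FUNCTIONS:
        raise ValueError("workflow must end in an aggregate function")

validate_workflow(["join", "filter", "average"])  # accepted, raises nothing

try:
    validate_workflow(["join", "filter"])  # would expose row-level values
except ValueError as error:
    print(error)
```

A check like this only complements the encryption. It narrows what an analyst can ask for, while the ciphertext computation protects the data that the question is asked about.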
Once the ciphertext data is uploaded from the data provider environment to the dedicated computing server, the computing server executes the encrypted computations as per the workflow requested by the data analyst. After the encrypted computations are completed, the computing server sends a request to the data providers (the owners of the datasets in the workflow) to decrypt the computation result.
The analytics interface is a web-based GUI attached to the computing server. Through this interface, the data analyst users access the provided datasets, create/execute workflows, and see the results.
Desilo DCR currently supports basic SQL-like functions (e.g. JOIN, FILTER, GROUP BY) and basic statistical functions (e.g. Average, Count, Covariance, Standard Deviation, Standard Error of the Mean, Pearson Correlation Coefficient, Sum, Variance, etc.). Also, the data analysts can define any arithmetic formula with any desired combination of columns from the provided datasets.
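One reason statistics like these suit encrypted computation is that they reduce to a handful of sums and products over the columns, which are the operations homomorphic encryption handles well. The sketch below works on plaintext numbers to show that reduction. It is an illustration only: the function names are made up, the sample (n - 1) convention is an assumption, and the article does not describe how Desilo DCR evaluates these functions internally.

```python
import math

def sufficient_sums(xs, ys):
    """Collect the sums that all the statistics below are built from."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return n, sum_x, sum_y, sum_xx, sum_yy, sum_xy

def statistics_from_sums(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy):
    """Derive common statistics from the sums alone (sample convention, n - 1)."""
    variance_x = (sum_xx - sum_x * sum_x / n) / (n - 1)
    variance_y = (sum_yy - sum_y * sum_y / n) / (n - 1)
    covariance = (sum_xy - sum_x * sum_y / n) / (n - 1)
    return {
        "mean_x": sum_x / n,
        "variance_x": variance_x,
        "std_x": math.sqrt(variance_x),
        "sem_x": math.sqrt(variance_x / n),  # standard error of the mean
        "covariance": covariance,
        # Undefined (division by zero) when either column is constant.
        "pearson": covariance / math.sqrt(variance_x * variance_y),
    }

stats = statistics_from_sums(*sufficient_sums([1, 2, 3, 4], [2, 4, 6, 8]))
print(stats["mean_x"], stats["covariance"], stats["pearson"])
```

In the example, the second column is exactly twice the first, so the Pearson correlation coefficient comes out as 1 (up to floating-point rounding). An analyst-defined arithmetic formula over columns fits the same pattern, since it is again a combination of additions and multiplications.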
The analytics interface is designed such that the data analyst does not need any understanding of the underlying technology. All the technical details (cryptography, privacy, security, compliance) are invisible to the data analyst. These are all built into the platform, by means of encryption, policies, and the scope of feature offerings. In short, it is inherently impossible for the data analyst to cause any harm. So the data analyst can simply focus on their job, that is, the data analytics, and the data safety is ensured by design.
Come and find out more about Desilo at: desilo.ai.