This article explains how Snowflake uses encryption to secure your data. For most people, encryption can be confusing: jargon gets thrown around that sounds good, but many people do not understand it thoroughly.
Our goal is to make the terminology clear enough that you can ask the right questions and seek additional information when it is relevant.
Here is the breakdown of this article:
- The Basics of Data Encryption
- Snowflake Encryption in Transit
- Snowflake Encryption At Rest
- How Snowflake Encryption Works
- Snowflake Hierarchical Key Model
- Tri-Secret Secure & Customer-Managed Keys
- Encryption of Specific Data
The Basics of Data Encryption
Data encryption converts data into a different form, or code, that only users who have a secret key or password can read. It is frequently automated as part of a data platform’s other procedures.
In the context of a data cloud (or a cloud data warehouse), encryption is separated into two areas: securing the data sent to and from the data cloud, known as encryption of data in transit, and securing the data stored in tables, known as encryption of data at rest. Both have become ubiquitous in recent years to mitigate real risks to your data.
In this context, Snowflake encrypts all customer data by default, using the latest security standards, at no additional cost. Snowflake provides best-in-class key management, which is entirely transparent to customers. This encryption makes Snowflake one of the easiest-to-use and most secure data platforms available.
Snowflake Encryption in Transit
When a user sends a query to Snowflake, it is delivered over the internet, and the result set is returned to the user the same way. Before reaching its destination, data travels across several networks, including the user’s home or office network, their Internet Service Provider (ISP), a global network carrier, a public cloud network, and so on.
Even when using a VPN (Virtual Private Network), the query must still leave the user’s computer and travel to Snowflake. Because there are so many networks between users and Snowflake, someone could intercept data while it is in transit by acting as a “man in the middle.”
A man-in-the-middle (MITM) attack occurs when someone tries to intercept the data you send to or receive from a service (in this case, Snowflake) to steal your credentials or to modify the data without your knowledge, causing you to make incorrect data-driven decisions. Encryption is one of the strategies used to defend against MITM attacks because it prevents attackers from accessing clear-text data sent over the network.
Today’s information security community prioritizes the widespread use of the most current encryption standards, and Snowflake complies with these guidelines. Like many other online services today, Snowflake encrypts all communications by default and does not allow any communications that are not encrypted.
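Snowflake’s own clients take care of encrypted connections for you, so the following is only a general illustration and not Snowflake’s client code. It is a minimal sketch, using Python’s standard ssl module, of what "refuse unencrypted or unverified connections" can look like on the client side. The function name and the TLS 1.2 minimum are assumptions chosen for the example.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that rejects weak or unverified connections."""
    # The default context already verifies the server certificate and hostname,
    # which is what stops a man in the middle from impersonating the service.
    ctx = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = strict_tls_context()
print(ctx.verify_mode, ctx.check_hostname, ctx.minimum_version)
```

A context like this can be passed to standard-library clients such as http.client.HTTPSConnection, which will then fail the handshake instead of falling back to a weaker connection.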
But, once you have sent your data to Snowflake, what happens to it? Continue reading to find out.
Snowflake Encryption At Rest
Snowflake provides end-to-end encryption (E2EE) to ensure that only end users and the Snowflake runtime components have access to user data. Because the data is encrypted at rest and only decrypted in the memory of the Snowflake runtime components, even the cloud provider where the user’s Snowflake account is deployed cannot read it.
Data can be loaded into Snowflake using one of two techniques. The first method is to use SQL statements such as INSERT and UPDATE. Snowflake’s built-in encryption of data in transit guarantees the integrity and privacy of these operations.
The second option is to import data into Snowflake by uploading files, which involves two steps: uploading the files into a staging area and copying them from there into Snowflake tables. Unloading data from Snowflake into a file is a third, optional step.
File Uploads to Staging Areas
Before a data file is loaded into a table in Snowflake, you must first upload it to a staging area. Even if only authenticated users have access to staging areas, your data’s integrity and confidentiality are jeopardized if your credentials are compromised. Fortunately, Snowflake and the cloud providers make it simple to avoid that risk by providing built-in data encryption functionality.
There are two types of staging areas supported by Snowflake:
- Snowflake-provided Staging Area or Internal Stage – a Snowflake-managed file system through which users can exchange data with Snowflake. Use Snowflake-managed staging areas when you need to upload files that are not yet in another storage bucket.
- Customer-provided Staging Area or External Stage – a cloud file system directory, such as an Amazon S3 or Google Cloud Storage bucket. Use customer-provided staging areas when the files are already saved in the cloud and need to be imported into Snowflake.
Snowflake-provided Staging Areas:
When files are uploaded to a staging area supplied by Snowflake, Snowflake automatically encrypts them before they are loaded into tables. Snowflake offers internal staging areas in three configurations:
- User Stages – Each user gets their own staging area in Snowflake. Choose this option when data files should be accessible by only one user but may be copied to several tables.
- Table Stages – Each table has its own staging area, which Snowflake allocates. Use this option when data files need to be accessible by several users but are copied to only a single table.
- Named Stages – These are database objects that can be created, configured, and shared, giving users the most options and flexibility.
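To make the three internal stage types concrete, here is a small Python sketch that renders the SQL for the two loading steps (upload to a stage, then copy into a table). The table name, stage name, and file path are hypothetical, and the helper functions are ours. The statements are only built as strings here; in practice, PUT is run from a client such as SnowSQL.

```python
from typing import List


def stage_reference(kind: str, name: str = "") -> str:
    """Return the SQL reference for an internal stage of the given kind."""
    if kind == "user":
        return "@~"          # the current user's own stage
    if kind == "table":
        return f"@%{name}"   # the stage allocated to one table
    if kind == "named":
        return f"@{name}"    # an explicitly created stage object
    raise ValueError(f"unknown stage kind: {kind!r}")


def load_statements(local_path: str, stage: str, table: str) -> List[str]:
    """Render the two steps: upload the file to the stage, then copy it into the table."""
    return [
        f"PUT file://{local_path} {stage}",
        f"COPY INTO {table} FROM {stage}",
    ]


# Hypothetical example: load a local CSV through the table stage of "orders".
statements = load_statements("/tmp/orders.csv", stage_reference("table", "orders"), "orders")
for statement in statements:
    print(statement)
```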
Customer-provided Staging Areas:
When uploading data to customer-provided or external stages, users can choose whether or not to encrypt the files. Snowflake manages data decryption by interacting with the cloud providers’ storage services’ native capabilities. While each cloud provider’s range of capabilities varies, the following choices are commonly available:
- No Encryption – The data is transferred from the client and saved on disk in the cloud provider’s data center in cleartext. While this approach is the simplest, it is obviously not recommended.
- Server-side Encryption – The data is transferred from the client to the cloud provider, which then encrypts it before storing it on disk. This option is a good choice because it balances operational cost and security.
- Client-side Encryption – The data is encrypted on the client before being uploaded to the cloud. This option is the safest choice because clear-text information never reaches the cloud; however, it requires more planning and work on the customer’s part.
Each of these methods has more than one flavor, each with its own pros and cons. Generally speaking, the more control an organization has over the encryption process, the more responsibility and effort it must take on.
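As one illustration of how these choices surface in practice, the sketch below maps them to the ENCRYPTION clause of an external stage on Amazon S3, as we understand Snowflake’s CREATE STAGE options. The mode names and the helper function are our own, and other cloud providers use different TYPE values, so check the current Snowflake documentation before relying on it.

```python
def encryption_clause(mode: str, key: str = "") -> str:
    """Render an ENCRYPTION clause for an S3 external stage (illustrative only)."""
    if mode == "none":
        return "ENCRYPTION = (TYPE = 'NONE')"
    if mode == "server_side":
        # The cloud provider encrypts with keys it manages.
        return "ENCRYPTION = (TYPE = 'AWS_SSE_S3')"
    if mode == "server_side_kms":
        # The cloud provider encrypts with a key held in its key management service.
        suffix = f" KMS_KEY_ID = '{key}'" if key else ""
        return f"ENCRYPTION = (TYPE = 'AWS_SSE_KMS'{suffix})"
    if mode == "client_side":
        # The customer encrypts before upload, so a master key must be supplied.
        if not key:
            raise ValueError("client-side encryption requires a master key")
        return f"ENCRYPTION = (TYPE = 'AWS_CSE' MASTER_KEY = '{key}')"
    raise ValueError(f"unknown mode: {mode!r}")


print(encryption_clause("server_side"))
print(encryption_clause("client_side", key="<base64-encoded key>"))
```

Note how the amount of information the customer has to supply, and therefore manage, grows with the level of control.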
The Use of Storage Integration
Rather than embedding credentials explicitly, it is better to connect to external storage by creating a storage integration between Snowflake and the public cloud where the external stage’s storage is located. This option provides the following security advantages:
- When running CREATE STAGE, users do not need to pass credentials in their queries.
- To have more control over where data is loaded from and unloaded to, users can explicitly specify the allowed locations for their stages.
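The following Python sketch renders what such a setup might look like for Amazon S3. The integration name, role ARN, bucket paths, and stage name are all hypothetical, and the helpers only build SQL strings. The point to notice is that the stage refers to the integration by name, so no secret appears in the CREATE STAGE statement.

```python
from typing import List


def storage_integration_sql(name: str, role_arn: str, allowed_locations: List[str]) -> str:
    """Render a CREATE STORAGE INTEGRATION statement for S3 (illustrative only)."""
    locations = ", ".join(f"'{loc}'" for loc in allowed_locations)
    return (
        f"CREATE STORAGE INTEGRATION {name}\n"
        "  TYPE = EXTERNAL_STAGE\n"
        "  STORAGE_PROVIDER = 'S3'\n"
        "  ENABLED = TRUE\n"
        f"  STORAGE_AWS_ROLE_ARN = '{role_arn}'\n"
        f"  STORAGE_ALLOWED_LOCATIONS = ({locations})"
    )


def stage_sql(stage: str, integration: str, url: str) -> str:
    """Render a CREATE STAGE statement that uses the integration instead of credentials."""
    return (
        f"CREATE STAGE {stage}\n"
        f"  STORAGE_INTEGRATION = {integration}\n"
        f"  URL = '{url}'"
    )


# Hypothetical names and locations, for illustration only.
integration_stmt = storage_integration_sql(
    "s3_int",
    "arn:aws:iam::001234567890:role/example-role",
    ["s3://example-bucket/load/"],
)
stage_stmt = stage_sql("my_s3_stage", "s3_int", "s3://example-bucket/load/")
print(integration_stmt)
print(stage_stmt)
```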
How Snowflake Encryption Works
Snowflake can read data files uploaded to a staging area and transform the data into tables.
A table is stored in one or more files that Snowflake saves using the cloud provider’s storage service. Snowflake automatically encrypts table files, each with a distinct data encryption key, to limit the amount of data that any one key can access. With so many tables and files, each encrypted with a different key, keeping track of all the keys could be difficult. Fortunately, Snowflake uses key wrapping, also called envelope encryption, to make this easier.
In key wrapping, the key used to encrypt a file is stored alongside the file. To keep this data encryption key safe, it is itself encrypted with a higher-level key that is kept private. When the file has to be decrypted, the wrapped data encryption key is retrieved from the file and decrypted with the higher-level key before being used to decrypt the file’s contents.
Key wrapping eliminates the need to store every encryption key in clear text in a secure location, such as a key management service. Instead, only the higher-level key is kept in a safe place.
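The idea can be sketched in a few lines of Python. To keep the example runnable with the standard library alone, it uses a toy cipher (XOR with a SHA-256 keystream) as a stand-in; this is an assumption of the sketch, is not Snowflake’s implementation, and must not be used to protect real data, where a vetted cipher such as AES belongs. All function and field names are ours.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Toy stand-in for a real cipher; the
    same call both encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def encrypt_file(plaintext: bytes, master_key: bytes) -> dict:
    """Encrypt with a fresh data key, then store that key wrapped beside the data."""
    data_key = secrets.token_bytes(32)  # a distinct key for every file
    file_nonce = secrets.token_bytes(16)
    wrap_nonce = secrets.token_bytes(16)
    return {
        "ciphertext": toy_cipher(data_key, file_nonce, plaintext),
        "file_nonce": file_nonce,
        "wrapped_key": toy_cipher(master_key, wrap_nonce, data_key),
        "wrap_nonce": wrap_nonce,
    }


def decrypt_file(envelope: dict, master_key: bytes) -> bytes:
    """Unwrap the data key with the higher-level key, then decrypt the contents."""
    data_key = toy_cipher(master_key, envelope["wrap_nonce"], envelope["wrapped_key"])
    return toy_cipher(data_key, envelope["file_nonce"], envelope["ciphertext"])


master_key = secrets.token_bytes(32)  # the only key that needs a safe place
original = b"id,amount\n1,9.99\n"
envelope = encrypt_file(original, master_key)
restored = decrypt_file(envelope, master_key)
print(restored == original)
```

Only master_key has to be protected separately. Everything in the envelope, including the wrapped data key, can sit next to the file.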
Snowflake Hierarchical Key Model
Snowflake also employs a form of key wrapping to control the encryption of table files. Keys are grouped in a hierarchical scheme (see the figure “Snowflake’s Hierarchical Key Model”) that contains the following levels:
- File Keys – used for the encryption or decryption of individual table files. File keys are encrypted and stored with the files.
- Table Master Keys – used for the encryption or decryption of the file keys. The table master keys are encrypted and saved in the table metadata.
- Account Master Keys – used for the encryption and decryption of the table master keys. Account master keys are encrypted and saved in the metadata of the account.
- Root Keys – used for the encryption and decryption of the account master keys. Root keys are stored only in a Hardware Security Module (HSM) and never leave it. The HSM is where all cryptographic operations take place. The root key is the only key that is saved in clear text, which is why it never leaves the HSM.
Figure: Snowflake’s Hierarchical Key Model
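The chain of four levels can be modeled with a short Python sketch in which each key is stored wrapped by the key one level above it. The wrapping function is a toy (XOR with a hash-derived pad) that we assume purely for illustration; it is not how Snowflake or an HSM wraps keys, and the level names are our own labels.

```python
import hashlib
import secrets


def wrap(parent_key: bytes, child_key: bytes, label: bytes) -> bytes:
    """Toy key wrap: XOR a 32-byte child key with a pad derived from its parent."""
    pad = hashlib.sha256(parent_key + label).digest()
    return bytes(c ^ p for c, p in zip(child_key, pad))


unwrap = wrap  # XOR is its own inverse, so unwrapping is the same operation

# Levels below the root, from highest to lowest.
LEVELS = ["account_master", "table_master", "file"]

root_key = secrets.token_bytes(32)  # in the real system this stays inside the HSM
keys = {level: secrets.token_bytes(32) for level in LEVELS}

# Store every key encrypted by the key one level above it.
stored = {}
parent = root_key
for level in LEVELS:
    stored[level] = wrap(parent, keys[level], level.encode())
    parent = keys[level]


def recover_file_key(root: bytes, stored_keys: dict) -> bytes:
    """Walk down the hierarchy, unwrapping one level at a time."""
    key = root
    for level in LEVELS:
        key = unwrap(key, stored_keys[level], level.encode())
    return key


print(recover_file_key(root_key, stored) == keys["file"])
```

Without the root key, none of the stored (wrapped) keys can be unwrapped, which is why protecting that single key protects the whole hierarchy.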
Tri-Secret Secure & Customer-Managed Keys
For organizations that need more control over the keys used to secure their data, Snowflake includes an additional feature called Tri-Secret Secure, which incorporates customer-owned keys into Snowflake’s key hierarchy.
With Tri-Secret Secure, users create their own root key using the cloud provider’s Key Management Service (KMS) and allow Snowflake to access it. Snowflake combines the customer-generated root key and the Snowflake-generated root key to build a composite account master key. To unwrap the composite account master key, Snowflake needs access to both its HSM and the customer-controlled KMS.
With Tri-Secret Secure, Snowflake cannot decrypt your data without your permission to access the root key in your KMS. This also means that, in the event of a data breach, you can prevent Snowflake from decrypting any data, thereby shutting down all data processing in your Snowflake account.
However, as previously stated, if you are in charge of encryption keys, you must ensure that your company is willing to take on the responsibility of keeping them secure and accessible. If you fail to do so, you will lose your data.
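The effect of a composite key can be illustrated with a small Python sketch. How Snowflake actually combines the two keys is not described in this article, so the HMAC-based combination below is purely an assumption made for the example, as are all the names. What the sketch shows is the consequence: without the customer’s key, the composite key cannot be rebuilt.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def composite_master_key(snowflake_key: bytes, customer_key: Optional[bytes]) -> bytes:
    """Derive one key from both parties' keys (illustrative combination only)."""
    if customer_key is None:
        # Models the customer revoking access to the key in their KMS.
        raise PermissionError("customer-managed key is not accessible")
    return hmac.new(snowflake_key, customer_key, hashlib.sha256).digest()


snowflake_key = secrets.token_bytes(32)  # held by the provider
customer_key = secrets.token_bytes(32)   # held in the customer's KMS

composite = composite_master_key(snowflake_key, customer_key)

# After revocation, the composite key can no longer be derived.
try:
    composite_master_key(snowflake_key, None)
    revocation_blocks_decryption = False
except PermissionError:
    revocation_blocks_decryption = True

print(revocation_blocks_decryption)
```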
Encryption of Specific Data
In all of the encryption methods outlined above, Snowflake has access to the keys it needs to decrypt user data, whether they are customer-provided or generated by Snowflake. Thus, there is a chance that an attacker will access some of your data if those keys are compromised.
Another option for gaining even more control over how your data is protected is never sharing your keys with Snowflake. Application-level encryption is a type of client-side encryption in which the decryption keys are not shared with the server in advance, if at all.
Application-level encryption encrypts data before it is loaded into Snowflake, and the data is saved in encrypted form in Snowflake tables. For this to work, the client must encrypt the data at the field level rather than the file level. When done correctly, Snowflake will have no idea that the data is even encrypted.
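A minimal Python sketch of field-level encryption follows. As a loudly stated assumption, it uses a toy cipher (XOR with a SHA-256 keystream) so that it runs with the standard library alone; a real application would use a vetted authenticated cipher instead. The record layout and function names are hypothetical.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Toy stand-in for a real cipher."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def encrypt_field(key: bytes, value: str) -> str:
    """Encrypt one field and return it as hex text that fits an ordinary string column."""
    nonce = secrets.token_bytes(16)
    return (nonce + toy_cipher(key, nonce, value.encode("utf-8"))).hex()


def decrypt_field(key: bytes, stored: str) -> str:
    """Reverse encrypt_field on the client after reading the value back."""
    raw = bytes.fromhex(stored)
    nonce, ciphertext = raw[:16], raw[16:]
    return toy_cipher(key, nonce, ciphertext).decode("utf-8")


app_key = secrets.token_bytes(32)  # stays with the application, never sent to the server

row = {"customer_id": 42, "email": "ada@example.com"}
# Only the sensitive field is encrypted; the rest of the row is loaded as usual.
row_to_insert = {**row, "email": encrypt_field(app_key, row["email"])}
print(row_to_insert)
```

To the database, the email column simply holds a string, which is the sense in which the server does not know the data is encrypted. The trade-off is that the server cannot filter or join on the clear-text value.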
Data encryption is a critical component of overall security. Each technological advancement makes security an even more essential obligation to ensure the safety of your data.
In this regard, Snowflake is built to safeguard user data from vulnerabilities across the whole architecture, including the cloud platform. Snowflake takes on the responsibility of protecting your data, whether it is being sent to or received from Snowflake. Ultimately, Snowflake offers a variety of encryption options, allowing you to tailor your data security measures to your security requirements.