Cloud KMS Fundamentals for Enterprise: Part 1
As a Cloud Security Engineer at Google Cloud, I get asked about Key Management Service (KMS) all the time, as clients migrating to the cloud have to figure out how to map controls from their data center into the cloud. This two-part blog is meant as a primer on some very important concepts in encryption as they relate to the cloud in general, and to Google Cloud specifically. We’ll cover topics that build upon one another:
- Encryption basics
- Envelope encryption as it relates to KMS
- Client-Side vs Server-Side encryption and which one is actually useful
- Default Encryption, CMEK, HSM, CSEK and EKM
In Part 2, we’ll aggregate this knowledge and talk about patterns for Cloud KMS at scale for enterprises. These can all be fairly challenging topics but we’ll take it step by step and get there together!
Encryption Basics
Cryptography can be used for many different things, from hashing passwords and Bitcoin ledgers to signing binaries and encrypting your hard drive. When we talk about encryption, there are generally two types: symmetric and asymmetric.
Symmetric encryption uses a single key for both encryption and decryption operations. Asymmetric encryption uses two keys, a public and private key, where the public key is used for encryption and the private key used for decryption.
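To make the distinction concrete, here is a minimal Python sketch that uses only the standard library. Both ciphers are illustrative assumptions chosen so the example runs anywhere: an XOR keystream stands in for a real symmetric cipher, and textbook RSA with tiny primes stands in for a real asymmetric one. Neither is secure, and a real system would rely on a vetted library instead.

```python
import hashlib

def symmetric_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: the same key and function both encrypt and decrypt."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

# Textbook RSA with tiny primes (insecure, for illustration only).
p, q, e = 61, 53, 17
n = p * q                           # modulus, shared by both keys
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (Python 3.8+)

def rsa_encrypt(m: int) -> int:
    """Anyone holding the public key (e, n) can encrypt."""
    return pow(m, e, n)

def rsa_decrypt(c: int) -> int:
    """Only the holder of the private key (d, n) can decrypt."""
    return pow(c, d, n)

# Symmetric: one shared key in both directions.
shared_key = b"one key for both directions"
ciphertext = symmetric_xor(shared_key, b"hello")
assert symmetric_xor(shared_key, ciphertext) == b"hello"

# Asymmetric: encrypt with the public key, decrypt with the private key.
assert rsa_decrypt(rsa_encrypt(65)) == 65
```

The thing to notice is who needs which secret: in the symmetric case both sides must protect the same key, while in the asymmetric case only the decrypting side holds anything sensitive.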
Usually when we talk about KMS we’re referring to symmetric encryption, although Cloud KMS does support asymmetric encryption too. The KMS is responsible for generating and managing that single encryption key, and it also acts as a barrier to the key itself. This means if I have a file I want to encrypt, I would send that file to the KMS and it would respond with the ciphertext (the encrypted file). Because I can use the key without ever seeing it, it becomes very difficult to “leak” a key.
Since we’re talking about communication, now seems like a decent time to talk about the difference between encryption in transit and at rest. Generally when we talk about encryption at rest, we’re referring to mitigating the threat of someone gaining access to our stored files on a disk. Encryption in transit generally deals with mitigating the threat of someone gaining access to our network traffic and eavesdropping; this is usually called a “man-in-the-middle” attack. Encryption in transit is usually accomplished by internet protocols like TLS or IPsec. KMS systems generally deal with encryption at rest. However, with a KMS you still need to ensure that communications are encrypted in transit because the messages are inherently sensitive, i.e. they are the plaintext that needs to be encrypted.
This brings up the question, if I can only “use” my key but not “see” it, and I have to upload plaintext to have it encrypted, how can I fully encrypt a disk or a large file? Or what if I have to encrypt a field in every row of a database as it passes through a data processing pipeline? Do I have to make API calls to the KMS each time? This is where the concept of envelope encryption comes into play.
Envelope encryption
Simply put, envelope encryption is where the key you use to encrypt your data (the Data Encryption Key, or DEK) is itself encrypted by another key, the Key Encryption Key (KEK). You may have heard these terms in reference to Google Cloud KMS already. Something you might not know is that this extra layer of encryption provides no extra security; what it provides is easier management and lower latency. We’ll talk about two similar use cases:
File System Encryption
Let’s start with the simple example of encrypting a sensitive directory on your computer, say your home directory with all your photos. If we make use of a cloud-based KMS, we wouldn’t want to upload the entire home directory file by file and get the encrypted values returned to us, because it would take far too long. Instead we would generate a DEK to encrypt the files locally. But what do we do with that DEK so it’s stored securely in our operating system? Using our KMS, we could upload only the DEK to be encrypted by a new key in the KMS, called the KEK, that we would never see. Now when we want to decrypt our files, the operation looks like this:
- Take the file that includes our encrypted DEK
- Upload that file to Cloud KMS to decrypt it and get the plaintext DEK in response
- Use the decrypted DEK to decrypt our files
You never need to have access to the KEK at all, and the KEK can be rotated without re-encrypting all your data: at most, the small encrypted DEK needs to be re-encrypted under the new key version.
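The steps above can be sketched in a few lines of Python. Everything here is a stand-in under stated assumptions: ToyKms is a hypothetical in-memory class rather than the Cloud KMS client, toy_cipher is an insecure XOR keystream used in place of something like AES-GCM, and the version byte in the wrapped DEK is this sketch’s own convention.

```python
import hashlib
import secrets

def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream. Stand-in for AES-GCM; NOT secure."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

class ToyKms:
    """Hypothetical in-memory KMS: callers can use the KEK but never read it."""

    def __init__(self) -> None:
        self._keks = [secrets.token_bytes(32)]  # KEK versions, newest last

    def rotate(self) -> None:
        self._keks.append(secrets.token_bytes(32))

    def encrypt(self, plaintext: bytes) -> bytes:
        version = len(self._keks) - 1
        nonce = secrets.token_bytes(16)
        body = toy_cipher(self._keks[version], nonce, plaintext)
        return bytes([version]) + nonce + body

    def decrypt(self, blob: bytes) -> bytes:
        version, nonce, body = blob[0], blob[1:17], blob[17:]
        return toy_cipher(self._keks[version], nonce, body)

kms = ToyKms()

# Encrypt locally with a fresh DEK, then wrap only the DEK with the KEK.
dek = secrets.token_bytes(32)
file_nonce = secrets.token_bytes(16)
encrypted_file = toy_cipher(dek, file_nonce, b"holiday photo bytes")
wrapped_dek = kms.encrypt(dek)
del dek  # only the wrapped DEK is stored next to the data

# Decrypt: one small KMS call unwraps the DEK, the rest happens locally.
plaintext = toy_cipher(kms.decrypt(wrapped_dek), file_nonce, encrypted_file)

# Rotate the KEK: re-wrap the 32-byte DEK and leave the file untouched.
kms.rotate()
rewrapped_dek = kms.encrypt(kms.decrypt(wrapped_dek))
```

Note that only 32 bytes ever travel to the KMS, no matter how large the directory is, and that rotation touches the wrapped DEK but not encrypted_file.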
Field-Level Encryption
Envelope encryption can also be used in a data-processing environment, where we have lots of entries being processed. This is especially important in the financial or healthcare fields. If you don’t use envelope encryption, you’ll be making one API call to KMS for every row you want to encrypt. This would not only make you hit your API quota very quickly but also massively slow down your pipeline, since every call pays the network round-trip time to the KMS. The same process as above can be used here. You would generate one or more DEKs to encrypt your data row by row locally, then encrypt the DEK(s) with a KEK that only the KMS has access to. Now when your data processing system starts up, it only needs to make one API call to the KMS to decrypt the DEK(s) and can then handle encryption operations locally with the DEK(s).
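To see the difference in API calls, here is a small sketch that counts them. CountingKms is a hypothetical stand-in for a KMS, the rows and the account field are made up, and toy_cipher is an insecure XOR keystream standing in for a real cipher; only the call counts are the point.

```python
import hashlib
import secrets

def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream. Stand-in for AES-GCM; NOT secure."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

class CountingKms:
    """Hypothetical KMS stand-in that counts the API calls it receives."""

    def __init__(self) -> None:
        self._kek = secrets.token_bytes(32)
        self.calls = 0

    def encrypt(self, plaintext: bytes) -> bytes:
        self.calls += 1
        nonce = secrets.token_bytes(16)
        return nonce + toy_cipher(self._kek, nonce, plaintext)

    def decrypt(self, blob: bytes) -> bytes:
        self.calls += 1
        return toy_cipher(self._kek, blob[:16], blob[16:])

def encrypt_field(dek: bytes, row_id: int, value: str) -> bytes:
    """Encrypt one field locally, using the row id as the per-row nonce."""
    return toy_cipher(dek, row_id.to_bytes(8, "big"), value.encode())

def decrypt_field(dek: bytes, row_id: int, blob: bytes) -> str:
    """Reverse encrypt_field with the same DEK and row id."""
    return toy_cipher(dek, row_id.to_bytes(8, "big"), blob).decode()

rows = [{"id": i, "account": f"ACCT-{i:06d}"} for i in range(1000)]

# Naive approach: one KMS round trip per row.
naive_kms = CountingKms()
naive_rows = [naive_kms.encrypt(row["account"].encode()) for row in rows]

# Envelope approach: the wrapped DEK was created ahead of time...
kms = CountingKms()
wrapped_dek = kms.encrypt(secrets.token_bytes(32))
kms.calls = 0  # ...so count only what the pipeline itself does

dek = kms.decrypt(wrapped_dek)  # the single KMS call at startup
encrypted_rows = [encrypt_field(dek, row["id"], row["account"]) for row in rows]

print(naive_kms.calls, kms.calls)  # prints "1000 1"
```

With a real KMS each of those 1000 calls would also pay a network round trip, which is where the latency savings come from.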
Envelope Encryption in Google Cloud
The best way to describe how envelope encryption works in Google Cloud is in the below diagram found in the Encryption Whitepaper.
The general idea here is the same as the previous two examples except it all happens transparently to the user. This paradigm is what we call Server Side or Transparent encryption.
Client Side vs. Server Side Encryption
When we say encryption is transparent, that means you only need access to the data to see it, not to the key. So while this ensures that your data is encrypted at the disk level, it doesn’t require that the end user have access to the key. Assuming that it does is a common misconception in any cloud, not just GCP. Let’s take a look at a common example of securing a GCS bucket and how client-side encryption differs from server-side encryption.
Server-Side Encryption
When we use the term server-side encryption (SSE), we’re referring to “Attaching a CMEK to the bucket” or the default encryption provided by Google, since under the hood they are doing the same thing. The difference is that with CMEK, you control the algorithm, rotation, etc.
With SSE, users only need permission to read the bucket to see the contents, not the KMS key that encrypts it (unless you’re using CSEK, which we’ll get to in a moment). This means that even if you use a CMEK backed by an HSM (presumably more secure), you are the same number of steps and permissions away from the bucket being made public. The primary threat that SSE is aiming to protect against is a government actor subpoenaing the drives from a Google data center in the case of some terrorist threat or other malfeasance. This is why typically I’d only recommend using CMEKs for SSE if you are under some regulatory authority that enforces tighter control over your keys. Otherwise, they don’t provide much extra protection over default encryption.
Client-Side Encryption
In contrast to SSE, Client-Side encryption (CSE) does provide an extra layer of security for your data and does ensure that data users must have access to both your data and the key before they can see it in plaintext.
In this model, you would first use the KMS key to encrypt your file or data and subsequently upload the ciphertext. This ensures that even if someone accidentally or maliciously made your bucket public, only encrypted data would be leaked. While this provides significantly better control than SSE, it can also be tricky to manage and generally requires some engineering to get right at scale. However, with tools like Tink that make envelope encryption easy, you too can get CSE working for your data pipelines or other sensitive workloads.
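The public-bucket scenario is easy to simulate. In this sketch ToyBucket is a hypothetical object store that mimics transparent server-side encryption, the object names are invented, and toy_cipher is an insecure XOR keystream used only so the example runs with the standard library. In practice the client key would itself be wrapped by a KMS key, as in envelope encryption.

```python
import hashlib
import secrets

def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with a SHAKE-256 keystream. Stand-in for AES-GCM; NOT secure."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

class ToyBucket:
    """Hypothetical object store with transparent server-side encryption.

    Objects are encrypted on "disk", but anyone allowed to read an object
    gets the decrypted bytes back without ever touching a key.
    """

    def __init__(self) -> None:
        self._server_key = secrets.token_bytes(32)
        self._disk = {}
        self.public = False

    def upload(self, name: str, data: bytes) -> None:
        nonce = secrets.token_bytes(16)
        self._disk[name] = (nonce, toy_cipher(self._server_key, nonce, data))

    def download(self, name: str, has_read_permission: bool = False) -> bytes:
        if not (self.public or has_read_permission):
            raise PermissionError(name)
        nonce, stored = self._disk[name]
        return toy_cipher(self._server_key, nonce, stored)  # transparent decrypt

bucket = ToyBucket()
secret = b"patient record 42"

# SSE only: upload plaintext and rely on the bucket's own encryption.
bucket.upload("sse.txt", secret)

# CSE: encrypt before upload; the key never reaches the bucket.
client_key = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)
bucket.upload("cse.bin", nonce + toy_cipher(client_key, nonce, secret))

bucket.public = True  # the misconfiguration we are worried about
leaked_sse = bucket.download("sse.txt")  # anonymous read: plaintext
leaked_cse = bucket.download("cse.bin")  # anonymous read: ciphertext only
recovered = toy_cipher(client_key, leaked_cse[:16], leaked_cse[16:])
```

Both objects were encrypted at rest the whole time, yet only the client-side encrypted one stays unreadable once the bucket is public.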
Default Encryption, CMEK, HSM, CSEK and EKM
So many acronyms and so little time! Let’s try to quickly de-mystify the above in practical security terms of SSE and CSE that we just covered.
Default Encryption
Everything is encrypted at rest in Google Cloud, period. This is not merely a default; it’s something you can’t turn off. Default encryption uses envelope encryption as we showed above: Google manages the KEK in its own key management service, and that KEK encrypts the DEK, which is what actually encrypts your data. Generally these keys can be shared between customers, which doesn’t play nicely with regulatory requirements that oblige enterprises to maintain tighter control over their own keys. Enter CMEK, CSEK and EKM.
CMEK
A CMEK or Customer Managed Encryption Key can be used in both SSE and CSE. When you create a new key using gcloud kms keys create you are creating a CMEK. That same key can be used with gcloud kms encrypt/decrypt and also be attached to a bucket, GCE disk, etc. Using this same key with gcloud would fall under the CSE model, whereas attaching it to a data store would fall under SSE. You can also import your own key material to be used for the CMEK, which is useful for migrations.
Cloud HSM
Cloud HSM is a “behind the scenes” service that changes how KMS generates and stores your keys. According to Google’s Encryption at rest in Google Cloud white paper, KMS generally stores keys (CMEKs and Default Encryption keys) encrypted in memory and shared via a distributed system across multiple data centers for redundancy. Cloud HSM instead generates your KEK in, and stores it in, an HSM (Hardware Security Module). This allows keys to be FIPS 140-2 Level 3 compliant: they live in validated, tamper-resistant hardware and are generated from a true analog random number generator as opposed to one that gains entropy from comparably predictable digital sources.
To use Cloud HSM, you simply need to specify your key’s protection level as “HSM” (--protection-level=hsm with gcloud) and you’re good to go. Just keep in mind that this option costs significantly more.
CSEK
An often referenced, but rarely used, option for SSE is CSEK, or Customer Supplied Encryption Keys. The reason it is rarely used is that it’s incredibly limiting and difficult to manage. CSEK basically allows you to supply an AES-256 encryption key with every request to a bucket. Without the correct key, the GCS service will not be able to give you your data. This is why I mentioned that this is the one exception to SSE being transparent. You may be asking, why not use CSEK for all the things if it’s more secure? Well, three reasons really:
- CSEK is only supported (at the time of this writing) in GCS and GCE
- You now need a secret management service that can securely store keys for you so you can fetch them every time you need to access data. KMS won’t work for this since it doesn’t give you access to key material that CSEK operations need.
- You are necessarily allowing data viewers access to raw key material just to view data, which you wouldn’t need to do if you just used CMEK in the CSE model.
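To illustrate what “supplying a key with every request” involves, here is a minimal sketch that builds the request headers for a customer-supplied key. The csek_headers helper is hypothetical, and the header names are the ones I understand the Cloud Storage APIs to use for CSEK, so check them against the current documentation before relying on this.

```python
import base64
import hashlib
import secrets

def csek_headers(key: bytes) -> dict:
    """Build customer-supplied encryption key headers for a storage request."""
    if len(key) != 32:
        raise ValueError("CSEK expects a 256-bit (32-byte) AES key")
    key_hash = hashlib.sha256(key).digest()
    return {
        "x-goog-encryption-algorithm": "AES256",
        # The raw key itself, base64-encoded, travels with the request.
        "x-goog-encryption-key": base64.b64encode(key).decode(),
        # The hash lets the service check that the key arrived intact.
        "x-goog-encryption-key-sha256": base64.b64encode(key_hash).decode(),
    }

key = secrets.token_bytes(32)  # you must store and protect this yourself
headers = csek_headers(key)
for name in headers:
    print(name)
```

Every reader and writer of the object needs that raw key in hand, which is exactly the management burden the three reasons above describe.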
EKM
A newer product in this space is called EKM, or External Key Manager. This is similar to a Cloud HSM-backed key, but it gives you the ability to back your CMEK with a third-party cloud-based HSM. This is really best for customers who do not trust Google Cloud but still want to put their data in it. It allows you to encrypt your data with CMEK just like before, but every time that key is used, it makes an API call to the external HSM to handle the encryption. This means if you suspect that Google might comply with a government order to hand over data, you can simply remove access to that API and Google then has no way of decrypting your data. Now to be clear, if a government actor really wants your data, they can usually just subpoena the third-party HSM solution you’re using (assuming it’s in the same country), so really all this does is buy you some time.
Conclusion
In this article we’ve talked about some of the fundamentals of KMS, how it works in cloud in general and then more specifically in Google Cloud. In the next part of this series, we’ll talk about some different practices for large enterprises to scale out their KMS usage and some things to be aware of from a security perspective.
Next Steps
Now that you have an idea of how KMS works in GCP, take a look at Part 2 of this blog series to understand how to scale your Cloud KMS usage.