Sunday, November 19, 2017

AWS KMS: MetaStore and MostRecentProvider Explained

I couldn't find suitable documentation on this, and it was hard for me when I had to implement a solution using MetaStore (MS) and MostRecentProvider (MRP). This post captures my understanding. Hope it will be helpful.

Together, MS and MRP help improve the performance of an application that uses KMS. This is particularly useful if you have a use case where lots of calls are made to DynamoDB for data retrieval and there are stringent NFRs (non-functional requirements) to adhere to.

Before going ahead with MS and MRP, let me give an overview of how encryption/decryption happens using DirectKmsMaterialProvider. This is one of the EncryptionMaterialsProvider implementations, and it is the one shown in one of the AWS blogs.

Let's say I store customer data in DynamoDB and each row has 20 fields. Out of the 20 fields, I'm encrypting 10 as they are PII (Personally Identifiable Information).
With DirectKmsMaterialProvider, every record that is encrypted or decrypted results in a call to KMS: the provider asks KMS for a unique data key for each item and then protects the fields of that item locally with it. This proves to be a very costly operation. Think of a scenario where we have thousands of calls made per second and there are hundreds or thousands of records to be retrieved in a single call within, say, 10 ms. Every record written to or read from DynamoDB carries the additional latency of a KMS round trip for encryption or decryption respectively. This defeats the purpose of using DynamoDB for performance reasons.

AWS has multiple EncryptionMaterialsProvider implementations. The whole issue here is due to the way envelope encryption happens in DirectKmsMaterialProvider. Before proceeding further, if you wish to brush up on the basic concepts of KMS, visit my blog here.

MetaStore:

Alright, treat MS as some metadata. This is a simple collection of EncryptionMaterialProviders backed by an encrypted DynamoDB table. MS can be used to build key hierarchies or meta providers. Cool, but not a good enough explanation? Maybe because I just copied the above lines from the MS documentation. Let's go a step further to understand what it is.

MS is used to store encrypted dataKeys. Simple: a collection of encrypted dataKeys. Where do we store them? In a DynamoDB table. Let's look at the constructor of MS:


public MetaStore(final AmazonDynamoDB ddb, final String tableName,
            final DynamoDBEncryptor encryptor) 

This takes an AmazonDynamoDB client that holds the connection details, tableName, which is the DynamoDB table that will hold the encrypted dataKeys, and a DynamoDBEncryptor, which wraps the EncryptionMaterialsProvider used to encrypt the entries in that table.

Creating an instance of MS binds it to the DynamoDB table with the given tableName. As far as I can tell from the class, the constructor does not create the table; MS offers a static createTable helper for that. This class has other APIs to manage MS.
In this post, MS is used in conjunction with MRP. I've used Spring to create the MS instance. I will provide a snippet after explaining MRP that will help in understanding how these two classes work together.

MostRecentProvider:

As the name suggests, this class is used to encrypt data with the most recent version of the key materials from MS. It is also able to decrypt with whichever version was used to encrypt. Every dataKey in MS has a version attached to it, and MRP always encrypts using the latest version. I will explain this in detail shortly.

For now, let's look at the constructor of MostRecentProvider.

public MostRecentProvider(final ProviderStore keystore, final String materialName, final long ttlInMillis)

So, the first argument is the MS we created, followed by the materialName and ttlInMillis arguments. Let me say a few words about the last two parameters.

materialName: The dataKeys in MS can be grouped under a materialName, and that group can be used to protect a certain table. Using this, we can have separate dataKeys for each table if needed, or have multiple levels of MS, each protecting data under its own materialName.

ttlInMillis: We know that what is stored in the MS DynamoDB table is encrypted dataKeys. Encrypted dataKeys are not useful for encryption/decryption until they are decrypted. So, dataKeys are decrypted using the CMK and are then cached in memory to avoid repeated calls to AWS KMS. How often the cache is refreshed is configurable, and ttlInMillis gives that value in milliseconds. Simple.
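
To make the caching idea concrete, here is a minimal sketch in plain Java of a value that is reloaded only after its TTL has expired. This is not the MostRecentProvider source. TtlCacheSketch, the loader and the injectable clock are names I made up for illustration, and the loader merely stands in for "read the encrypted dataKey from MS and decrypt it through KMS".

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch of TTL-based caching; NOT the library's MostRecentProvider code.
class TtlCacheSketch<T> {
    private final Supplier<T> loader;   // assumption: stands in for the expensive MS + KMS lookup
    private final long ttlInMillis;     // how long a loaded value stays fresh
    private final LongSupplier clock;   // injectable clock in milliseconds, handy for tests
    private T cached;
    private long loadedAt;
    private boolean loaded;

    TtlCacheSketch(Supplier<T> loader, long ttlInMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlInMillis = ttlInMillis;
        this.clock = clock;
    }

    // Returns the cached value, reloading it only on first use or once the TTL has elapsed.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlInMillis) {
            cached = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return cached;
    }

    public static void main(String[] args) {
        int[] expensiveLoads = {0};
        TtlCacheSketch<String> cache = new TtlCacheSketch<>(
                () -> "dataKey-load-" + (++expensiveLoads[0]),
                60_000L,
                System::currentTimeMillis);
        for (int i = 0; i < 1000; i++) {
            cache.get();
        }
        System.out.println("lookups: 1000, expensive loads: " + expensiveLoads[0]);
    }
}
```

The trade-off is the usual one for a TTL: a longer value means fewer expensive lookups, while a shorter value means a newly created key version is picked up sooner.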

Now we know how MS and MRP are linked. Let me explain a bit more about some of the APIs to manage MS.

For example, let's say I implemented MS with a DynamoDB table named KeyStoreTable. There is an API in MS named newProvider. The application calls this API whenever a new dataKey is to be created. Note that the dataKey is created under the given materialName.

public EncryptionMaterialsProvider newProvider(final String materialName)  
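
The versioning idea behind materialName can be pictured with a small in-memory sketch. This is only an illustration under my own assumptions: the real MS persists its entries to DynamoDB and hands back providers, whereas VersionedStoreSketch and its method names (newVersion, maxVersion, get) are hypothetical and simply store strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the versioning idea; NOT the MetaStore implementation.
class VersionedStoreSketch {
    // materialName -> key materials, where the list index is the version number
    private final Map<String, List<String>> store = new HashMap<>();

    // Mirrors the idea of newProvider(materialName): add the next version and return its number.
    long newVersion(String materialName, String keyMaterial) {
        List<String> versions = store.computeIfAbsent(materialName, k -> new ArrayList<>());
        versions.add(keyMaterial);
        return versions.size() - 1;
    }

    // Latest version for a materialName, or -1 when nothing has been created yet.
    long maxVersion(String materialName) {
        List<String> versions = store.get(materialName);
        return versions == null ? -1 : versions.size() - 1;
    }

    // Looks up one specific version, which is what decrypting an older item needs.
    String get(String materialName, long version) {
        List<String> versions = store.get(materialName);
        if (versions == null || version < 0 || version >= versions.size()) {
            throw new IllegalArgumentException("No such material: " + materialName + "#" + version);
        }
        return versions.get((int) version);
    }

    public static void main(String[] args) {
        VersionedStoreSketch ms = new VersionedStoreSketch();
        ms.newVersion("customerTable", "key-material-A");
        ms.newVersion("customerTable", "key-material-B");
        long latest = ms.maxVersion("customerTable");
        System.out.println("encrypt with version " + latest + ": " + ms.get("customerTable", latest));
        System.out.println("older items decrypt with version 0: " + ms.get("customerTable", 0));
    }
}
```

New writes always use the highest version, while each stored item remembers the version it was written with, so older items stay readable after a new dataKey is created.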


Code snippet showing how MS and MRP are stitched together using Spring config.


@Bean
public AWSCredentialsProvider awsCredentialsProvider(){
    return new InstanceProfileCredentialsProvider();
}

@Bean
public AmazonDynamoDBClient amazonDynamoDBClient() {
    AmazonDynamoDBClient amazonDynamoDBClient = new AmazonDynamoDBClient(awsCredentialsProvider());
    return amazonDynamoDBClient;
}

@Bean
public AWSKMS awsKms(){
    AWSKMS awsKMS = new AWSKMSClient(awsCredentialsProvider());
    awsKMS.setEndpoint(kmsEndpoint);
    return awsKMS;
}

@Bean
public EncryptionMaterialsProvider directKMSMaterialsProvider(){
    return new DirectKmsMaterialProvider(awsKms(), kmsCustomerMasterKeyAlias);
}

@Bean
public DynamoDBEncryptor dynamoDBEncryptor(AttributeEncryptor attributeEncryptor){
    return attributeEncryptor.getEncryptor();
}

@Bean
public AttributeEncryptor attributeEncryptor(EncryptionMaterialsProvider encryptionMaterialsProvider){
    return new AttributeEncryptor(encryptionMaterialsProvider);
}

@Bean
public MetaStore metaStore(){
    //Attribute encryptor with DirectKMSMaterialsProvider
    AttributeEncryptor attributeEncryptor = new AttributeEncryptor(directKMSMaterialsProvider());
    return new MetaStore(amazonDynamoDBClient(), keyStoreTable, dynamoDBEncryptor(attributeEncryptor));
}

@Bean
public EncryptionMaterialsProvider mostRecentProvider(MetaStore providerStore, String materialName, String ttlInMills){
    return new MostRecentProvider(providerStore, materialName, Long.parseLong(ttlInMills));
}

@Bean
public DynamoDB dynamoDB() {
    return new DynamoDB(amazonDynamoDBClient());
}

@Bean
public  DynamoDBMapper dataMapper() {

    DynamoDBMapperConfig mapperConfig = cacheManagerConfig.prepareMapperConfigWithTableNameResolver(CACHE_MANAGER_PROFILE);
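    // MostRecentProvider expects its TTL in milliseconds (ttlInMillis), so the value configured
    // for ttlInSecs has to be in milliseconds despite the field name.
    EncryptionMaterialsProvider mostRecentProvider = mostRecentProvider(metaStore(), materialName, ttlInSecs);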

    return new DynamoDBMapper(amazonDynamoDBClient(), mapperConfig, attributeEncryptor(mostRecentProvider));
}

The snippet should be self-explanatory. Hope this is helpful! If you have any queries, do get back to me in the comments or at my mail ID: mail2vinay.bs@gmail.com


AWS KMS: Explained

AWS Key Management Service (AWS KMS) is a managed service used to create and control the encryption keys used to encrypt data. AWS KMS is integrated with other AWS services such as EBS, S3 and DynamoDB.

Key Concepts of AWS KMS at a glance


Customer Master Keys

The primary resource of AWS KMS is the Customer Master Key (CMK). CMKs are either customer-managed or AWS-managed. A CMK can be used to protect up to 4 KB of data directly, but the better way is to use the CMK to protect dataKeys, which in turn are used to protect the actual data. CMKs never leave AWS KMS unencrypted.
For a customer-managed CMK, automatic rotation can be enabled, which rotates the key once a year.

Data Keys

Data keys are used to protect the actual data. In envelope encryption, the plaintext data key is what encrypts/decrypts the actual data, while the data key itself is encrypted by the CMK so that it can be stored safely alongside the data. AWS KMS offers APIs to create data keys. While AWS KMS APIs can be used to generate, encrypt and decrypt data keys, AWS KMS does not store, manage or track your data keys. That has to be done in the application.

Envelope Encryption

AWS KMS uses envelope encryption to protect data. Envelope encryption is the practice of encrypting plaintext data with a unique data key, and then encrypting the data key with a key encryption key (KEK). There can be multiple levels of KEKs. That is, we can choose to encrypt a KEK with another KEK. But ultimately, the last KEK has to be encrypted by a master key. A master key is an unencrypted (plaintext) key with which you can decrypt one or more other keys.
In KMS, the master key is called the Customer Master Key (CMK).
Envelope encryption offers the following advantages.
  1. It protects data keys.
  2. It gives the option to encrypt the same data under multiple master keys.
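
The flow can be sketched locally with the JDK's own crypto classes. Treat this as an illustration under loud assumptions: EnvelopeSketch and its helpers are names I made up, AES-GCM is my choice of cipher, and the "master key" here is just a local AES key standing in for a CMK, which in real KMS never leaves the service.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical local sketch of envelope encryption; it does NOT call AWS KMS.
class EnvelopeSketch {
    // What gets stored: the data key in encrypted form, next to the encrypted data.
    static final class Envelope {
        final byte[] encryptedDataKey;
        final byte[] ciphertext;
        Envelope(byte[] encryptedDataKey, byte[] ciphertext) {
            this.encryptedDataKey = encryptedDataKey;
            this.ciphertext = ciphertext;
        }
    }

    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // AES-GCM encryption; the random 12-byte IV is prepended to the output.
    static byte[] seal(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] body = cipher.doFinal(plaintext);
            byte[] out = new byte[iv.length + body.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses seal(): reads the IV from the first 12 bytes, then decrypts the rest.
    static byte[] open(SecretKey key, byte[] sealed) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, sealed, 0, 12));
            return cipher.doFinal(sealed, 12, sealed.length - 12);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypt the data with a fresh data key, then encrypt that data key with the master key.
    static Envelope encrypt(SecretKey masterKey, byte[] data) {
        SecretKey dataKey = newAesKey();
        byte[] ciphertext = seal(dataKey, data);
        byte[] encryptedDataKey = seal(masterKey, dataKey.getEncoded());
        return new Envelope(encryptedDataKey, ciphertext);
    }

    // Recover the data key with the master key, then use it to decrypt the data.
    static byte[] decrypt(SecretKey masterKey, Envelope envelope) {
        byte[] rawDataKey = open(masterKey, envelope.encryptedDataKey);
        return open(new SecretKeySpec(rawDataKey, "AES"), envelope.ciphertext);
    }

    public static void main(String[] args) {
        SecretKey masterKey = newAesKey(); // stand-in for the CMK
        Envelope envelope = encrypt(masterKey, "some PII value".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(masterKey, envelope), StandardCharsets.UTF_8));
    }
}
```

Only the small data key is ever handed to the master key; the bulk data is encrypted locally with the data key.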

The following image provides an overview of how envelope encryption works in AWS KMS.

[Figure: Envelope encryption]


Encryption Context

All AWS KMS cryptographic operations (encryption/decryption) accept an optional set of key-value pairs that can contain additional contextual information about the data. This set of key-value pairs is called the encryption context. The encryption context used for encryption must also be supplied for decryption, otherwise the decryption fails. The encryption context is not secret. It can be logged and can be used for auditing and for controlling access to AWS KMS API operations.
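
The "same context or no decryption" behaviour can be reproduced locally, since the KMS documentation describes the encryption context as additional authenticated data (AAD). The sketch below is an assumption-laden illustration with the JDK's AES-GCM, not a KMS call: EncryptionContextSketch, its helper names and the way the map is serialised are all made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical local sketch: a key-value context bound to the ciphertext as AAD.
class EncryptionContextSketch {
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Sorted serialisation so equal maps always give the same bytes.
    // Assumption: keys and values do not contain '=' or ';' (good enough for a sketch).
    static byte[] contextBytes(Map<String, String> context) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : new TreeMap<>(context).entrySet()) {
            sb.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // The context is authenticated but not encrypted; the 12-byte IV is prepended to the output.
    static byte[] encrypt(SecretKey key, byte[] plaintext, Map<String, String> context) {
        try {
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            cipher.updateAAD(contextBytes(context));
            byte[] body = cipher.doFinal(plaintext);
            byte[] out = new byte[iv.length + body.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Throws IllegalStateException when the context differs or the ciphertext was altered.
    static byte[] decrypt(SecretKey key, byte[] sealed, Map<String, String> context) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, sealed, 0, 12));
            cipher.updateAAD(contextBytes(context));
            return cipher.doFinal(sealed, 12, sealed.length - 12);
        } catch (Exception e) {
            throw new IllegalStateException("context mismatch or tampered ciphertext", e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        Map<String, String> context = new HashMap<>();
        context.put("table", "Customer");
        byte[] sealed = encrypt(key, "some PII value".getBytes(StandardCharsets.UTF_8), context);
        System.out.println(new String(decrypt(key, sealed, context), StandardCharsets.UTF_8));

        Map<String, String> wrongContext = new HashMap<>();
        wrongContext.put("table", "Orders");
        try {
            decrypt(key, sealed, wrongContext);
        } catch (IllegalStateException e) {
            System.out.println("decryption refused: " + e.getMessage());
        }
    }
}
```

Because the context is only authenticated and not encrypted, it can be logged in the clear, which matches the point above that the encryption context is not secret.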


Refer to this link to understand how Envelope Encryption Works.

Source:
http://docs.aws.amazon.com/kms/latest/developerguide/overview.html
http://docs.aws.amazon.com/kms/latest/developerguide/concepts.html
http://docs.aws.amazon.com/kms/latest/developerguide/workflow.html