Redis 6 ACL in AWS with Terraform

Redis 6 ACL

With version 6, Redis introduced the Access Control List (ACL) feature, which lets us improve the security of our Redis instances by creating users and specifying their permissions. This way you can restrict each user to the minimal set of keys and commands it needs. For backward compatibility the default user is still available, so upgrading to Redis 6 doesn't require any changes on the application side until you decide to use the new approach. Once you have created users specific to your applications, the default user can be disabled so that unauthenticated connections are no longer possible.

Redis 6 is now available in AWS, where the ACL feature is called Role-Based Access Control (RBAC). The name may be confusing, because it has nothing in common with IAM Roles. AWS introduced an API for creating users in ElastiCache, so it's possible to set up your Redis 6 cluster using CloudFormation, boto3 or Terraform.

Let's start by creating a simple Redis Cluster using CloudFormation to check what has been added.

Basic setup with CloudFormation

In the following example I create a new Redis 6 cluster with the user demo-redis-user. The default user is still there for now and, as you can see, it needs to be listed in the UserIds field of AWS::ElastiCache::UserGroup. Otherwise the following error appears:

"User group needs to contain a user with the user name default"

So here it is, some code:

  DemoUserPassword:
    Type: AWS::SecretsManager::Secret
    Properties:
      GenerateSecretString:
        ExcludePunctuation: true
        PasswordLength: 128
  DemoRedisUser:
    Type: AWS::ElastiCache::User
    Properties:
      Engine: "redis"
      AccessString: "on ~* -@all +@read +@write"
      UserName: "demo-redis-user"
      UserId: "demo-redis-user"
      Passwords:
        - !Sub '{{resolve:secretsmanager:${DemoUserPassword}:SecretString}}'
  RedisUserGroup:
    Type: AWS::ElastiCache::UserGroup
    Properties:
      Engine: "redis"
      UserGroupId: "demo-redis-user-group"
      UserIds:
        - !Ref DemoRedisUser
        - "default"
  DemoReplicationGroup:
    Type: AWS::ElastiCache::ReplicationGroup
    Properties:
      Engine: "redis"
      EngineVersion: "6.x" # For Redis 6 it can be 6.x for latest or specific version like 6.2
      CacheNodeType: "cache.t3.micro"
      ReplicationGroupDescription: "Demo Replication Group"
      NumCacheClusters: 2
      AutomaticFailoverEnabled: true
      TransitEncryptionEnabled: true # Required for user group based access
      SecurityGroupIds:
        - !GetAtt DemoSecurityGroup.GroupId
      UserGroupIds:
        - !Ref RedisUserGroup

As you can see, AWS::ElastiCache::User takes up to two passwords for our new user (or none at all, if NoPasswordRequired is set to true). TransitEncryptionEnabled also needs to be true when you want to use ACL users, so that the connection is encrypted. Otherwise you will get an error during resource creation.

AccessString allows you to restrict a user's actions to particular keys and commands. The full syntax is described in the Redis ACL documentation (it's slightly limited in AWS, so also check the ElastiCache RBAC documentation). This user with

AccessString: "on ~* -@all +@read +@write"

can read and write any key. All other commands are prohibited, including the management ones, so you can be sure that no application will call a management command by accident.
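To get a feeling for how such a rule string is read, here is a minimal Python sketch that models only the rules used in this article (on/off, ~pattern, +@category, -@category). It is an illustration, not how Redis or ElastiCache evaluates ACLs: the names AccessRules, parse_access_string and is_allowed are made up for this example, and Python's fnmatch is assumed to be a close enough stand-in for Redis glob patterns.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class AccessRules:
    """Simplified, hypothetical model of an ACL access string."""
    enabled: bool = False
    key_patterns: list = field(default_factory=list)
    categories: set = field(default_factory=set)


def parse_access_string(access_string: str) -> AccessRules:
    """Parse the small subset of rules that appears in this article."""
    rules = AccessRules()
    for token in access_string.split():
        if token == "on":
            rules.enabled = True
        elif token == "off":
            rules.enabled = False
        elif token.startswith("~"):
            # Key pattern, e.g. ~* or ~racoon/*
            rules.key_patterns.append(token[1:])
        elif token == "-@all":
            # Start from "nothing is allowed"
            rules.categories.clear()
        elif token.startswith("+@"):
            rules.categories.add(token[2:])
        elif token.startswith("-@"):
            rules.categories.discard(token[2:])
        else:
            raise ValueError(f"rule not covered by this sketch: {token}")
    return rules


def is_allowed(rules: AccessRules, category: str, key: str) -> bool:
    """True if an enabled user may run a command of `category` on `key`."""
    if not rules.enabled:
        return False
    # "+@all" is modeled as a wildcard; subtracting from it is not modeled.
    if "all" not in rules.categories and category not in rules.categories:
        return False
    return any(fnmatchcase(key, pattern) for pattern in rules.key_patterns)
```

With the access string above, is_allowed(parse_access_string("on ~* -@all +@read +@write"), "write", "foo") is allowed, while the same call with the "admin" category is rejected.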

OK, so here is the thing: keep your secrets secret :D You can do that with AWS Secrets Manager. To use the newly generated secret you just need a CloudFormation dynamic reference:

!Sub '{{resolve:secretsmanager:${DemoUserPassword}:SecretString}}'

This way the secret never appears in your template or in your repository: CloudFormation resolves it on the server side, and afterwards it is available only in Secrets Manager. Your application can then fetch the secret by calling the secretsmanager:GetSecretValue action. Of course, you need to create a proper IAM Role first.
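On the application side, fetching the secret can be as small as the sketch below. The helper name fetch_redis_password is made up for this example, and the client is passed in as an argument so the function can be tried without AWS access; in a real application it would be boto3.client("secretsmanager").

```python
def fetch_redis_password(secrets_client, secret_arn: str) -> str:
    """Return the plain-text password stored under `secret_arn`.

    `secrets_client` is expected to behave like boto3.client("secretsmanager").
    """
    response = secrets_client.get_secret_value(SecretId=secret_arn)
    # GetSecretValue returns either SecretString or SecretBinary;
    # a generated password is stored as a string.
    if "SecretString" not in response:
        raise ValueError("secret does not contain a SecretString")
    return response["SecretString"]


# Typical usage (requires boto3 and the secretsmanager:GetSecretValue permission):
#   import boto3, os
#   password = fetch_redis_password(boto3.client("secretsmanager"),
#                                   os.environ["SECRET_ARN"])
```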

OK, the user is created, so now it's time to disable the default user. We don't want to create a user with a super secret password and still allow anybody to access our cluster:

  DisabledDefaultUser:
    Type: AWS::ElastiCache::User
    Properties:
      NoPasswordRequired: true
      Engine: "redis"
      AccessString: "off ~* -@all"
      UserName: "default" # Has to be 'default' to swap with the default user
      UserId: "disabled-default-user"
  RedisUserGroup:
    Type: AWS::ElastiCache::UserGroup
    Properties:
      Engine: "redis"
      UserGroupId: "demo-redis-user-group"
      UserIds:
        - !Ref DemoRedisUser
        - !Ref DisabledDefaultUser

So the built-in default user disappeared from the UserIds list: it is swapped with the new one. The new disabled user must have its UserName field set to default in order to replace the existing one, and its UserId MUST BE DIFFERENT from default. Otherwise CloudFormation will return an error.

A password is not required either, because this user will never be used, so NoPasswordRequired can be set to true. The AccessString is:

AccessString: "off ~* -@all"

And the default user is switched off.
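The rules for swapping the default user are easy to get wrong, so here they are once more as a small Python check. This is only a sketch of the constraints described above; the function names and the dictionary layout are made up for this example and have nothing to do with the AWS API.

```python
def group_problems(users):
    """Check a user group given as a list of {'user_id', 'user_name'} dicts."""
    problems = []
    if not any(user["user_name"] == "default" for user in users):
        problems.append("user group needs a user with the user name 'default'")
    return problems


def replacement_default_problems(user):
    """Check a user that is meant to replace the built-in default user."""
    problems = []
    if user["user_name"] != "default":
        problems.append("UserName has to be 'default'")
    if user["user_id"] == "default":
        problems.append("UserId has to differ from 'default'")
    if "off" not in user["access_string"].split():
        problems.append("AccessString should contain 'off' to disable the user")
    return problems
```

For the disabled-default-user defined above, both functions return an empty list.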

Side Note

As you can see, I didn't specify a SubnetGroup, so this Redis cluster will be created in the default VPC. I also removed the cluster security group from the listing to keep it small. Remember that in a production environment you should deploy your cluster in a custom VPC, in private subnets, and with a security group that allows access only from your application.

Time for Terraform

Terraform is very powerful and used by many organizations. In my opinion it's easier to manage bigger infrastructure with Terraform, because it lets you create modules, load files, use a template engine and much more. So let's try to create our Redis cluster with Terraform.

Wait, there is one problem: Terraform talks to the AWS API and stores everything in its state file, including secrets. If you follow the same approach and create the password using Secrets Manager, it will be visible in the state file. Of course you can use the S3 backend and store your state file in an encrypted S3 bucket, but the secret will still be stored in two different places, and a user with access to this bucket may be able to read the secret even without permission to read Secrets Manager values. We would like to avoid this situation.
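If you want to check whether an existing state file already contains ElastiCache user passwords, a rough Python sketch could look like this. It assumes the JSON layout of current state files (resources, then instances, then attributes) and that the aws_elasticache_user resource stores its passwords under the passwords attribute; the function name find_stored_passwords is made up for this example.

```python
import json


def find_stored_passwords(state: dict) -> list:
    """Return addresses of aws_elasticache_user resources with passwords in state."""
    found = []
    for resource in state.get("resources", []):
        if resource.get("type") != "aws_elasticache_user":
            continue
        for instance in resource.get("instances", []):
            # A non-empty "passwords" attribute means the secret sits in the state
            if instance.get("attributes", {}).get("passwords"):
                found.append(f'{resource["type"]}.{resource["name"]}')
                break
    return found


# Typical usage against a local state file:
#   with open("terraform.tfstate") as state_file:
#       print(find_stored_passwords(json.load(state_file)))
```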

So here is my solution: use the Terraform resource aws_cloudformation_stack to create the user with its password, then export the stack outputs and read them in Terraform:

Parameters:
  UserName:
    Type: String
Resources:
  DemoUserPassword:
    Type: AWS::SecretsManager::Secret
    Properties:
      GenerateSecretString:
        ExcludePunctuation: true
        PasswordLength: 128
  DemoRedisUser:
    Type: AWS::ElastiCache::User
    Properties:
      Engine: "redis"
      AccessString: "on ~* -@all +@read +@write"
      UserName: !Ref UserName
      UserId: !Ref UserName
      Passwords:
        - !Sub '{{resolve:secretsmanager:${DemoUserPassword}:SecretString}}'
Outputs:
  DemoRedisUserId:
    Value: !Ref DemoRedisUser
  DemoRedisUserPasswordArn:
    Value: !Ref DemoUserPassword

Now the password is resolved on the AWS side, and the stack returns only the user ID and the secret's ARN, which we can pass to our application using environment variables.
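If you want to convince yourself that nothing else leaves the stack, you can look at its outputs from Python. The sketch below only reshapes the response of CloudFormation's DescribeStacks call (boto3: describe_stacks) into a dictionary; the helper name stack_outputs is made up for this example.

```python
def stack_outputs(describe_stacks_response: dict) -> dict:
    """Turn the Outputs list of the first stack into an {OutputKey: OutputValue} dict."""
    stack = describe_stacks_response["Stacks"][0]
    return {
        output["OutputKey"]: output["OutputValue"]
        for output in stack.get("Outputs", [])
    }


# Typical usage (requires boto3 and permission to describe the stack):
#   import boto3
#   response = boto3.client("cloudformation").describe_stacks(StackName="redis-user")
#   print(stack_outputs(response))
```

For the template above the result should contain just two keys, DemoRedisUserId and DemoRedisUserPasswordArn, and no password.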

Because Terraform can load files, you can just write the CloudFormation YAML file and load it:

resource "aws_cloudformation_stack" "redis_user" {
  name = "redis-user"
  template_body = file("${path.module}/user.yaml")
  parameters = {
    UserName = local.user_name
  }
}

Then you can create disabled-default-user and use both users together in the UserGroup:

resource "aws_elasticache_user" "disabled_default_user" {
  access_string = "off ~* -@all"
  engine = "REDIS"
  no_password_required = true
  user_id = "disabled-default-user"
  user_name = "default"
}

resource "aws_elasticache_user_group" "redis_user_group" {
  engine = "REDIS"
  user_group_id = "demo-user-group"
  user_ids = [
    lookup(aws_cloudformation_stack.redis_user.outputs, "DemoRedisUserId", "default"),
    aws_elasticache_user.disabled_default_user.id
  ]
}

Outputs from the user stack are available through the outputs attribute, which is a map. Terraform provides the lookup function to retrieve the value of a single element from a map.

Finally you can put it all together in the ReplicationGroup:

resource "aws_elasticache_replication_group" "demo-replication-group" {
  replication_group_description = "Demo Replication Group"
  replication_group_id = "demo-replication-group"
  engine = "redis"
  engine_version = "6.x"
  node_type = "cache.t3.micro"
  number_cache_clusters = 2
  automatic_failover_enabled = true
  transit_encryption_enabled = true
  security_group_ids = [aws_security_group.demo-security-group.id]
  user_group_ids = [aws_elasticache_user_group.redis_user_group.id]
}

This way Terraform will create the CloudFormation stack and manage it, and you will be able to deploy everything with just:

terraform apply

Now you can access the newly created Redis cluster from your app. Here is some Python code for AWS Lambda:

import os

import boto3
import certifi
import redis

def handle(event, context):
    port = int(os.environ['REDIS_PORT'])
    url = os.environ['REDIS_URL']
    user = os.environ['USER_NAME']
    secret_arn = os.environ['SECRET_ARN']
    # Fetch the password at runtime; only the secret's ARN is passed to the function
    sm_client = boto3.client('secretsmanager')
    user_pass = sm_client.get_secret_value(SecretId=secret_arn)['SecretString']
    client = redis.Redis(host=url,
                         port=port,
                         ssl=True,
                         ssl_ca_certs=certifi.where(),
                         username=user,
                         password=user_pass,
                         decode_responses=True)  # return str, not bytes, so Lambda can serialize the result
    client.set('foo', 'bar')
    return client.get('foo')

Because encryption in transit is enabled, ssl=True is required.
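The same TLS requirement can also be expressed with a connection URL: redis-py treats the rediss:// scheme (note the double s) as a TLS connection. A small sketch, where build_rediss_url is a helper made up for this example and the credentials are percent-encoded in case they contain special characters:

```python
from urllib.parse import quote


def build_rediss_url(user: str, password: str, host: str, port: int, db: int = 0) -> str:
    """Build a TLS ("rediss://") connection URL with percent-encoded credentials."""
    return (
        f"rediss://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )


# Typical usage (requires redis-py and certifi):
#   import certifi, redis
#   client = redis.Redis.from_url(build_rediss_url(user, user_pass, url, port),
#                                 ssl_ca_certs=certifi.where())
```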

A proper IAM Role is also required, otherwise the Lambda will not be able to use the secretsmanager:GetSecretValue action:

data "aws_iam_policy_document" "lambda_assume_role" {
  statement {
    effect = "Allow"
    actions = ["sts:AssumeRole"]
    principals {
      identifiers = ["lambda.amazonaws.com"]
      type        = "Service"
    }
  }
}

data "aws_iam_policy_document" "allow-secrets-manager-get-value" {
  statement {
    actions = ["secretsmanager:GetSecretValue"]
    effect = "Allow"
    resources = [var.secret_arn]
  }
}

resource "aws_iam_role" "lambda_role" {
  assume_role_policy = data.aws_iam_policy_document.lambda_assume_role.json
  inline_policy {
    name = "allow-secrets-manager-get-value" // Name required, will create empty policy otherwise
    policy = data.aws_iam_policy_document.allow-secrets-manager-get-value.json
  }
}

Side Note

As you can see, this ReplicationGroup also creates its Redis instances in a public subnet of the default VPC, so an AWS Lambda deployed without VPC configuration will not be able to access it. The Lambda needs to run in the same VPC. A Lambda in a VPC doesn't get a public IP, though, so it cannot reach the Internet or AWS services and will not be able to fetch the secret from Secrets Manager. To fix that, the Lambda needs to be created in a private subnet with a route to a NAT Gateway.

Conclusion

You can improve the security of your ElastiCache Redis cluster by migrating to Redis 6 and introducing ACL (remember to disable the default user).

A Redis cluster can now also be easily shared between services, as you can restrict a user's access to a particular key prefix:

AccessString: "on ~racoon/* -@all +@read +@write"

This way the Racoon microservice will be able to access only keys starting with racoon/, so there is no need to worry about it touching another service's keys anymore!
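On the application side it helps to make the prefix impossible to forget. One way, sketched below, is a thin wrapper that prepends the prefix to every key before calling the real client. The PrefixedKeys class is made up for this example and wraps only set and get; anything with those two methods (such as a redis.Redis instance) can be passed in.

```python
class PrefixedKeys:
    """Wrap a Redis-like client so that every key gets a fixed prefix."""

    def __init__(self, client, prefix: str):
        self._client = client
        self._prefix = prefix

    def _key(self, name: str) -> str:
        return f"{self._prefix}{name}"

    def set(self, name: str, value):
        return self._client.set(self._key(name), value)

    def get(self, name: str):
        return self._client.get(self._key(name))


# Typical usage with the client from the Lambda example:
#   racoon = PrefixedKeys(client, "racoon/")
#   racoon.set("foo", "bar")   # stored under the key "racoon/foo"
```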