AWS PrivateLink + CloudFormation setup questions from reading the docs

What I'm doing

I am trying to do this:

Launch tasks in a private subnet and make sure you have AWS PrivateLink endpoints configured in your VPC, for the services you need (ECR for image pull authentication, S3 for image layers, and AWS Secrets Manager for secrets).

My understanding is that AWS services act as a "VPC Endpoint Service", and all I need to do is set up an "Interface VPC endpoint" to make my service a "service consumer", as described here: https://docs.aws.amazon.com/vpc/latest/privatelink/vpce-interface.html

I have tried to implement this in CloudFormation, but I have a few questions from reading the documentation.

My Questions

Question 1

The documentation explains how to create the Interface VPC Endpoints, which is great. But it also says: "To turn on private DNS for the interface endpoint, for Enable DNS Name, select the check box." and "This option is turned on by default"

But over here: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ec2-vpcendpoint.html#cfn-ec2-vpcendpoint-privatednsenabled

It says: "Default: false". Which is it?

Question 2

I need to enable 3 ServiceNames. So... do I need to repeat this 3 times? My YAML which repeats the AWS::EC2::VPCEndpoint 3 times is below. Is this really correct? It seems too long / verbose.

  privateVPCEndpoint1:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.dkr
      PrivateDnsEnabled: True
      # "If this parameter is not specified, we attach a default policy that allows full access to the service."
      # PolicyDocument:
      SecurityGroupIds:
        - !Ref ECSSecurityGroupDownloadRedisContainer
      SubnetIds:
        - !Ref privateSubnet1
        - !Ref privateSubnet2
      VpcEndpointType: Interface
      VpcId: !Ref VPC
  privateVPCEndpoint2:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.api
      PrivateDnsEnabled: True
      SecurityGroupIds:
        - !Ref ECSSecurityGroupDownloadRedisContainer
      SubnetIds:
        - !Ref privateSubnet1
        - !Ref privateSubnet2
      VpcEndpointType: Interface
      VpcId: !Ref VPC
  privateVPCEndpoint3:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.s3
      PrivateDnsEnabled: True
      SecurityGroupIds:
        - !Ref ECSSecurityGroupDownloadRedisContainer
      SubnetIds:
        - !Ref privateSubnet1
        - !Ref privateSubnet2
      VpcEndpointType: Interface
      VpcId: !Ref VPC

Question 3

For Security group, select the security groups to associate with the endpoint network interfaces.

Do I use the ECSSecurityGroupDownloadRedisContainer security group which is attached to my ECS Service via NetworkConfiguration / AwsvpcConfiguration / SecurityGroups? If yes, do I need to associate both ECSSecurityGroupDownloadRedisContainer (which allows traffic on 443) and ECSSecurityGroupRedis (which allows traffic on 6379)? I assume the answer is yes, and that only ECSSecurityGroupDownloadRedisContainer is needed, but I don't really know.

Question 4

Can I somehow disable access to ECS on port 443 after the container has been downloaded? I only need access to 6379 for Redis; anything else seems like a security liability to me.

Background: Why I'm Doing This

I am trying to create an ECS cluster + Service + Task, but I am getting the error:

(CannotPullContainerError: inspect image has been retried 5 time(s): failed to resolve ref "docker.io/library/redis:latest": failed to do request: Head https://registry-1.docker.io/v2/library/redis/manifests/latest: dial tcp 34.231.251.252:443: i/o timeout)

Research has pointed me to this post: Aws ecs fargate ResourceInitializationError: unable to pull secrets or registry auth

It has this authoritative answer by AWS employee Nathan Peck, from March of this year: https://stackoverflow.com/a/66802973

They suggest one of three resolutions:

  • Launch tasks into a public subnet, with a public IP address, so that they can communicate to ECR and other backing services using an internet gateway
  • Launch tasks in a private subnet that has a VPC routing table configured to route outbound traffic via a NAT gateway in a public subnet. This way the NAT gateway can open a connection to ECR on behalf of the task.
  • Launch tasks in a private subnet and make sure you have AWS PrivateLink endpoints configured in your VPC, for the services you need (ECR for image pull authentication, S3 for image layers, and AWS Secrets Manager for secrets).

As you know, Redis operates on port 6379, not port 443. My thoughts on these solutions:

  • Option 1 is very dangerous! I should NEVER be forced to expose my database instance to the public internet. So that's out.
  • Option 2 is what I started to implement, and then I realized that this involved exposing and allowing traffic on port 443 in my subnet, routing table, etc. That seems like an unnecessary security risk when I'm only going to be using port 443 at container startup.
  • Option 3 seems like the right thing to do.

Thus, my journey.

Answer

I got this to work. Some answers are below.

For Question 1:

The documentation explains how to create the Interface VPC Endpoints, which is great. But it also says: "To turn on private DNS for the interface endpoint, for Enable DNS Name, select the check box." and "This option is turned on by default"

But over here: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ec2-vpcendpoint.html#cfn-ec2-vpcendpoint-privatednsenabled

It says: "Default: false". Which is it?

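I do not know the answer. What I do know is that you should definitely turn on private DNS, and in CloudFormation that means setting PrivateDnsEnabled explicitly rather than relying on a default. With private DNS on, the service's default hostname (something like ec2.us-east-1.amazonaws.com) resolves to the endpoint; without it, you have to use the endpoint-specific regional or zonal DNS hostnames, which are longer and more complicated. More information: https://docs.aws.amazon.com/vpc/latest/privatelink/vpce-interface.html#access-service-though-endpoint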
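
One related detail that is easy to miss: private DNS on an interface endpoint only takes effect when the VPC itself has DNS support and DNS hostnames enabled. A minimal sketch, assuming the VPC is declared in the same template under the logical ID VPC; the CIDR block is a made-up placeholder.

```yaml
Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16      # placeholder range, substitute your own
      EnableDnsSupport: true      # needed for private DNS on interface endpoints
      EnableDnsHostnames: true    # needed for private DNS on interface endpoints
```

With both attributes set, PrivateDnsEnabled: True on the endpoint lets the default service hostname resolve to the endpoint's private addresses.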

For Question 2: No! I only had to repeat it twice (once for ecr.dkr and once for ecr.api). Working answer:

  privateVPCEndpoint1:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.dkr
      PrivateDnsEnabled: True
      # "If this parameter is not specified, we attach a default policy that allows full access to the service."
      # PolicyDocument:
      #   Version: 2012-10-17
      #   Statement:
      #     - Effect: Allow
      #       Principal: '*'
      #       Action:
      #         - ''
      #       Resource:
      #         - ''
      SecurityGroupIds:
        - !Ref ECSSecurityGroupDownloadRedisContainer
      SubnetIds:
        - !Ref privateSubnet1
        - !Ref privateSubnet2
      VpcEndpointType: Interface
      VpcId: !Ref VPC
  privateVPCEndpoint2:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      ServiceName: !Sub com.amazonaws.${AWS::Region}.ecr.api
      PrivateDnsEnabled: True
      SecurityGroupIds:
        - !Ref ECSSecurityGroupDownloadRedisContainer
      SubnetIds:
        - !Ref privateSubnet1
        - !Ref privateSubnet2
      VpcEndpointType: Interface
      VpcId: !Ref VPC
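
On the verbosity complaint: the two blocks above differ only in the last part of ServiceName, so they could in principle be generated by a loop. The following is a sketch, not something tested against this stack. It assumes the AWS::LanguageExtensions transform is acceptable in the template and would replace the two blocks above rather than sit next to them. The loop name EcrEndpoints, the loop variable ServiceSuffix, and the resulting logical IDs are all hypothetical. The optional S3 resource at the end is also an assumption: the quoted advice mentions S3 for image layers, S3 is published as com.amazonaws.<region>.s3 rather than under ecr, and it is commonly attached as a Gateway endpoint; privateRouteTable is a hypothetical route table for the private subnets.

```yaml
# Sketch only: Fn::ForEach requires the AWS::LanguageExtensions transform.
Transform: AWS::LanguageExtensions

Resources:
  # Intended to expand to two resources: EcrEndpointdkr and EcrEndpointapi.
  Fn::ForEach::EcrEndpoints:
    - ServiceSuffix
    - [dkr, api]
    - EcrEndpoint${ServiceSuffix}:
        Type: AWS::EC2::VPCEndpoint
        Properties:
          # Builds com.amazonaws.<region>.ecr.<suffix>
          ServiceName: !Join
            - "."
            - - com.amazonaws
              - !Ref AWS::Region
              - ecr
              - !Ref ServiceSuffix
          PrivateDnsEnabled: true
          SecurityGroupIds:
            - !Ref ECSSecurityGroupDownloadRedisContainer
          SubnetIds:
            - !Ref privateSubnet1
            - !Ref privateSubnet2
          VpcEndpointType: Interface
          VpcId: !Ref VPC

  # Optional and hypothetical: S3 as a Gateway endpoint for image layers.
  s3GatewayEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      ServiceName: !Sub com.amazonaws.${AWS::Region}.s3
      VpcEndpointType: Gateway
      RouteTableIds:
        - !Ref privateRouteTable   # hypothetical route table for the private subnets
      VpcId: !Ref VPC
```

For only two endpoints the explicit version above is arguably easier to read; the loop starts to pay off when more interface endpoints (for example Secrets Manager) are added to the list.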

For Question 3:

That is exactly what you should do: use ECSSecurityGroupDownloadRedisContainer, and only that group. I seem to have been correct.

For Question 4:

No. If the existing container(s) fail, then I'll have to spin up a new one. When that happens I need access on port 443. You can't toggle a firewall / security group's access level like that.
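
Although the rule cannot be switched off after startup, its scope can be kept narrow. A minimal sketch, assuming the same security group is attached to both the tasks and the endpoint network interfaces (as in Question 3): a self-referencing ingress rule admits port 443 only from members of that group, not from the rest of the VPC or from the internet. The logical ID endpointHttpsFromTasks is hypothetical.

```yaml
Resources:
  endpointHttpsFromTasks:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      # The group attached to the endpoint network interfaces.
      GroupId: !GetAtt ECSSecurityGroupDownloadRedisContainer.GroupId
      IpProtocol: tcp
      FromPort: 443
      ToPort: 443
      # Only traffic from members of the same group is admitted.
      SourceSecurityGroupId: !GetAtt ECSSecurityGroupDownloadRedisContainer.GroupId
      Description: HTTPS to the VPC endpoints from members of this group only
```

Port 443 stays reachable for future image pulls, but only between the tasks and the endpoints.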