I have been working on this for nearly 5 straight days now and can't get it to work. According to the AWS documentation, I *should* be able to mount an EFS volume to a pod deployed to a Fargate node in Kubernetes (EKS).
I'm doing everything 100% through Terraform. I'm lost at this point, and my eyes are practically bleeding from the amount of terrible documentation I've read. Any guidance anyone can give me on getting this to work would be amazing!
Here is what I have done so far:
- Set up an EKS CSI driver, storage class, and role bindings (not really sure why I need the role bindings, to be honest)
```hcl
resource "kubernetes_csi_driver" "efs" {
  metadata {
    name = "efs.csi.aws.com"
  }

  spec {
    attach_required = false
    volume_lifecycle_modes = [
      "Persistent"
    ]
  }
}

resource "kubernetes_storage_class" "efs" {
  metadata {
    name = "efs-sc"
  }
  storage_provisioner = kubernetes_csi_driver.efs.metadata[0].name
  reclaim_policy      = "Retain"
}

resource "kubernetes_cluster_role_binding" "efs_pre" {
  metadata {
    name = "efs_role_pre"
  }
  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = "cluster-admin"
  }
  subject {
    kind      = "ServiceAccount"
    name      = "default"
    namespace = "pre"
  }
}

resource "kubernetes_cluster_role_binding" "efs_live" {
  metadata {
    name = "efs_role_live"
  }
  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = "cluster-admin"
  }
  subject {
    kind      = "ServiceAccount"
    name      = "default"
    namespace = "live"
  }
}
```
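As an aside: mounting a PVC does not actually require any RBAC on the pod's service account (the kubelet handles volume setup), so binding `cluster-admin` to `default` is almost certainly broader than needed. If some component in the namespace really does need to read volume objects, a narrowly scoped role would be safer; a sketch with hypothetical names and permissions, not something the original setup requires:

```hcl
# Hypothetical narrower alternative to binding cluster-admin:
# read-only access to PV/PVC objects, nothing else.
resource "kubernetes_cluster_role" "efs_reader" {
  metadata {
    name = "efs-reader"
  }
  rule {
    api_groups = [""]
    resources  = ["persistentvolumes", "persistentvolumeclaims"]
    verbs      = ["get", "list", "watch"]
  }
}
```

You would then point the `role_ref` of the bindings above at this role instead of `cluster-admin`.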
- Set up the EFS volume with policies and security groups
```hcl
module "vpc" {
  source    = "../../read_only_data/vpc"
  stackname = var.vpc_stackname
}

resource "aws_efs_file_system" "efs_data" {
  creation_token = "xva-${var.environment}-pv-efsdata-${var.side}"
  # encrypted  = true
  # kms_key_id = ""
  performance_mode = "generalPurpose" # or "maxIO"
  throughput_mode  = "bursting"

  lifecycle_policy {
    transition_to_ia = "AFTER_30_DAYS"
  }
}

data "aws_efs_file_system" "efs_data" {
  file_system_id = aws_efs_file_system.efs_data.id
}

resource "aws_efs_access_point" "efs_data" {
  file_system_id = aws_efs_file_system.efs_data.id
}
```
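The bare access point above is valid, but access points are usually given a POSIX identity and a root directory so clients are pinned to a subtree with known ownership. A sketch with purely illustrative uid/gid/path values:

```hcl
# Illustrative only: pin clients of this access point to a subdirectory
# and a fixed POSIX user, instead of the filesystem root as root.
resource "aws_efs_access_point" "efs_data_scoped" {
  file_system_id = aws_efs_file_system.efs_data.id

  posix_user {
    gid = 1000
    uid = 1000
  }

  root_directory {
    path = "/appconf"
    creation_info {
      owner_gid   = 1000
      owner_uid   = 1000
      permissions = "755"
    }
  }
}
```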
```hcl
/* Policy that does the following:
   - Prevents root access by default
   - Enforces read-only access by default
   - Enforces in-transit encryption for all clients
*/
resource "aws_efs_file_system_policy" "efs_data" {
  file_system_id = aws_efs_file_system.efs_data.id

  policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : [
      {
        "Effect" : "Allow",
        "Principal" : {
          "AWS" : "*"
        },
        "Action" : "elasticfilesystem:ClientMount",
        "Resource" : aws_efs_file_system.efs_data.arn
      },
      {
        "Effect" : "Deny",
        "Principal" : {
          "AWS" : "*"
        },
        "Action" : "*",
        "Resource" : aws_efs_file_system.efs_data.arn,
        "Condition" : {
          "Bool" : {
            "aws:SecureTransport" : "false"
          }
        }
      }
    ]
  })
}
```
```hcl
# Security group for this volume
resource "aws_security_group" "allow_eks_cluster" {
  name        = "xva-${var.environment}-efsdata-${var.side}"
  description = "Allows the cluster ${data.terraform_remote_state.cluster.outputs.eks_cluster_name} to access and use this volume."
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "NFS for EKS cluster ${data.terraform_remote_state.cluster.outputs.eks_cluster_name}"
    from_port   = 2049
    to_port     = 2049
    protocol    = "tcp"
    security_groups = [
      data.terraform_remote_state.cluster.outputs.eks_cluster_sg_id
    ]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "xva-${var.environment}-efsdata-${var.side}"
  }
}

# Create mount targets in the subnets that will use this EFS volume,
# and attach the security group to restrict access to it
resource "aws_efs_mount_target" "efs_data-app01" {
  file_system_id = aws_efs_file_system.efs_data.id
  subnet_id      = module.vpc.private_app_subnet_01
  security_groups = [
    aws_security_group.allow_eks_cluster.id
  ]
}

resource "aws_efs_mount_target" "efs_data-app02" {
  file_system_id = aws_efs_file_system.efs_data.id
  subnet_id      = module.vpc.private_app_subnet_02
  security_groups = [
    aws_security_group.allow_eks_cluster.id
  ]
}
```
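The two near-identical mount targets could also be collapsed with `for_each`, so adding a third subnet is a one-line change. A sketch under the same assumptions as above (same module outputs, same security group):

```hcl
# Equivalent to the two mount-target resources above, driven by a map
# so each subnet gets one mount target with the same security group.
resource "aws_efs_mount_target" "efs_data" {
  for_each = {
    app01 = module.vpc.private_app_subnet_01
    app02 = module.vpc.private_app_subnet_02
  }

  file_system_id  = aws_efs_file_system.efs_data.id
  subnet_id       = each.value
  security_groups = [aws_security_group.allow_eks_cluster.id]
}
```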
- Create a persistent volume in Kubernetes referencing the EFS volume
```hcl
data "terraform_remote_state" "csi" {
  backend = "s3"
  config = {
    bucket  = "xva-${var.account_type}-terraform-${var.region_code}"
    key     = "${var.environment}/efs/driver/terraform.tfstate"
    region  = var.region
    profile = var.profile
  }
}

resource "kubernetes_persistent_volume" "efs_data" {
  metadata {
    name = "pv-efsdata"
    labels = {
      app = "example"
    }
  }

  spec {
    access_modes = ["ReadOnlyMany"]
    capacity = {
      storage = "25Gi"
    }
    volume_mode                      = "Filesystem"
    persistent_volume_reclaim_policy = "Retain"
    storage_class_name               = data.terraform_remote_state.csi.outputs.storage_name

    persistent_volume_source {
      csi {
        driver        = data.terraform_remote_state.csi.outputs.csi_name
        volume_handle = aws_efs_file_system.efs_data.id
        read_only     = true
      }
    }
  }
}
```
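A note in passing: the access point created earlier is never actually referenced, because `volume_handle` points at the raw filesystem ID. If you want pods to mount through the access point, the EFS CSI driver accepts a volume handle of the form `<fs-id>::<access-point-id>`; a sketch of just the changed `csi` block:

```hcl
csi {
  driver = data.terraform_remote_state.csi.outputs.csi_name
  # "<fs-id>::<access-point-id>" tells the EFS CSI driver to mount
  # through the access point instead of the filesystem root.
  volume_handle = "${aws_efs_file_system.efs_data.id}::${aws_efs_access_point.efs_data.id}"
  read_only     = true
}
```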
- Then create a deployment to Fargate with the pod mounting the EFS volume
```hcl
data "terraform_remote_state" "efs_data_volume" {
  backend = "s3"
  config = {
    bucket  = "xva-${var.account_type}-terraform-${var.region_code}"
    key     = "${var.environment}/efs/volume/terraform.tfstate"
    region  = var.region
    profile = var.profile
  }
}

resource "kubernetes_persistent_volume_claim" "efs_data" {
  metadata {
    name      = "pv-efsdata-claim-${var.side}"
    namespace = var.side
  }

  spec {
    access_modes       = ["ReadOnlyMany"]
    storage_class_name = data.terraform_remote_state.csi.outputs.storage_name

    resources {
      requests = {
        storage = "25Gi"
      }
    }

    volume_name = data.terraform_remote_state.efs_data_volume.outputs.volume_name
  }
}

resource "kubernetes_deployment" "example" {
  timeouts {
    create = "3m"
    update = "4m"
    delete = "2m"
  }

  metadata {
    name      = "deployment-example"
    namespace = var.side
    labels = {
      app      = "example"
      platform = "fargate"
      subnet   = "app"
    }
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        app = "example"
      }
    }

    template {
      metadata {
        labels = {
          app      = "example"
          platform = "fargate"
          subnet   = "app"
        }
      }

      spec {
        volume {
          name = "efs-data-volume"
          persistent_volume_claim {
            claim_name = kubernetes_persistent_volume_claim.efs_data.metadata[0].name
            read_only  = true
          }
        }

        container {
          image = "${var.nexus_docker_endpoint}/example:${var.docker_tag}"
          name  = "example"

          env {
            name  = "environment"
            value = var.environment
          }
          env {
            name  = "dockertag"
            value = var.docker_tag
          }

          volume_mount {
            name       = "efs-data-volume"
            read_only  = true
            mount_path = "/appconf/"
          }

          # liveness_probe {
          #   http_get {
          #     path = "/health"
          #     port = 443
          #   }
          #   initial_delay_seconds = 3
          #   period_seconds        = 3
          # }

          port {
            container_port = 443
          }
        }
      }
    }
  }
}
```
Kubernetes can see the persistent volume, I can see that it is claimed, and I can even see that it attempts to mount the volume in the pod logs. However, I inevitably always see the following error when describing the pod:
```
Volumes:
  efs-data-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pv-efsdata-claim-pre
    ReadOnly:   true
...
Events:
  Type     Reason       Age                  From                                  Message
  ----     ------       ----                 ----                                  -------
  Warning  FailedMount  11m (x629 over 23h)  kubelet, <redacted-fargate-endpoint>  Unable to attach or mount volumes: unmounted volumes=[efs-data-volume], unattached volumes=[efs-data-volume]: timed out waiting for the condition
  Warning  FailedMount  47s (x714 over 23h)  kubelet, <redacted-fargate-endpoint>  MountVolume.SetUp failed for volume "pv-efsdata" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = InvalidArgument desc = Volume capability not supported
```
I finally did it. I have successfully mounted an EFS volume to a Fargate pod (nearly 6 days later)! I got the direction I needed from this closed GitHub issue: https://github.com/aws/containers-roadmap/issues/826
It came down to the fact that I am using this module to build my EKS cluster: https://registry.terraform.io/modules/cloudposse/eks-cluster/aws/0.29.0?tab=outputs
Its "security_group_id" output returns the "Additional security group", which in my experience is good for absolutely nothing in AWS. Not sure why it even exists when you can't do anything with it. The security group I actually needed was the "Cluster security group". Once I added the cluster security group's ID to a port 2049 ingress rule on the security group attached to the EFS volume's mount targets, BAM! The EFS volume mounted in the deployed pod successfully.
The other important change was switching the persistent volume's access mode to ReadWriteMany, since Fargate apparently doesn't support ReadOnlyMany (hence the "Volume capability not supported" error above).
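For anyone following along, the two fixes translate into roughly the Terraform below. Note the cluster security group output name here is an assumption; check the outputs of whatever version of the cloudposse module (or your own cluster stack) you are using:

```hcl
# Fix 1: allow NFS from the EKS *cluster* security group, not the
# "additional" one. The remote-state output name below is assumed;
# verify it against your cluster module's outputs.
ingress {
  description = "NFS for the EKS cluster security group"
  from_port   = 2049
  to_port     = 2049
  protocol    = "tcp"
  security_groups = [
    data.terraform_remote_state.cluster.outputs.eks_cluster_managed_security_group_id
  ]
}

# Fix 2: Fargate rejects ReadOnlyMany ("Volume capability not supported"),
# so both the PV and the PVC specs need:
#   access_modes = ["ReadWriteMany"]
```

Read-only enforcement still works at the mount level via `read_only = true` on the volume and volume mount, even with the access mode set to ReadWriteMany.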