Problem

Using the Terraform GCP google_service_account and google_project_iam_binding resources to attach roles/editor removed the Google APIs Service Agent and the Compute Engine default service account from the IAM principals. GKE clusters can no longer be created or deleted because of this removal, although the Compute Engine default service account still remains in IAM Service Accounts.

The problem is that the account disappears from the IAM principals (this is what I mean by "deleted"). The Compute Engine default service account is left impaired and can no longer manage Compute Engine resources, including GKE clusters and nodes.

Question

I believe this is a Terraform bug, but please help me understand whether I am missing something that could prevent the problem.

Please also advise if there is a way to restore the Compute Engine default service account back in IAM principals with the Editor role.

Environment

$ terraform version
Terraform v1.0.4
on linux_amd64
+ provider registry.terraform.io/hashicorp/google v4.6.0

.terraform.lock.hcl

# This file is maintained automatically by "terraform init".
# Manual edits may be lost in future updates.

provider "registry.terraform.io/hashicorp/google" {
  version = "4.6.0"
  hashes = [
    "h1:QbO4yjDrnoSpiYKSHrICNL1ZuWsl5J2rVRFj2kNg7xA=",
    "zh:005a28a2c79f6b29680b0f57260c69c85d8a992688007b6e5645149bd379951f",
    "zh:2604d825de72cf99b4899d7880837adeb19d371f48e419666e32c4c3cf6a72e9",
    "zh:290da4eb18e44469480cf299bebce89f54e4d301f856cdffe2837b498878c7ec",
    "zh:3e5ba1a55d38fa17533a18fc14a612e781ded76c6309734d3dc0a937be27eec1",
    "zh:4a85de3cdb33c092d8ccfced3d7302934de0dd4f72bbcebd79d45afe0a0b6f85",
    "zh:5fb1a79800833ae922aaba594a8b2bc83be1d254052e12e0ce8330ca0d8933d9",
    "zh:679b9f50c6fe0476e74d37935f7598d46d6e9612f75b26a8ef1ca3c13144d06a",
    "zh:893216e32378839668c51ef135af1676cd887d63e2edb6625cf9adad7bfa346f",
    "zh:ad8f2fd19adbe4c10281ba9b3c8d5100877a9c541d3580bbbe9357714aa77619",
    "zh:bff5d6fd15e98c12ee9ed98b0338761dc4a9ba671a37834926daeabf73c71783",
    "zh:debdf15fbed8d63e397cd004bf65586bd2b93ce04e47ca51a7c70c1fe9168b87",
  ]
}

Reproduction Steps

Tested twice in different GCP projects and the issue was reproduced in the same manner.

Start

Start with a GCP project that does not have Compute Engine enabled, hence no Compute Engine default service account.

Enable Compute Engine API.

Compute Engine default service account gets created and appears both in IAM Principals and IAM Service Accounts.

Terraform apply

Apply the terraform script to create a service account with IAM bindings.

variable "PROJECT_ID" {
  type        = string
  description = "GCP Project ID"
  default     = "test-tf-sa"
}

variable "REGION" {
  type        = string
  description = "GCP Region"
  default     = "us-central1"
}


variable "roles_to_grant_to_service_account" {
  description = "IAM roles to grant to the service account"
  type        = list(string)
  default = [
    "roles/editor",
    "roles/iam.serviceAccountAdmin",
    "roles/resourcemanager.projectIamAdmin"
  ]
}

provider "google" {
  project = var.PROJECT_ID
  region  = var.REGION
}
resource "google_service_account" "terraform" {
  account_id   = "terraform"
  display_name = "terraform service account"
}

resource "google_project_iam_binding" "terraform" {
  project = var.PROJECT_ID

  #--------------------------------------------------------------------------------
  # Grant the service account to have the roles
  #--------------------------------------------------------------------------------
  members = [
    "serviceAccount:${google_service_account.terraform.email}"
  ]
  for_each = toset(var.roles_to_grant_to_service_account)
  role     = each.value
}

$ terraform apply --auto-approve

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_project_iam_binding.terraform["roles/editor"] will be created
  + resource "google_project_iam_binding" "terraform" {
      + etag    = (known after apply)
      + id      = (known after apply)
      + members = (known after apply)
      + project = "test-tf-sa"
      + role    = "roles/editor"
    }

  # google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] will be created
  + resource "google_project_iam_binding" "terraform" {
      + etag    = (known after apply)
      + id      = (known after apply)
      + members = (known after apply)
      + project = "test-tf-sa"
      + role    = "roles/iam.serviceAccountAdmin"
    }

  # google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] will be created
  + resource "google_project_iam_binding" "terraform" {
      + etag    = (known after apply)
      + id      = (known after apply)
      + members = (known after apply)
      + project = "test-tf-sa"
      + role    = "roles/resourcemanager.projectIamAdmin"
    }

  # google_service_account.terraform will be created
  + resource "google_service_account" "terraform" {
      + account_id   = "terraform"
      + disabled     = false
      + display_name = "terraform service account"
      + email        = (known after apply)
      + id           = (known after apply)
      + name         = (known after apply)
      + project      = (known after apply)
      + unique_id    = (known after apply)
    }

Plan: 4 to add, 0 to change, 0 to destroy.
google_service_account.terraform: Creating...
google_service_account.terraform: Creation complete after 2s [id=projects/test-tf-sa/serviceAccounts/[email protected]]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Creating...
google_project_iam_binding.terraform["roles/editor"]: Creating...
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Creating...
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Creation complete after 9s [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/editor"]: Creation complete after 9s [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Still creating... [10s elapsed]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Creation complete after 10s [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

Terraform has deleted the Compute Engine default service account from the IAM principals

Immediately after terraform apply, check the IAM principals: the Compute Engine default service account has been removed from the IAM principals view.

As suggested by @JohnHanley, I clicked Include Google-provided role grants to unhide Google-managed service accounts. The original Compute Engine default service account [email protected] is gone from the IAM principals view.

The gcloud projects get-iam-policy command does not show the Compute Engine default service account [email protected].

$ GCP_PROJECT_ID=test-tf-sa
$ gcloud projects get-iam-policy $GCP_PROJECT_ID
bindings:
- members:
  - serviceAccount:[email protected]
  role: roles/compute.admin
- members:
  - serviceAccount:[email protected]
  role: roles/compute.instanceAdmin
- members:
  - serviceAccount:[email protected]
  role: roles/compute.serviceAgent
- members:
  - serviceAccount:service-1079157603081@container-engine-robot.iam.gserviceaccount.com
  role: roles/container.serviceAgent
- members:
  - serviceAccount:[email protected]
  role: roles/containerregistry.ServiceAgent
- members:
  - serviceAccount:[email protected]
  role: roles/editor
- members:
  - user:****@gmail.com
  role: roles/owner
- members:
  - serviceAccount:[email protected]
  role: roles/pubsub.serviceAgent
etag: BwXVf2S5fCQ=
version: 1

The service account, however, still remains in the IAM Service Accounts menu.

Create GKE

Enable the Kubernetes Engine API and create a GKE cluster. At this point, the removal of the Compute Engine default service account from the IAM principals did not hinder GKE creation, possibly because of eventual consistency.

terraform destroy

Run terraform destroy.

$ terraform destroy --auto-approve
google_service_account.terraform: Refreshing state... [id=projects/test-tf-sa/serviceAccounts/[email protected]]
google_project_iam_binding.terraform["roles/editor"]: Refreshing state... [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Refreshing state... [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Refreshing state... [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # google_project_iam_binding.terraform["roles/editor"] has been changed
  ~ resource "google_project_iam_binding" "terraform" {
      ~ etag    = "BwXVe+z+aCU=" -> "BwXVfBieTDw="
        id      = "test-tf-sa/roles/editor"
      ~ members = [
          + "serviceAccount:[email protected]",
            # (1 unchanged element hidden)
        ]
        # (2 unchanged attributes hidden)
    }
  # google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] has been changed
  ~ resource "google_project_iam_binding" "terraform" {
      ~ etag    = "BwXVe+z+aCU=" -> "BwXVfBieTDw="
        id      = "test-tf-sa/roles/iam.serviceAccountAdmin"
        # (3 unchanged attributes hidden)
    }
  # google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] has been changed
  ~ resource "google_project_iam_binding" "terraform" {
      ~ etag    = "BwXVe+z+aCU=" -> "BwXVfBieTDw="
        id      = "test-tf-sa/roles/resourcemanager.projectIamAdmin"
        # (3 unchanged attributes hidden)
    }

Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to
undo or respond to these changes.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  # google_project_iam_binding.terraform["roles/editor"] will be destroyed
  - resource "google_project_iam_binding" "terraform" {
      - etag    = "BwXVfBieTDw=" -> null
      - id      = "test-tf-sa/roles/editor" -> null
      - members = [
          - "serviceAccount:[email protected]",
          - "serviceAccount:[email protected]",
        ] -> null
      - project = "test-tf-sa" -> null
      - role    = "roles/editor" -> null
    }

  # google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] will be destroyed
  - resource "google_project_iam_binding" "terraform" {
      - etag    = "BwXVfBieTDw=" -> null
      - id      = "test-tf-sa/roles/iam.serviceAccountAdmin" -> null
      - members = [
          - "serviceAccount:[email protected]",
        ] -> null
      - project = "test-tf-sa" -> null
      - role    = "roles/iam.serviceAccountAdmin" -> null
    }

  # google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] will be destroyed
  - resource "google_project_iam_binding" "terraform" {
      - etag    = "BwXVfBieTDw=" -> null
      - id      = "test-tf-sa/roles/resourcemanager.projectIamAdmin" -> null
      - members = [
          - "serviceAccount:[email protected]",
        ] -> null
      - project = "test-tf-sa" -> null
      - role    = "roles/resourcemanager.projectIamAdmin" -> null
    }

  # google_service_account.terraform will be destroyed
  - resource "google_service_account" "terraform" {
      - account_id   = "terraform" -> null
      - disabled     = false -> null
      - display_name = "terraform service account" -> null
      - email        = "[email protected]" -> null
      - id           = "projects/test-tf-sa/serviceAccounts/[email protected]" -> null
      - name         = "projects/test-tf-sa/serviceAccounts/[email protected]" -> null
      - project      = "test-tf-sa" -> null
      - unique_id    = "107173424725895843752" -> null
    }

Plan: 0 to add, 0 to change, 4 to destroy.
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Destroying... [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]
google_project_iam_binding.terraform["roles/editor"]: Destroying... [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Destroying... [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Destruction complete after 10s
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Destruction complete after 10s
google_project_iam_binding.terraform["roles/editor"]: Still destroying... [id=test-tf-sa/roles/editor, 10s elapsed]
google_project_iam_binding.terraform["roles/editor"]: Destruction complete after 11s
google_service_account.terraform: Destroying... [id=projects/test-tf-sa/serviceAccounts/[email protected]]
google_service_account.terraform: Destruction complete after 1s

Destroy complete! Resources: 4 destroyed.

Problems

Cannot delete GKE

The impact of removing the Compute Engine default service account from the IAM principals now starts to show.

The GKE cluster cannot be deleted; the attempt fails with this error:

Google Compute Engine: Required 'compute.instanceGroups.update' permission for 'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp'.

$ gcloud container clusters delete cluster-1 --zone=us-central1-c
The following clusters will be deleted.
 - [cluster-1] in [us-central1-c]

Do you want to continue (Y/n)?  Y

Deleting cluster cluster-1...done.                                                                                                                                  
ERROR: (gcloud.container.clusters.delete) Some requests did not succeed:
 - args: ['Operation [<Operation\n clusterConditions: [<StatusCondition\n canonicalCode: CanonicalCodeValueValuesEnum(PERMISSION_DENIED, 7)\n message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">]\n detail: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n endTime: \'2022-01-14T00:20:54.190004708Z\'\n error: <Status\n code: 7\n details: []\n message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">\n name: \'operation-1642119632548-20038ec5\'\n nodepoolConditions: []\n operationType: OperationTypeValueValuesEnum(DELETE_CLUSTER, 2)\n selfLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/operations/operation-1642119632548-20038ec5\'\n startTime: \'2022-01-14T00:20:32.548792723Z\'\n status: StatusValueValuesEnum(DONE, 3)\n statusMessage: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n targetLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/clusters/cluster-1\'\n zone: \'us-central1-c\'>] finished with error: Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.']
   exit_code: 1

Cannot create GKE

Try to create another GKE cluster.

A GKE cluster can no longer be created. This is the original issue I encountered (GCP GKE - Google Compute Engine: Not all instances running in IGM), which led to this troubleshooting.

cluster-2
Google Compute Engine: Not all instances running in IGM after 18.798524988s. Expected 3, running 0, transitioning 3. Current errors: [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.instances.create' permission for 'projects/1079157603081/zones/us-central1-c/instances/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '[email protected]'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.disks.create' permission for 'projects/1079157603081/zones/us-central1-c/disks/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '[email protected]'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.disks.setLabels' permission for 'projects/1079157603081/zones/us-central1-c/disks/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '[email protected]'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.subnetworks.use' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '[email protected]'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.subnetworks.useExternalIp' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '[email protected]') (truncated).


Attempts to fix

I tried the following measures, without success.

Reassign roles/Editor to the service account

GCP_PROJECT_ID=test-tf-sa
GCP_SVC_ACC="serviceAccount:[email protected]"

gcloud projects add-iam-policy-binding ${GCP_PROJECT_ID} \
    --member=serviceAccount:${GCP_SVC_ACC} \
    --role=roles/Editor
-----
ERROR: Policy modification failed. For a binding with condition, run "gcloud alpha iam policies lint-condition" to identify issues in condition.
ERROR: (gcloud.projects.add-iam-policy-binding) INVALID_ARGUMENT: Role roles/Editor is not supported for this resource.

Apply undelete service account

$ gcloud beta iam service-accounts undelete 109558708367309276392
restoredAccount:
  email: [email protected]
  etag: MDEwMjE5MjA=
  name: projects/test-tf-sa/serviceAccounts/[email protected]
  oauth2ClientId: '109558708367309276392'
  projectId: test-tf-sa
  uniqueId: '109558708367309276392'

Neither attempt brought the Compute Engine default service account back to the IAM principals.

Disable Compute Engine API

I tried to disable the Compute Engine API, but it cannot be disabled because the GKE nodes cannot be deleted.

Manually add back the service account

I manually added the Compute Engine account [email protected] and granted it the Editor role (roles/editor). It now appears in the gcloud projects get-iam-policy output, but the GKE cluster still cannot be deleted.

$ gcloud projects get-iam-policy $GCP_PROJECT_ID
bindings:
...
- members:
  - serviceAccount:[email protected]           <-----
  - serviceAccount:[email protected]
  role: roles/editor
...
etag: BwXVf9cVnaU=
version: 1

$ gcloud container clusters delete cluster-1 --zone=us-central1-c
The following clusters will be deleted.
 - [cluster-1] in [us-central1-c]

Do you want to continue (Y/n)?  Y

Deleting cluster cluster-1...done.                                                                                                                                  
ERROR: (gcloud.container.clusters.delete) Some requests did not succeed:
 - args: ['Operation [<Operation\n clusterConditions: [<StatusCondition\n canonicalCode: CanonicalCodeValueValuesEnum(PERMISSION_DENIED, 7)\n 
 message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">]\n 
 detail: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n 
 endTime: \'2022-01-14T00:33:38.746564953Z\'\n error: <Status\n code: 7\n details: []\n 
 message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">\n 
 name: \'operation-1642120382096-034b0eb7\'\n nodepoolConditions: []
 \n operationType: OperationTypeValueValuesEnum(DELETE_CLUSTER, 2)\n 
 selfLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/operations/operation-1642120382096-034b0eb7\'\n 
 startTime: \'2022-01-14T00:33:02.096736326Z\'\n status: StatusValueValuesEnum(DONE, 3)\n 
 statusMessage: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n 
 targetLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/clusters/cluster-1\'\n 
 zone: \'us-central1-c\'>] finished with error: Google Compute Engine: Required \'compute.instanceGroups.update\' permission for 
 \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.']
   exit_code: 1

Another service account for GKE

I created another service account with the compute.admin role and used it to create and delete GKE clusters. However, once the Compute Engine default service account has been impaired, the GCP GKE - Google Compute Engine: Not all instances running in IGM issue keeps occurring.


Goal to achieve

Bring the Compute Engine default service account back into the IAM principals, as it was right after the Compute Engine API was enabled, and be able to manage Compute Engine instances and GKE nodes again.

There are 2 best solutions below

You can restore the service accounts using the “gcloud beta iam service-accounts undelete” command.

If you accidentally delete a service account, you can try to undelete the service account instead of creating a new service account.

Please review this link if you need more info. Note that to restore a deleted account you may need its 21-digit unique ID. If you do not have this ID for the account, you could try this command:

gcloud logging read --freshness=30d --format='table(timestamp,resource.labels.email_id,resource.labels.project_id,resource.labels.unique_id)' 'protoPayload.methodName="google.iam.admin.v1.DeleteServiceAccount" resource.type="service_account" logName:"cloudaudit.googleapis.com%2Factivity"'

or this command:

gcloud logging read --freshness=30d protoPayload.methodName="google.iam.admin.v1.DeleteServiceAccount" | grep -E 'email_id|unique_id'

Related issues

I wish I had read these before running into this issue; another one bites the dust.

Usability improvements for *_iam_policy and *_iam_binding resources #8354

Description

I'm sure you know by now there is a decent amount of care required when using the *_iam_policy and *_iam_binding versions of IAM resources. There are a number of "be careful!" and "note" warnings in the resources that outline some of the potential pitfalls, but there are hidden dangers as well. For example, using the google_project_iam_policy resource may inadvertently remove Google's service agents' (https://cloud.google.com/iam/docs/service-agents) IAM roles from the project. Or, the dangers of using google_storage_bucket_iam_policy and google_storage_bucket_iam_binding, which may remove the default IAM roles granted to projectViewers:, projectEditors:, and projectOwners: of the containing project.

The largest issue I encounter with people running into the above situations is that the initial terraform plan does not show that anything is being removed. While the documentation for google_project_iam_policy notes that it's best to terraform import the resource beforehand, this is in fact applicable to all *_iam_policy and *_iam_binding resources. Unfortunately this is tedious, potentially forgotten, and not something that you can abstract away in a Terraform module.

Cause

As @toteem pointed out

The google_project_iam_binding resource is authoritative, which means it will delete any binding that is NOT explicitly specified in the Terraform configuration.

Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the project are preserved.

I am not sure anyone could get a clear idea of what Terraform does with google_project_iam_binding from this wording, but as GCP has identified, google_project_iam_binding removed the roles/editor role from every account that was not listed in the members attribute.
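
To make the authoritative behaviour concrete, here is a minimal sketch of what google_project_iam_binding would need in order to leave the Google-managed accounts in place: every principal that should keep roles/editor has to be listed in members. It reuses var.PROJECT_ID and google_service_account.terraform from the question. The data source name "current", the local names and the resource name "editor" are hypothetical, and the two email formats assume the usual project-number-based addresses of the Compute Engine default service account and the Google APIs Service Agent.

```hcl
# Look up the project so that its number can be used to build the
# Google-managed service account emails ("current" is an illustrative name).
data "google_project" "current" {
  project_id = var.PROJECT_ID
}

locals {
  # Assumed format: PROJECT_NUMBER-compute@developer.gserviceaccount.com
  compute_default_sa = "serviceAccount:${data.google_project.current.number}-compute@developer.gserviceaccount.com"

  # Assumed format: PROJECT_NUMBER@cloudservices.gserviceaccount.com
  google_apis_agent = "serviceAccount:${data.google_project.current.number}@cloudservices.gserviceaccount.com"
}

# Authoritative for roles/editor: any principal that is NOT listed in
# members loses the role when this resource is applied.
resource "google_project_iam_binding" "editor" {
  project = var.PROJECT_ID
  role    = "roles/editor"

  members = [
    "serviceAccount:${google_service_account.terraform.email}",
    local.compute_default_sa,
    local.google_apis_agent,
  ]
}
```

Keeping such a list complete by hand is fragile, which is one reason the non-authoritative google_project_iam_member resource is usually the safer choice here.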

Still, I believe this is a terraform defect.

As per the Google APIs Service Agent documentation, these are essential service accounts that GCP manages internally. Terraform should not delete any such GCP-managed internal service accounts, as doing so brings GCP projects down. I doubt there are use cases where we need this to happen.

Some Google Cloud services need access to your resources so that they can act on your behalf. For example, when you use Cloud Run to run a container, the service needs access to any Pub/Sub topics that can trigger the container.

To meet this need, Google creates and manages service accounts for many Google Cloud services. These service accounts are known as Google-managed service accounts. You might see Google-managed service accounts in your project's IAM policy, in audit logs, or on the IAM page in the Cloud Console.

Google-managed service accounts are not listed in the Service accounts page in the Cloud Console.

Google APIs Service Agent. Your project is likely to contain a service account named the Google APIs Service Agent, with an email address that uses the following format: [email protected]

This service account runs internal Google processes on your behalf. It is automatically granted the Editor role (roles/editor) on the project.

Solution

Use google_project_iam_member.

#--------------------------------------------------------------------------------
# Service Account Roles
# Need roles/resourcemanager.projectIamAdmin to be able to execute this.
#--------------------------------------------------------------------------------
# resource "google_project_iam_binding" "terraform" {
#   project = var.PROJECT_ID
#
#   #--------------------------------------------------------------------------------
#   # Grant the service account to have the roles
#   #--------------------------------------------------------------------------------
#   members = [
#     "serviceAccount:${google_service_account.terraform.email}"
#   ]
#   for_each = toset(var.roles_to_grant_to_service_account)
#   role     = each.value
# }

#--------------------------------------------------------------------------------
# Service Account Roles
# Need roles/resourcemanager.projectIamAdmin to be able to execute this.
#--------------------------------------------------------------------------------
resource "google_project_iam_member" "terraform" {
  project = var.PROJECT_ID

  #--------------------------------------------------------------------------------
  # Grant the service account to have the roles
  #--------------------------------------------------------------------------------
  member   = "serviceAccount:${google_service_account.terraform.email}"
  for_each = toset(var.roles_to_grant_to_service_account)
  role     = each.value
}

Fix

This applies when the GCP internal service accounts have already been removed by google_project_iam_binding.

According to GCP:

To fix this issue you can add the service agent in the IAM page using the Add option at the top. The principal will be "${PROJECT_ID}@cloudservices.gserviceaccount.com" and add the editor role.

As per the error message, add '[email protected]' in IAM.

'compute.subnetworks.useExternalIp' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '[email protected]') (truncated).

The Google APIs Service Agent is restored in the view.
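
The same restoration could probably be expressed in Terraform instead of the IAM page. The following is only a sketch under assumptions: it uses the non-authoritative google_project_iam_member resource, reuses var.PROJECT_ID from the question, and assumes the project-number-based email formats for the two Google-managed accounts. The names "project" and "restore_editor" are hypothetical.

```hcl
# Read the project number, which the Google-managed account emails are
# built from ("project" is an illustrative name).
data "google_project" "project" {
  project_id = var.PROJECT_ID
}

# Non-authoritative: each resource adds one member to roles/editor and
# leaves every other member of that role untouched.
resource "google_project_iam_member" "restore_editor" {
  for_each = toset([
    # Google APIs Service Agent (assumed email format)
    "serviceAccount:${data.google_project.project.number}@cloudservices.gserviceaccount.com",
    # Compute Engine default service account (assumed email format)
    "serviceAccount:${data.google_project.project.number}-compute@developer.gserviceaccount.com",
  ])

  project = var.PROJECT_ID
  role    = "roles/editor"
  member  = each.value
}
```

Note that a later terraform destroy would remove these grants again, so it may be preferable to restore them outside the configuration that manages the terraform service account.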

Create GKE.

Conclusion

I would never use them, as I doubt any use case exists in which we need to destroy other accounts that hold the same roles.

  • google_project_iam_member
  • google_service_account_iam_binding