Error getting access token from metadata server at: http://metadata/computeMetadata/v1/instance/service-accounts/default/token


I have tried with a p12 keyfile and it works successfully; I was able to fetch data from a GCS bucket. But with a JSON keyfile, the SparkSession is not picking up the JSON config values. Instead, it falls back to the default metadata server. I am using Maven and IntelliJ for development. Below is the code snippet:

import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession

def main(args: Array[String]): Unit = {
  System.out.println("hello gcp connect")
  System.setProperty("hadoop.home.dir", "C:/hadoop/")
  val sparkSession =
    SparkSession.builder()
      .appName("my first project")
      .master("local[*]")
      .config("spark.hadoop.fs.gs.project.id", "shaped-radius-297301")
      .config("spark.hadoop.fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
      .config("spark.hadoop.fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
      .config("spark.hadoop.google.cloud.project.id", "shaped-radius-297301")
      .config("spark.hadoop.google.cloud.auth.service.account.enable", "true")
      .config("spark.hadoop.google.cloud.auth.service.account.email", "[email protected]")
      .config("spark.hadoop.google.cloud.service.account.json.keyfile", "C:/Users/shaped-radius-297301-5bf673d7f0d2.json")
      .getOrCreate()

  sparkSession.sparkContext.addFile("gs://test_bucket/sample1.csv")
  sparkSession.read.csv(SparkFiles.get("sample1.csv")).show()
}

There are 2 answers below


You need to work on your configuration. From the image you provided, it looks like your service account email and service account key are not correct. Please make sure that you are using the correct service account email, with the Cloud Storage Admin role granted in IAM, for example:

[email protected]

The path to your service account key should point to the location where the JSON key file is actually stored, and that location must be accessible to your application.

Also, make sure that you are using a bucket that exists in your project, or else you'll get errors like "bucket does not exist" or "access denied".
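As a concrete reference, here is a minimal sketch of a local SparkSession reading from GCS with a JSON key file. The project id, key path and bucket name are placeholders, and note that the GCS connector documents the keyfile property as google.cloud.auth.service.account.json.keyfile (with the auth segment), which differs from the property name used in the question:

import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession

object GcsJsonKeyExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("gcs json keyfile sketch")
      .master("local[*]")
      // Register the GCS connector for gs:// paths
      .config("spark.hadoop.fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
      .config("spark.hadoop.fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
      .config("spark.hadoop.fs.gs.project.id", "<PROJECT_ID>")
      // Service-account auth: the JSON keyfile property lives under google.cloud.auth.*
      .config("spark.hadoop.google.cloud.auth.service.account.enable", "true")
      .config("spark.hadoop.google.cloud.auth.service.account.json.keyfile", "<PATH_TO_JSON_KEY>")
      .getOrCreate()

    // Read a CSV from an existing bucket that the service account can access
    spark.sparkContext.addFile("gs://<BUCKET_NAME>/sample1.csv")
    spark.read.csv(SparkFiles.get("sample1.csv")).show()
  }
}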

UPDATE

OP updated the question; refer to this link. It is possible that GOOGLE_APPLICATION_CREDENTIALS is pointing to the wrong location, or that the service account does not have the right IAM permissions.
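If the application relies on application default credentials, one quick way to rule this out is to check the variable from the driver before building the SparkSession. This check is only an illustration, not part of the original answer:

// Sanity check: when GOOGLE_APPLICATION_CREDENTIALS is unset or unreadable,
// the Google client libraries fall back to the metadata server, which produces
// exactly the error in the title when running outside GCP.
sys.env.get("GOOGLE_APPLICATION_CREDENTIALS") match {
  case Some(path) if new java.io.File(path).canRead =>
    println(s"Application default credentials will be read from $path")
  case Some(path) =>
    println(s"GOOGLE_APPLICATION_CREDENTIALS points to $path, but the file is not readable")
  case None =>
    println("GOOGLE_APPLICATION_CREDENTIALS is not set; falling back to the metadata server")
}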


There was a problem while setting the credential file / key file in Databricks, so I used

libraryDependencies += "com.github.samelamin" %% "spark-bigquery" % "0.2.6"

to set it up in a Scala notebook:

import com.samelamin.spark.bigquery._

// Set up GCP credentials
sqlContext.setGcpJsonKeyFile("<JSON_KEY_FILE>")

// Set up BigQuery project and bucket
sqlContext.setBigQueryProjectId("<BILLING_PROJECT>")
sqlContext.setBigQueryGcsBucket("<GCS_BUCKET>")

and we are able to connect to Google correctly from another notebook via Python.
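For completeness, a read after that configuration might look like the sketch below; it assumes the bigQuerySelect helper documented by the samelamin spark-bigquery library, and the public Shakespeare sample table is only a placeholder:

import com.samelamin.spark.bigquery._

// Run a query through the connector; the result comes back as a DataFrame.
val df = sqlContext.bigQuerySelect(
  "SELECT word, word_count FROM [bigquery-public-data:samples.shakespeare]")
df.show()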