Octavia apply Airbyte gives

I'm trying to create a new BigQuery destination on Airbyte with the Octavia CLI.

When launching:

octavia apply

I receive:

Error: {"message":"The provided configuration does not fulfill the specification. Errors: json schema validation failed when comparing the data to the json schema. \nErrors:
$.loading_method.method: must be a constant value Standard

Here is my conf:

# Configuration for airbyte/destination-bigquery
# Documentation about this connector can be found at https://docs.airbyte.com/integrations/destinations/bigquery
resource_name: "BigQueryFromOctavia"
definition_type: destination
definition_id: 22f6c74f-5699-40ff-833c-4a879ea40133
definition_image: airbyte/destination-bigquery
definition_version: 1.2.12

# EDIT THE CONFIGURATION BELOW!
configuration:
  dataset_id: "airbyte_octavia_thibaut" # REQUIRED | string | The default BigQuery Dataset ID that tables are replicated to if the source does not specify a namespace. Read more <a href="https://cloud.google.com/bigquery/docs/datasets#create-dataset">here</a>.
  project_id: "data-airbyte-poc" # REQUIRED | string | The GCP project ID for the project containing the target BigQuery dataset. Read more <a href="https://cloud.google.com/resource-manager/docs/creating-managing-projects#identifying_projects">here</a>.
  loading_method:
    ## -------- Pick one valid structure among the examples below: --------
    # method: "Standard" # REQUIRED | string
    ## -------- Another valid structure for loading_method: --------
    method: "GCS Staging" # REQUIRED | string}
    credential:
      ## -------- Pick one valid structure among the examples below: --------
      credential_type: "HMAC_KEY" # REQUIRED | string
      hmac_key_secret: ${AIRBYTE_BQ1_HMAC_KEY_SECRET} # SECRET (please store in environment variables) | REQUIRED | string | The corresponding secret for the access ID. It is a 40-character base-64 encoded string. | Example: 1234567890abcdefghij1234567890ABCDEFGHIJ
      hmac_key_access_id: ${AIRBYTE_BQ1_HMAC_KEY_ACCESS_ID} # SECRET (please store in environment variables) | REQUIRED | string | HMAC key access ID. When linked to a service account, this ID is 61 characters long; when linked to a user account, it is 24 characters long. | Example: 1234567890abcdefghij1234
      gcs_bucket_name: "airbyte-octavia-thibaut-gcs" # REQUIRED | string | The name of the GCS bucket. Read more <a href="https://cloud.google.com/storage/docs/naming-buckets">here</a>. | Example: airbyte_sync
      gcs_bucket_path: "gcs" # REQUIRED | string | Directory under the GCS bucket where data will be written. | Example: data_sync/test
    # keep_files_in_gcs-bucket: "Delete all tmp files from GCS" # OPTIONAL | string | This upload method is supposed to temporary store records in GCS bucket. By this select you can chose if these records should be removed from GCS when migration has finished. The default "Delete all tmp files from GCS" value is used if not set explicitly.
  credentials_json: ${AIRBYTE_BQ1_CREDENTIALS_JSON} # SECRET (please store in environment variables) | OPTIONAL | string | The contents of the JSON service account key. Check out the <a href="https://docs.airbyte.com/integrations/destinations/bigquery#service-account-key">docs</a> if you need help generating this key. Default credentials will be used if this field is left empty.
  dataset_location: "europe-west1" # REQUIRED | string | The location of the dataset. Warning: Changes made after creation will not be applied. Read more <a href="https://cloud.google.com/bigquery/docs/locations">here</a>.
  transformation_priority: "interactive" # OPTIONAL | string | Interactive run type means that the query is executed as soon as possible, and these queries count towards concurrent rate limit and daily limit. Read more about interactive run type <a href="https://cloud.google.com/bigquery/docs/running-queries#queries">here</a>. Batch queries are queued and started as soon as idle resources are available in the BigQuery shared resource pool, which usually occurs within a few minutes. Batch queries don’t count towards your concurrent rate limit. Read more about batch queries <a href="https://cloud.google.com/bigquery/docs/running-queries#batch">here</a>. The default "interactive" value is used if not set explicitly.
  big_query_client_buffer_size_mb: 15 # OPTIONAL | integer | Google BigQuery client's chunk (buffer) size (MIN=1, MAX = 15) for each table. The size that will be written by a single RPC. Written data will be buffered and only flushed upon reaching this size or closing the channel. The default 15MB value is used if not set explicitly. Read more <a href="https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html">here</a>. | Example: 15

There is 1 solution below

It was an indentation issue on my side:

gcs_bucket_name: "airbyte-octavia-thibaut-gcs" # REQUIRED | string | The name of the GCS bucket. Read more <a href="https://cloud.google.com/storage/docs/naming-buckets">here</a>. | Example: airbyte_sync
gcs_bucket_path: "gcs" # REQUIRED | string | Directory under the GCS bucket where data will be written. | Example: data_sync/test

These two keys should be one level up, directly under loading_method rather than nested under credential. This wasn't clear in the commented template, hence the error, and other people may well make the same mistake.

Here is the full final configuration:

# Configuration for airbyte/destination-bigquery
# Documentation about this connector can be found at https://docs.airbyte.com/integrations/destinations/bigquery
resource_name: "BigQueryFromOctavia"
definition_type: destination
definition_id: 22f6c74f-5699-40ff-833c-4a879ea40133
definition_image: airbyte/destination-bigquery
definition_version: 1.2.12

# EDIT THE CONFIGURATION BELOW!
configuration:
  dataset_id: "airbyte_octavia_thibaut" # REQUIRED | string | The default BigQuery Dataset ID that tables are replicated to if the source does not specify a namespace. Read more <a href="https://cloud.google.com/bigquery/docs/datasets#create-dataset">here</a>.
  project_id: "data-airbyte-poc" # REQUIRED | string | The GCP project ID for the project containing the target BigQuery dataset. Read more <a href="https://cloud.google.com/resource-manager/docs/creating-managing-projects#identifying_projects">here</a>.
  loading_method:
    ## -------- Pick one valid structure among the examples below: --------
    # method: "Standard" # REQUIRED | string
    ## -------- Another valid structure for loading_method: --------
    method: "GCS Staging" # REQUIRED | string}
    credential:
      ## -------- Pick one valid structure among the examples below: --------
      credential_type: "HMAC_KEY" # REQUIRED | string
      hmac_key_secret: ${AIRBYTE_BQ1_HMAC_KEY_SECRET} # SECRET (please store in environment variables) | REQUIRED | string | The corresponding secret for the access ID. It is a 40-character base-64 encoded string. | Example: 1234567890abcdefghij1234567890ABCDEFGHIJ
      hmac_key_access_id: ${AIRBYTE_BQ1_HMAC_KEY_ACCESS_ID} # SECRET (please store in environment variables) | REQUIRED | string | HMAC key access ID. When linked to a service account, this ID is 61 characters long; when linked to a user account, it is 24 characters long. | Example: 1234567890abcdefghij1234
    gcs_bucket_name: "airbyte-octavia-thibaut-gcs" # REQUIRED | string | The name of the GCS bucket. Read more <a href="https://cloud.google.com/storage/docs/naming-buckets">here</a>. | Example: airbyte_sync
    gcs_bucket_path: "gcs" # REQUIRED | string | Directory under the GCS bucket where data will be written. | Example: data_sync/test
    # keep_files_in_gcs-bucket: "Delete all tmp files from GCS" # OPTIONAL | string | This upload method is supposed to temporary store records in GCS bucket. By this select you can chose if these records should be removed from GCS when migration has finished. The default "Delete all tmp files from GCS" value is used if not set explicitly.
  credentials_json: ${AIRBYTE_BQ1_CREDENTIALS_JSON} # SECRET (please store in environment variables) | OPTIONAL | string | The contents of the JSON service account key. Check out the <a href="https://docs.airbyte.com/integrations/destinations/bigquery#service-account-key">docs</a> if you need help generating this key. Default credentials will be used if this field is left empty.
  dataset_location: "europe-west1" # REQUIRED | string | The location of the dataset. Warning: Changes made after creation will not be applied. Read more <a href="https://cloud.google.com/bigquery/docs/locations">here</a>.
  transformation_priority: "interactive" # OPTIONAL | string | Interactive run type means that the query is executed as soon as possible, and these queries count towards concurrent rate limit and daily limit. Read more about interactive run type <a href="https://cloud.google.com/bigquery/docs/running-queries#queries">here</a>. Batch queries are queued and started as soon as idle resources are available in the BigQuery shared resource pool, which usually occurs within a few minutes. Batch queries don’t count towards your concurrent rate limit. Read more about batch queries <a href="https://cloud.google.com/bigquery/docs/running-queries#batch">here</a>. The default "interactive" value is used if not set explicitly.
  big_query_client_buffer_size_mb: 15 # OPTIONAL | integer | Google BigQuery client's chunk (buffer) size (MIN=1, MAX = 15) for each table. The size that will be written by a single RPC. Written data will be buffered and only flushed upon reaching this size or closing the channel. The default 15MB value is used if not set explicitly. Read more <a href="https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html">here</a>. | Example: 15
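
For anyone hitting the same error, here is a minimal sketch of a check for this specific mistake that could be run before octavia apply. It is not part of Octavia or Airbyte: the helper name misplaced_gcs_keys is hypothetical, and the dicts below are assumed to stand in for the parsed configuration section of the YAML file (secrets replaced by placeholders).

```python
# Hypothetical pre-check, not an Octavia or Airbyte API.
# It only looks for the indentation mistake described above.

GCS_KEYS = ("gcs_bucket_name", "gcs_bucket_path")


def misplaced_gcs_keys(configuration: dict) -> list:
    """Return the GCS keys found under loading_method.credential
    that are missing from loading_method itself."""
    loading_method = configuration.get("loading_method") or {}
    credential = loading_method.get("credential") or {}
    return [
        key
        for key in GCS_KEYS
        if key in credential and key not in loading_method
    ]


# Shape of the configuration from the question (keys nested too deep).
broken = {
    "loading_method": {
        "method": "GCS Staging",
        "credential": {
            "credential_type": "HMAC_KEY",
            "hmac_key_secret": "<placeholder>",
            "hmac_key_access_id": "<placeholder>",
            "gcs_bucket_name": "airbyte-octavia-thibaut-gcs",
            "gcs_bucket_path": "gcs",
        },
    }
}

# Shape of the corrected configuration (keys directly under loading_method).
fixed = {
    "loading_method": {
        "method": "GCS Staging",
        "credential": {
            "credential_type": "HMAC_KEY",
            "hmac_key_secret": "<placeholder>",
            "hmac_key_access_id": "<placeholder>",
        },
        "gcs_bucket_name": "airbyte-octavia-thibaut-gcs",
        "gcs_bucket_path": "gcs",
    }
}

print(misplaced_gcs_keys(broken))  # ['gcs_bucket_name', 'gcs_bucket_path']
print(misplaced_gcs_keys(fixed))   # []
```

To run it against the real file, the YAML would first have to be loaded into a dict with a YAML parser, and the ${...} environment variable references would remain plain strings unless substituted beforehand.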