BigQuery Meltano loader

The target-bigquery Meltano loader loads data into BigQuery after it has been pulled from a source by an extractor.

Repository: https://github.com/adswerve/target-bigquery

Maintainer: Adswerve

Alternative variants #

Multiple variants of target-bigquery are available. This document describes the default adswerve variant, which is recommended for new users.

Alternative variants are:

transferwise

Getting Started #

Prerequisites #

If you haven't already, follow the initial steps of the Getting Started guide.

Installation and configuration #

Using the Command Line Interface #

Add the target-bigquery loader to your project using meltano add:
```
meltano add loader target-bigquery
```
Configure the settings below using meltano config.

Next steps #

Follow the remaining steps of the Getting Started guide:

Run a data integration (EL) pipeline

If you run into any issues, learn how to get help.

Capabilities #

Settings #

target-bigquery requires the configuration of the following settings:

project_id
dataset_id
location
credentials_path

These and other supported settings are documented below.
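As a rough sketch, the four required settings can be supplied through the environment variables documented below. The project ID and dataset name here are placeholders, and the credentials path assumes client_secrets.json sits in the current directory; none of these values come from the loader itself.

```shell
# Placeholder values; replace each with your own.
export TARGET_BIGQUERY_PROJECT_ID="my-gcp-project"   # hypothetical project ID
export TARGET_BIGQUERY_DATASET_ID="analytics_raw"    # hypothetical dataset name
export TARGET_BIGQUERY_LOCATION="US"                 # documented default
export TARGET_BIGQUERY_CREDENTIALS_PATH="$PWD/client_secrets.json"  # assumed file location

# Count how many of the required settings are still empty.
missing=0
for name in TARGET_BIGQUERY_PROJECT_ID TARGET_BIGQUERY_DATASET_ID \
            TARGET_BIGQUERY_LOCATION TARGET_BIGQUERY_CREDENTIALS_PATH; do
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "missing required setting: $name" >&2
    missing=$((missing + 1))
  fi
done
echo "missing required settings: $missing"
```

Running a check like this before a pipeline can surface an unset variable early instead of partway through a load.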

Project ID (`project_id`) #

Environment variable: TARGET_BIGQUERY_PROJECT_ID

ID of the BigQuery project.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set project_id <project_id>

export TARGET_BIGQUERY_PROJECT_ID=<project_id>
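Every setting on this page pairs a setting name with an environment variable of the form TARGET_BIGQUERY_ followed by the upper-cased name. The helper below, setting_env_var, is a hypothetical illustration of that pattern and is not part of Meltano.

```shell
# Derive the environment variable name for a target-bigquery setting.
# setting_env_var is a hypothetical helper written for this example.
setting_env_var() {
  printf 'TARGET_BIGQUERY_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}

setting_env_var project_id     # prints TARGET_BIGQUERY_PROJECT_ID
setting_env_var table_prefix   # prints TARGET_BIGQUERY_TABLE_PREFIX
```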

Dataset ID (`dataset_id`) #

Environment variable: TARGET_BIGQUERY_DATASET_ID
Default: $MELTANO_EXTRACT__LOAD_SCHEMA

BigQuery dataset.

The default value will expand to the value of the load_schema extra for the extractor used in the pipeline, which defaults to the extractor’s namespace, e.g. tap_gitlab for tap-gitlab.
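The sketch below mirrors the tap-gitlab example. In Meltano the namespace is declared in the extractor's plugin definition rather than computed, so the hyphen-to-underscore substitution here is only an illustration of the example above, not the actual lookup.

```shell
extractor="tap-gitlab"

# Illustration only: reproduce the example by swapping hyphens for underscores.
default_dataset=$(printf '%s' "$extractor" | sed 's/-/_/g')
echo "$default_dataset"   # prints tap_gitlab

# An explicitly set dataset_id takes the place of the default.
dataset_id="${TARGET_BIGQUERY_DATASET_ID:-$default_dataset}"
echo "dataset used for the load: $dataset_id"
```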

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set dataset_id <dataset_id>

export TARGET_BIGQUERY_DATASET_ID=<dataset_id>

Location (`location`) #

Environment variable: TARGET_BIGQUERY_LOCATION
Default: US

Dataset Location. See https://cloud.google.com/bigquery/docs/locations.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set location <location>

export TARGET_BIGQUERY_LOCATION=<location>

Credentials Path (`credentials_path`) #

Environment variable: TARGET_BIGQUERY_CREDENTIALS_PATH
Default: $MELTANO_PROJECT_ROOT/client_secrets.json

Fully qualified path to client_secrets.json for your service account.

See the “Activate the Google BigQuery API” section of the repository’s README and https://cloud.google.com/docs/authentication/production.

By default, this file is expected to be at the root of your project directory.
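Because the setting expects a fully qualified path to a readable file, a quick pre-flight check can catch a wrong path before a pipeline run. The function check_credentials is a hypothetical helper, not something Meltano or target-bigquery provides.

```shell
# check_credentials is a hypothetical helper written for this example.
# It succeeds only if the path is absolute and the file is readable.
check_credentials() {
  path="$1"
  case "$path" in
    /*) ;;
    *) echo "credentials path is not absolute: $path" >&2; return 1 ;;
  esac
  if [ ! -r "$path" ]; then
    echo "credentials file not readable: $path" >&2
    return 1
  fi
  return 0
}

# Fall back to the documented default location when the variable is unset.
default_path="${MELTANO_PROJECT_ROOT:-$PWD}/client_secrets.json"
check_credentials "${TARGET_BIGQUERY_CREDENTIALS_PATH:-$default_path}" \
  || echo "fix the credentials path before running the pipeline" >&2
```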

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set credentials_path <credentials_path>

export TARGET_BIGQUERY_CREDENTIALS_PATH=<credentials_path>

Validate Records (`validate_records`) #

Environment variable: TARGET_BIGQUERY_VALIDATE_RECORDS
Default: false

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set validate_records true

export TARGET_BIGQUERY_VALIDATE_RECORDS=true

Add Metadata Columns (`add_metadata_columns`) #

Environment variable: TARGET_BIGQUERY_ADD_METADATA_COLUMNS
Default: false

Whether to add the _time_extracted and _time_loaded metadata columns.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set add_metadata_columns true

export TARGET_BIGQUERY_ADD_METADATA_COLUMNS=true

Replication Method (`replication_method`) #

Environment variable: TARGET_BIGQUERY_REPLICATION_METHOD
Options: append, truncate
Default: append

Replication method: either append or truncate.
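Since only two values are listed, a small guard can reject a mistyped value before it reaches the loader. The function is_valid_replication_method is a hypothetical helper for illustration; how the loader itself reacts to an unsupported value is not covered here.

```shell
# is_valid_replication_method is a hypothetical helper written for this example.
is_valid_replication_method() {
  case "$1" in
    append|truncate) return 0 ;;
    *) return 1 ;;
  esac
}

# Use the documented default (append) when the variable is unset.
method="${TARGET_BIGQUERY_REPLICATION_METHOD:-append}"
if is_valid_replication_method "$method"; then
  echo "replication_method=$method"
else
  echo "unsupported replication_method: $method" >&2
fi
```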

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set replication_method append

export TARGET_BIGQUERY_REPLICATION_METHOD=append

Table Prefix (`table_prefix`) #

Environment variable: TARGET_BIGQUERY_TABLE_PREFIX

Add prefix to table name

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set table_prefix <table_prefix>

export TARGET_BIGQUERY_TABLE_PREFIX=<table_prefix>

Table Suffix (`table_suffix`) #

Environment variable: TARGET_BIGQUERY_TABLE_SUFFIX

Add suffix to table name

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set table_suffix <table_suffix>

export TARGET_BIGQUERY_TABLE_SUFFIX=<table_suffix>
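To show how table_prefix and table_suffix might combine, the sketch below assumes the loader joins prefix, stream name, and suffix directly with no separator. That joining rule is an assumption to verify against the repository, and table_name is a hypothetical helper.

```shell
# table_name is a hypothetical helper written for this example.
# Assumption: prefix, stream name, and suffix are concatenated as-is.
table_name() {
  printf '%s%s%s\n' "$1" "$2" "$3"
}

table_name "raw_" "issues" "_v1"   # prints raw_issues_v1 under the stated assumption
table_name "" "issues" ""          # prints issues when neither setting is used
```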

Max Cache (`max_cache`) #

Environment variable: TARGET_BIGQUERY_MAX_CACHE
Default: 50

Maximum cache size in MB

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set max_cache <max_cache>

export TARGET_BIGQUERY_MAX_CACHE=<max_cache>

Merge State Messages (`merge_state_messages`) #

Environment variable: TARGET_BIGQUERY_MERGE_STATE_MESSAGES
Default: false

Whether to merge multiple state messages from the tap into the state file, or to use only the last state message as the state file. Setting this to true is not recommended when running under Meltano, because the merge behavior conflicts with Meltano’s own merge process.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set merge_state_messages true

export TARGET_BIGQUERY_MERGE_STATE_MESSAGES=true

Table Config (`table_config`) #

Environment variable: TARGET_BIGQUERY_TABLE_CONFIG

A path to a file containing the definition of partitioning and clustering.

How to use #

Manage this setting using meltano config or an environment variable:

meltano config target-bigquery set table_config <table_config>

export TARGET_BIGQUERY_TABLE_CONFIG=<table_config>
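As a hedged sketch, the snippet below writes a small table config file and points the setting at it. The JSON layout (a streams object whose entries carry partition_field and cluster_fields) is an assumption based on the repository's README and should be checked there; the stream and column names are made up for illustration.

```shell
# Write an example table config to a temporary file.
# Assumed layout: "streams" -> stream name -> partition_field / cluster_fields.
config_file="$(mktemp)"
cat > "$config_file" <<'EOF'
{
  "streams": {
    "issues": {
      "partition_field": "updated_at",
      "cluster_fields": ["project_id", "state"]
    }
  }
}
EOF

# Point the loader at the file through the documented environment variable.
export TARGET_BIGQUERY_TABLE_CONFIG="$config_file"
echo "table config written to $TARGET_BIGQUERY_TABLE_CONFIG"
```

In a real project the file would more likely live alongside meltano.yml than in a temporary location.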

Looking for help? #

If you're having trouble getting the target-bigquery loader to work, look for an existing issue in its repository, file a new issue, or join the Meltano Slack community and ask for help in the #plugins-general channel.

