BigQuerySource
BigQuerySources hold the connection information to Google BigQuery instances.
BigQuery Authentication
You can authenticate BigQuery in one of two ways:
1. Pass a base64-encoded service account key in the credentials_base64 field.
2. Set an environment variable named GOOGLE_APPLICATION_CREDENTIALS to the absolute path of the credentials file.
Encoding the service account key JSON file as base64 lets you authenticate BigQuery without logging into the Google Cloud Console each time, and makes credentials easier to manage in CI/CD pipelines.
However, base64 encoding requires a few extra steps:
- Create a Google Cloud Service Account
  - Go to the Google Cloud Console
  - Select your project
  - Navigate to "IAM & Admin" > "Service Accounts"
  - Click "Create Service Account"
  - Give it a name and description
  - Grant it the "BigQuery Admin" role (or a more restrictive custom role)
  - Click "Done"
- Create and download credentials
  - Find your service account in the list
  - Click the three dots menu > "Manage keys"
  - Click "Add Key" > "Create new key"
  - Choose JSON format
  - Click "Create" - this downloads your credentials file
- Convert credentials to base64
# On Linux/Mac
python -m base64 < credentials.json > encoded.txt
# On Windows PowerShell
[Convert]::ToBase64String([System.IO.File]::ReadAllBytes("credentials.json")) > encoded.txt
- Use the contents of encoded.txt as your credentials_base64 value. You can store the single-line key in your untracked env file and use the {{ env_var('VAR_NAME') }} syntax to reference the environment variable in your Visivo config.
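The conversion step can also be sketched in Python. This is a minimal illustration, not part of Visivo: the function names `encode_credentials` and `decode_credentials` are hypothetical. It uses `base64.b64encode`, which returns the encoded value on one line with no line breaks, and decodes it again so you can confirm the value round-trips before storing it.

```python
import base64
import json
from pathlib import Path


def encode_credentials(path: str) -> str:
    """Return the service account key file as a single-line base64 string."""
    raw = Path(path).read_bytes()
    json.loads(raw)  # fail early if the file is not valid JSON
    return base64.b64encode(raw).decode("ascii")


def decode_credentials(encoded: str) -> dict:
    """Decode a base64 string back into the credentials dictionary."""
    return json.loads(base64.b64decode(encoded))
```

Calling `encode_credentials("credentials.json")` gives a string suitable for an env file entry, and `decode_credentials` on that string should return the original JSON content.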
If you use gcloud locally, you probably have the GOOGLE_APPLICATION_CREDENTIALS environment variable configured already.
Run echo $GOOGLE_APPLICATION_CREDENTIALS in your terminal. If it prints the path to your credentials file, you're all set and can configure a BigQuerySource without the credentials_base64 field.
If you don't have the environment variable, follow these steps:
- Create a Google Cloud Service Account
  - Go to the Google Cloud Console
  - Select your project
  - Navigate to "IAM & Admin" > "Service Accounts"
  - Click "Create Service Account"
  - Give it a name and description
  - Grant it the "BigQuery Admin" role (or a more restrictive custom role)
  - Click "Done"
- Create and download credentials
  - Find your service account in the list
  - Click the three dots menu > "Manage keys"
  - Click "Add Key" > "Create new key"
  - Choose JSON format
  - Click "Create" - this downloads your credentials file
- Set the environment variable
You can set the environment variable in your shell profile file:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/credentials.json"
or in your untracked .env file:
GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/credentials.json
This method is easier to manage and does not require any extra steps to authenticate.
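To confirm the variable is set up as described, a small check like the following may help. It is only a sketch: `check_application_credentials` is a hypothetical helper, not a Visivo or Google function, and it verifies only that the variable holds an absolute path to a readable JSON file.

```python
import json
import os
from pathlib import Path


def check_application_credentials(env=None) -> Path:
    """Return the credentials path if GOOGLE_APPLICATION_CREDENTIALS is usable."""
    env = os.environ if env is None else env
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value)
    if not path.is_absolute():
        raise RuntimeError(f"expected an absolute path, got {value!r}")
    if not path.is_file():
        raise RuntimeError(f"credentials file not found: {path}")
    json.loads(path.read_text())  # raises ValueError if the file is not valid JSON
    return path
```

Passing a dictionary as `env` makes the check easy to try without touching your real environment. With no argument it reads `os.environ`.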
The service account needs at minimum the "BigQuery User" role to execute queries. For more restricted access, you can create a custom role with just the required permissions:
- bigquery.jobs.create
- bigquery.tables.get
- bigquery.tables.getData
- bigquery.tables.list
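When building a custom role, it can be handy to compare the permissions you granted against the list above. The sketch below assumes you already have the role's permissions as a list of strings (for example, copied from the role's details page). `missing_permissions` is a hypothetical helper and does not call any Google API.

```python
# The four permissions listed above as the minimum for a custom role.
REQUIRED_PERMISSIONS = frozenset({
    "bigquery.jobs.create",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
})


def missing_permissions(granted):
    """Return the required permissions absent from `granted`, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))
```

An empty result means the role covers every permission in the list.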
Example
sources:
  - name: bigquery_source
    type: bigquery
    project: my-project-id
    database: my_dataset
    credentials_base64: {{ env_var('BIGQUERY_BASE64_ENCODED_CREDENTIALS') }}
Note: Recommended environment variable use is covered in the sources overview.
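To make the `{{ env_var('VAR_NAME') }}` syntax concrete, here is a rough sketch of how such a placeholder could be substituted before the config is parsed. This is an assumption for illustration only and is not Visivo's actual implementation: `resolve_env_vars` and the regular expression are hypothetical.

```python
import os
import re

# Matches {{ env_var('NAME') }} with single or double quotes around NAME.
_ENV_VAR = re.compile(
    r"\{\{\s*env_var\(\s*(['\"])([A-Za-z_][A-Za-z0-9_]*)\1\s*\)\s*\}\}"
)


def resolve_env_vars(text, env=None):
    """Replace each env_var placeholder in `text` with the variable's value."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(2)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_VAR.sub(replace, text)
```

Raising an error for an unset variable, rather than substituting an empty string, is a design choice in this sketch so that a missing credential surfaces immediately.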
Attributes
Field | Type | Default | Description |
---|---|---|---|
path | string | None | A unique path to this object |
name | string | None | The unique name of the object across the entire project. |
host | string | None | The host url of the database. |
port | integer | None | The port of the database. |
database | string | None | The default BigQuery dataset to use for queries. |
username | string | None | Username for the database. |
password | string | None | Password corresponding to the username. |
db_schema | string | None | The schema that the Visivo project will use in queries. |
project | string | None | The Google Cloud project ID that contains your BigQuery dataset. |
credentials_base64 | string | None | The Google Cloud service account credentials JSON, base64 encoded. Convert your JSON file to base64 on the command line with python -m base64 < credentials.json > encoded.txt. Not required if the GOOGLE_APPLICATION_CREDENTIALS environment variable is set. |
type | string | None | |
connection_pool_size | integer | 8 | The pool size that is used for this connection. |