BigQuerySource
BigQuerySources hold the connection information to Google BigQuery instances.
BigQuery Authentication
You can authenticate BigQuery in one of two ways:
1. Pass a base64 encoded service account key to the credentials_base64 field.
2. Set the absolute file path to the credentials file in an environment variable named GOOGLE_APPLICATION_CREDENTIALS.
Encoding the service account key JSON file to base64 lets you authenticate BigQuery without logging into the Google Cloud Console each time, and it makes credentials easier to manage in CI/CD pipelines.
However, using base64 encoding requires a few extra steps:
- Create a Google Cloud Service Account
  - Go to the Google Cloud Console
  - Select your project
  - Navigate to "IAM & Admin" > "Service Accounts"
  - Click "Create Service Account"
  - Give it a name and description
  - Grant it the "BigQuery Admin" role (or more restrictive custom role)
  - Click "Done"
- Create and download credentials
  - Find your service account in the list
  - Click the three dots menu > "Manage keys"
  - Click "Add Key" > "Create new key"
  - Choose JSON format
  - Click "Create" - this downloads your credentials file
- Convert credentials to base64
# On Linux/Mac
python -m base64 < credentials.json > encoded.txt
# On Windows PowerShell
[Convert]::ToBase64String([System.IO.File]::ReadAllBytes("credentials.json")) > encoded.txt
- Use the contents of encoded.txt as your credentials_base64 value. You can store the single-line key in your untracked env file and use the {{ env_var('VAR_NAME') }} syntax to reference the environment variable in your Visivo config.
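The encoding step can also be done from a short Python script. The sketch below is only an illustration, not part of Visivo: the function names `encode_key_bytes` and `encode_key_file` are hypothetical, and `credentials.json` is assumed to be the key file downloaded in the steps above. It uses `base64.b64encode`, which returns the value without line breaks, which is convenient when the result has to fit on one line of an env file (command-line encoders can wrap long output across several lines).

```python
import base64
import json
from pathlib import Path


def encode_key_bytes(raw: bytes) -> str:
    """Return the service account key as a single base64 line."""
    # Parse first so a truncated or wrong file fails here, not at query time.
    json.loads(raw)
    # b64encode does not insert line breaks, so the result is one line.
    return base64.b64encode(raw).decode("ascii")


def encode_key_file(path: str) -> str:
    """Read a key file from disk and encode it (hypothetical helper)."""
    return encode_key_bytes(Path(path).read_bytes())


if __name__ == "__main__":
    # "credentials.json" is assumed to be the downloaded key file.
    print(encode_key_file("credentials.json"))
```

Redirecting the printed value into your untracked env file keeps the key out of version control.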
If you use gcloud locally, you probably have the GOOGLE_APPLICATION_CREDENTIALS environment variable configured already.
Run echo $GOOGLE_APPLICATION_CREDENTIALS in your terminal. If it returns the path to your credentials file, then you're all set and can configure a BigQuerySource without the credentials_base64 field.
If you don't have the environment variable, follow these steps:
- Create a Google Cloud Service Account
  - Go to the Google Cloud Console
  - Select your project
  - Navigate to "IAM & Admin" > "Service Accounts"
  - Click "Create Service Account"
  - Give it a name and description
  - Grant it the "BigQuery Admin" role (or more restrictive custom role)
  - Click "Done"
- Create and download credentials
  - Find your service account in the list
  - Click the three dots menu > "Manage keys"
  - Click "Add Key" > "Create new key"
  - Choose JSON format
  - Click "Create" - this downloads your credentials file
- Set the environment variable
You can set the environment variable in your shell profile file:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/credentials.json"
or in your untracked .env file:
GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/credentials.json
This method is easier to manage and does not require any extra steps to authenticate.
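If you want more than `echo` to confirm the variable is usable, a small check like the following may help. This is a minimal sketch under stated assumptions: `check_credentials_path` is a hypothetical helper, not a Visivo or Google API, and it only verifies that the variable is set, holds an absolute path, and points at a file containing valid JSON.

```python
import json
import os
from pathlib import Path


def check_credentials_path(env: dict) -> list:
    """Return a list of problems; an empty list means the variable looks usable."""
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]

    problems = []
    path = Path(value)
    if not path.is_absolute():
        problems.append("path is not absolute")
    if not path.is_file():
        problems.append("file does not exist")
        return problems

    try:
        # The key file is expected to be JSON; this does not validate the key itself.
        json.loads(path.read_text())
    except ValueError:
        problems.append("file is not valid JSON")
    return problems


if __name__ == "__main__":
    for line in check_credentials_path(dict(os.environ)) or ["looks OK"]:
        print(line)
```

Passing the environment in as a dictionary keeps the check easy to run against a parsed .env file as well as the live shell environment.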
The service account needs at minimum the "BigQuery User" role to execute queries. For more restricted access, you can create a custom role with just the required permissions:
- bigquery.jobs.create
- bigquery.tables.get
- bigquery.tables.getData
- bigquery.tables.list
Example
sources:
  - name: bigquery_source
    type: bigquery
    project: my-project-id
    database: my_dataset
    credentials_base64: {{ env_var('BIGQUERY_BASE64_ENCODED_CREDENTIALS') }}
Note: Recommended environment variable use is covered in the sources overview.
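Before running a project with the configuration above, it can be useful to confirm that the environment variable really holds a base64-encoded service account key. The following is a hedged sketch, not how Visivo itself loads credentials: `validate_encoded_key` and `REQUIRED_FIELDS` are hypothetical names, and the field list is an assumption based on the fields normally present in a downloaded service account key JSON file.

```python
import base64
import json
import os

# Assumed subset of the fields found in a service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def validate_encoded_key(encoded: str) -> dict:
    """Decode a base64 key and return the parsed JSON, or raise ValueError."""
    try:
        # binascii.Error and JSONDecodeError are both ValueError subclasses.
        info = json.loads(base64.b64decode(encoded))
    except ValueError as exc:
        raise ValueError("value is not base64-encoded JSON") from exc

    if not isinstance(info, dict):
        raise ValueError("decoded value is not a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in info]
    if missing:
        raise ValueError(f"key is missing fields: {missing}")
    if info["type"] != "service_account":
        raise ValueError("key type is not service_account")
    return info


if __name__ == "__main__":
    # Variable name taken from the example configuration above.
    key = validate_encoded_key(os.environ["BIGQUERY_BASE64_ENCODED_CREDENTIALS"])
    print("key belongs to", key["client_email"])
```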
Attributes
| Field | Type | Default | Description |
|---|---|---|---|
| path | string | None | A unique path to this object |
| name | string | None | The unique name of the object across the entire project. |
| file_path | string | None | The path to the file that contains the object definition. |
| host | string | None | The host url of the database. |
| port | integer | None | The port of the database. |
| database | string | None | The default BigQuery dataset to use for queries. |
| username | string | None | Username for the database. |
| password | string | None | Password corresponding to the username. |
| db_schema | string | None | The schema that the Visivo project will use in queries. |
| after_connect | string | None | |
| project | string | None | The Google Cloud project ID that contains your BigQuery dataset. |
| credentials_base64 | string | None | The Google Cloud service account credentials JSON string base64 encoded. Turn your JSON into a base64 string in the command line with python -m base64 < credentials.json > encoded.txt. Not required if GOOGLE_APPLICATION_CREDENTIALS environment variable is set. |
| type | string | None | |
| connection_pool_size | integer | 8 | The pool size that is used for this connection. |