Official
Premium
GCP
The GCP Source plugin for CloudQuery extracts configuration from a variety of GCP APIs and loads it into any supported CloudQuery destination
Publisher
cloudquery
Latest version
v17.2.0
Type
Source
Price per 1M rows
Starting from $15
monthly free quota
1M rows
Set up process #
1. Download CLI and login
brew install cloudquery/tap/cloudquery
2. Create source and destination configs
Overview #
The GCP Source plugin for CloudQuery extracts configuration from a variety of GCP APIs and loads it into any supported CloudQuery destination (e.g. PostgreSQL, BigQuery, Snowflake, and more).
Libraries in Use #
Authentication #
The GCP plugin authenticates using your Application Default Credentials (ADC). The available options are the same as those described in detail in the Application Default Credentials documentation:
Local Environment:
- gcloud auth application-default login (recommended when running locally)
Google Cloud cloud-based development environment:
- When you run on Cloud Shell or Cloud Code credentials are already available.
Google Cloud containerized environment:
- When running on GKE use workload identity.
- Services such as Compute Engine, App Engine and functions support attaching a user-managed service account, which CloudQuery will be able to utilize.
On-premises or another cloud provider:
- The suggested way is to use Workload identity federation.
- If that is not available, you can use service account keys and export the location of the key via GOOGLE_APPLICATION_CREDENTIALS. This is strongly discouraged, as long-lived keys are a security risk.
Query Examples: #
Find all buckets without uniform bucket-level access #
select project_id, name from gcp_storage_buckets where uniform_bucket_level_access->>'Enabled' = 'false';
Configuration #
GCP Source Plugin Configuration Reference
Example #
This example connects a single GCP project to a Postgres destination. The (top level) source spec section is described in the Source Spec Reference.
kind: source
spec:
  # Source spec section
  name: "gcp"
  path: "cloudquery/gcp"
  registry: "cloudquery"
  version: "v17.2.0"
  tables: ["gcp_storage_buckets"]
  destinations: ["postgresql"]
  # GCP Spec
  # Learn more about the configuration options at https://cql.ink/gcp_source
  spec:
    project_ids: ["my-project"]
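The source spec above refers to a destination named "postgresql", which needs its own config. A minimal sketch of a matching destination spec might look like the following. It assumes the PostgreSQL destination plugin's connection_string option and an environment variable named POSTGRESQL_CONNECTION_STRING (a hypothetical name); the version is left as a placeholder because it is not given on this page.

```yaml
kind: destination
spec:
  # Must match the name listed under "destinations" in the source spec.
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  # Placeholder: replace with the destination plugin version you intend to use.
  version: "vX.Y.Z"
  spec:
    # Assumption: the connection string is supplied through an environment
    # variable (hypothetical name) rather than hard-coded in the file.
    connection_string: "${POSTGRESQL_CONNECTION_STRING}"
```

Both documents can live in the same file separated by "---", as in the GCP + Kubernetes example later on this page.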
GCP Spec #
This is the (nested) spec used by the GCP Source Plugin.
- project_ids ([]string) (default: empty. will use all projects available to the current authenticated account)
  Specify projects to connect to. If either folder_ids or project_filter is specified, these projects will be synced in addition to the projects from the folder/filter.

- service_account_key_json (string) (default: empty)
  GCP service account key content. Using service accounts is not recommended, but if it is used it is better to use environment or file variable substitution.

- folder_ids ([]string) (default: empty)
  CloudQuery will sync from all the projects in the specified folders, recursively. folder_ids must be of the format folders/<folder_id> or organizations/<organization_id>. This feature requires the resourcemanager.folders.list permission.
  By default, CloudQuery will also sync from sub-folders recursively (up to depth 100). To reduce this, set folder_recursion_depth to a lower value (or to 0 to disable recursion completely).
  Mutually exclusive with project_filter.
  If you specify * then all folders in all organizations will be synced.

- folder_recursion_depth (integer) (default: 100)
  The maximum depth to recurse into sub-folders. 0 means no recursion (only the top-level projects in folders will be used for sync).

- project_filter (string) (default: empty)
  A filter to determine the projects that are synced, mutually exclusive with folder_ids.
  For instance, to only sync projects where the name starts with how-, set project_filter to name:how-*.
  More examples:
  - "name:how-* OR name:test-*" matches projects starting with how- or test-
  - "NOT name:test-*" matches all projects not starting with test-

- organization_ids ([]string) (default: empty. will use all organizations available to the current authenticated account)
  Specify organizations to use when syncing organization level resources (e.g. folders or security findings). If organization_filter is specified, these organizations will be used in addition to the organizations from the filter.

- organization_filter (string) (default: empty)
  A filter to determine the organizations to use when syncing organization level resources (e.g. folders or security findings).
  For instance, to use only organizations from the cloudquery.io domain, set organization_filter to domain:cloudquery.io.
  For syntax and example queries refer to API Reference here.

- backoff_retries (integer) (default: 5)
  Maximum number of retries to make when rate limited.

- backoff_delay (integer) (default: 30)
  Maximum delay in seconds between retries when rate limited.

- enabled_services_only (boolean) (default: false)
  If enabled CloudQuery will skip any resources that belong to a service that has been disabled or not been enabled.
  If you use this option on a large organization (with more than 500 projects) you should also set the backoff_retries to a value greater than 0, otherwise you may hit the API rate limits.
  In >=v9.0.0 if an error is returned then CloudQuery will assume that all services are enabled and will continue to attempt to sync all specified tables rather than just ending the sync.

- concurrency (integer) (default: 50000)
  The best effort maximum number of Go routines to use. Lower this number to reduce memory usage.

- discovery_concurrency (integer) (default: 100)
  The number of concurrent requests that CloudQuery will make to resolve enabled services. This is only used when enabled_services_only is set to true.

- scheduler (string) (default: round-robin)
  The scheduler to use when determining the priority of resources to sync. Supported values are dfs (depth-first search), round-robin, shuffle and shuffle-queue.
  For more information about this, see performance tuning.

- service_account_impersonation (Service Account Impersonation spec, optional. Default: empty)
  Service Account impersonation configuration.

- table_options (map) (default: not used)
  Table options is a premium feature. Even if some tables are free, syncing data for them (& their relations) using table options counts towards paid usage.
  Please refer to the Table Options documentation for more information.
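To show how several of these options combine, here is a sketch of a source spec that syncs one folder with limited recursion and tuned rate limiting. The folder ID, project name and the chosen values are hypothetical and only illustrate the option names documented above; they are not recommended settings.

```yaml
kind: source
spec:
  name: "gcp"
  path: "cloudquery/gcp"
  registry: "cloudquery"
  version: "v17.2.0"
  tables: ["gcp_storage_buckets"]
  destinations: ["postgresql"]
  spec:
    # Hypothetical folder ID; replace with your own.
    # folder_ids is mutually exclusive with project_filter.
    folder_ids: ["folders/123456789012"]
    # Recurse at most one level into sub-folders (0 disables recursion).
    folder_recursion_depth: 1
    # Synced in addition to the projects discovered through folder_ids.
    project_ids: ["my-project"]
    # Skip resources whose service is not enabled in a project.
    enabled_services_only: true
    # Keep retries above 0 when enabled_services_only is used on many projects.
    backoff_retries: 5
    # Maximum delay in seconds between retries when rate limited.
    backoff_delay: 30
    # Illustrative value below the default (50000) to reduce memory usage.
    concurrency: 10000
    # One of: dfs, round-robin, shuffle, shuffle-queue.
    scheduler: "shuffle"
```

To select projects by name instead, the folder_ids and folder_recursion_depth lines could be replaced with a project_filter such as "name:how-*".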
Service Account Impersonation Spec #
- target_principal (string) (required)
  The email address of the service account to impersonate.

- scopes ([]string) (default: ["https://www.googleapis.com/auth/cloud-platform"])
  Scopes that the impersonated credential should have. See available scopes in the documentation.

- delegates ([]string) (default: empty)
  Delegates are the service account email addresses in a delegation chain. Each service account must be granted roles/iam.serviceAccountTokenCreator on the next service account in the chain.

- subject (string) (default: empty)
  The subject field of a JWT (sub). This field should only be set if you wish to impersonate a user. This feature is useful when using domain wide delegation.
GCP + Kubernetes (GKE) #
kind: source
spec:
  name: gcp
  path: "cloudquery/gcp"
  registry: cloudquery
  version: "v17.2.0"
  tables: ["gcp_container_clusters"]
  destinations: ["<destination>"]
---
kind: source
spec:
  name: k8s
  path: "cloudquery/k8s"
  registry: cloudquery
  version: "v7.2.2"
  tables: ["*"]
  destinations: ["<destination>"]
Kubernetes users may see the following message when running the K8s plugin on GKE Clusters:
WARNING: the gcp auth plugin is deprecated in v1.22+, unavailable in v1.26+; use gcloud instead.
As part of an initiative to remove platform specific code from Kubernetes, authentication will begin to be delegated to authentication plugins, starting in version 1.26.
What does this mean for CloudQuery users? #
CloudQuery does not use any specific resources which hinder the upgrade.
Install #
The easiest way to upgrade is to install gke-gcloud-auth-plugin from gcloud components on Mac or Windows:
gcloud components install gke-gcloud-auth-plugin
and with apt on Deb based systems:
sudo apt-get install google-cloud-sdk-gke-gcloud-auth-plugin
Verify #
Mac or Linux:
gke-gcloud-auth-plugin --version
Windows:
gke-gcloud-auth-plugin.exe --version
Switch authentication methods #
Set the flag:
export USE_GKE_GCLOUD_AUTH_PLUGIN=True
Update components:
gcloud components update
Force credential update:
gcloud container clusters get-credentials "${CLUSTER_NAME}"
Now you should be able to use kubectl as normal, and you should no longer see the warning in the CloudQuery output.
For more information, read Google's press release.
Table options #
This feature enables users to override the default options for specific tables.
The root of the object takes a table name, and the next level takes an API method name.
The final level is the actual input object as defined by the API.
The format of the table_options object is as follows:
table_options:
  <table_name>:
    <api_method_name>:
      - <input_object>
A list of <input_object> objects should be provided.
The plugin will iterate through these to make multiple API calls.
This is useful for APIs like the Compute AggregatedListInstances method that only supports a single filter per call. For example:
table_options:
  gcp_compute_instances:
    aggregated_list_instances:
      - include_all_scopes: true
        filter: '(cpuPlatform = "Intel Skylake") AND (scheduling.automaticRestart = true)'
      - include_all_scopes: false
        filter: '(cpuPlatform = "Intel Broadwell") AND (scheduling.automaticRestart = true)'
The following tables and APIs are supported:
table_options:
  gcp_compute_instances:
    aggregated_list_instances:
      - <Compute.AggregatedListInstancesRequest> # PageToken, MaxResults and Project are prohibited
The full list of supported options is documented under the Table Options section of each table in the GCP plugin tables documentation.
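Since table_options is part of the nested GCP spec, a complete source config using it might be laid out as in the sketch below. The project name and the filter expression are illustrative assumptions; check the filter syntax against the Compute API before relying on it.

```yaml
kind: source
spec:
  name: "gcp"
  path: "cloudquery/gcp"
  registry: "cloudquery"
  version: "v17.2.0"
  tables: ["gcp_compute_instances"]
  destinations: ["postgresql"]
  spec:
    project_ids: ["my-project"]
    # table_options sits at the same level as the other GCP spec options.
    table_options:
      gcp_compute_instances:
        aggregated_list_instances:
          # Illustrative filter: one API call is made per list entry.
          - filter: 'status = "RUNNING"'
```

Keep in mind that syncing with table options counts towards paid usage, as noted in the GCP Spec section.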