Google Analytics

The CloudQuery Google Analytics plugin lets you run custom Google Analytics Data API v1 reports and load the results into any supported CloudQuery destination.

Publisher: cloudquery
Latest version: v4.4.2
Type: Source
Price per 1M rows: Starting from $15
Monthly free quota: 1M rows

Set up process #


1. Download CLI and log in
   brew install cloudquery/tap/cloudquery
   See installation options.

2. Create source and destination configs
   See Plugin configuration.

3. Run the sync
   cloudquery sync googleanalytics.yml postgresql.yml
   See CloudQuery sync.
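
Step 2 needs a destination config next to the source config. The sketch below is one possible postgresql.yml, matching the file name used in the sync command. It assumes that the PostgreSQL destination plugin takes a connection_string option and that the CLI expands ${VAR} references in config files; the version placeholder and the PG_CONNECTION_STRING variable name are hypothetical.

```yaml
kind: destination
spec:
  name: postgresql                  # matches destinations: ["postgresql"] in the source config
  path: cloudquery/postgresql
  registry: cloudquery
  version: "<POSTGRESQL_PLUGIN_VERSION>"  # placeholder; use a released version
  spec:
    connection_string: "${PG_CONNECTION_STRING}"  # hypothetical variable name
```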

Overview #

The CloudQuery Google Analytics plugin lets you run custom Google Analytics Data API v1 reports and load the results into any supported CloudQuery destination (e.g. PostgreSQL, BigQuery, Snowflake, and more).

Authentication #

Two methods are supported: OAuth 2.0 and Application Default Credentials.

OAuth 2.0 #

The following options are available when using OAuth:
  • Using an existing access token
    This token should be authorized for the https://www.googleapis.com/auth/analytics.readonly scope (e.g. by using the OAuth 2.0 Playground).
  • Using OAuth client ID & client secret
    You can get your own OAuth credentials using this guide.

Application Default Credentials #

Note: You still need to authorize these credentials for the https://www.googleapis.com/auth/analytics.readonly scope.
The available options are the same as those described here in detail:
  • Local environment
    See this guide for local environment to get started. The final step is to run:
      gcloud auth application-default login \
        --scopes=https://www.googleapis.com/auth/analytics.readonly \
        --client-id-file=[PATH/TO/credentials.json]
  • Google Cloud cloud-based development environment
    When you run on Cloud Shell or Cloud Code, credentials are already available.
  • Google Cloud containerized environment
    When running on GKE, use workload identity.
  • Google Cloud services that support attaching a service account
    Services such as Compute Engine, App Engine, and functions support attaching a user-managed service account, which CloudQuery will be able to use. You can find out more here.
  • On-premises or another cloud provider
    The suggested way is to use workload identity federation. If that is not available, you can use service account keys and export the location of the key file via GOOGLE_APPLICATION_CREDENTIALS. This is not recommended, as long-lived keys present a security risk.
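
Example source configuration that authenticates with an existing OAuth access token and syncs to the postgresql destination:

kind: source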
# Common source-plugin configuration
spec:
  name: googleanalytics
  path: cloudquery/googleanalytics
  registry: cloudquery
  version: "v4.4.2"
  tables: ["*"]
  destinations: ["postgresql"]
  backend_options:
    table_name: "cq_state_googleanalytics"
    connection: "@@plugins.postgresql.connection"

  # Google Analytics specific configuration
  # Learn more about the configuration options at https://cql.ink/googleanalytics_source
  spec:
    property_id: "<YOUR_PROPERTY_ID_HERE>"
    oauth:
      access_token: "<YOUR_OAUTH_ACCESS_TOKEN>"
    reports:
      - name: example
        dimensions:
          - date
          - language
          - country
          - city
          - browser
          - operatingSystem
          - year
          - month
          - hour
        metrics:
          - name: totalUsers
          - name: new_users
            expression: newUsers
          - name: new_users2
            expression: "newUsers + totalUsers"
            invisible: true
        keep_empty_rows: true

Configuration #

This is the (nested) spec used by the Google Analytics source plugin:
  • property_id (string) (required)
    A Google Analytics GA4 property identifier whose events are tracked. To learn more, see where to find your Property ID.
    Supported formats:
    • A plain property ID (example: 1234)
    • Prefixed with properties/ (example: properties/1234)
  • reports ([]report) (required)
    Reports to be fetched from Google Analytics.
  • start_date (string) (optional) (default: date 7 days prior to the sync start)
    A date in YYYY-MM-DD format (example: 2023-05-15). If not specified, the start date is 7 days before the sync start date.
  • oauth (OAuth spec) (optional)
    OAuth spec for authorization in Google Analytics.
  • concurrency (integer) (optional) (default: 10000)
    The best-effort maximum number of goroutines to use. Lower this number to reduce memory usage.
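
As an illustration of the options above, the nested spec below sets every top-level option except oauth. It is a sketch under assumptions: the property ID, start date, concurrency value, and report contents are placeholders, and leaving out oauth is assumed to make the plugin fall back to Application Default Credentials.

```yaml
# Nested plugin spec only; it belongs under spec.spec of a `kind: source` config.
property_id: "properties/1234"   # prefixed format; a plain "1234" is also accepted
start_date: "2023-05-15"         # YYYY-MM-DD; defaults to 7 days before the sync start
concurrency: 1000                # lower than the default 10000 to reduce memory usage
# oauth is omitted here (assumption: Application Default Credentials are used instead)
reports:
  - name: example
    dimensions:
      - date
    metrics:
      - name: totalUsers
```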

Google Analytics OAuth spec #

OAuth spec to authenticate with Google Analytics. The Google Analytics Data API v1 requires OAuth authorization for the https://www.googleapis.com/auth/analytics.readonly scope to run reports.
  • access_token (string) (optional)
    An access token that you generated and authorized for the https://www.googleapis.com/auth/analytics.readonly scope (e.g., by using the OAuth 2.0 Playground).
  • client_id (string) (optional)
    OAuth 2.0 Client ID. Required if access_token is empty.
  • client_secret (string) (optional)
    OAuth 2.0 Client secret. Required if access_token is empty.
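
A minimal sketch of the second OAuth option, using a client ID and client secret instead of an access token. The environment variable names GA_OAUTH_CLIENT_ID and GA_OAUTH_CLIENT_SECRET are hypothetical, and the sketch assumes that the CloudQuery CLI expands ${VAR} references in configuration files.

```yaml
# Nested plugin spec only (goes under spec.spec of the source config).
property_id: "<YOUR_PROPERTY_ID_HERE>"
oauth:
  # access_token is left empty, so client_id and client_secret are both required.
  client_id: "${GA_OAUTH_CLIENT_ID}"          # hypothetical variable name
  client_secret: "${GA_OAUTH_CLIENT_SECRET}"  # hypothetical variable name
reports:
  - name: example
    metrics:
      - name: totalUsers
```

Reading the secret from the environment keeps it out of the config file.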

Google Analytics Report spec #

A report specification is transformed into a Google Analytics Data API v1 report. Its options are:
  • name (string) (required)
    Name of the report. The destination table name is the ga_ prefix followed by the report name in snake case (for example, a report named example is written to ga_example).
  • dimensions ([]string) (optional)
    A list of Google Analytics Data API v1 dimensions. At most 9 dimensions can be specified per report.
  • metrics ([]metric) (required)
    A list of Google Analytics Data API v1 metrics. Expressions are supported, too.
  • keep_empty_rows (boolean) (optional)
    Whether empty rows (rows in which all metrics equal 0) should be captured, too.

Google Analytics metric spec #

The metric spec is based on the Google Analytics Data API v1 Metric parameter.
  • name (string) (required)
    A name or alias (if expression is specified) of the requested metric.
  • expression (string) (optional)
    A mathematical expression for derived metrics.
  • invisible (boolean) (optional)
    Indicates if a metric is invisible in the report response. This allows creating more complex requests, while also not saving the intermediate results.
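
The report and metric options above could be combined as in the sketch below. The report name and the derived metric's alias are hypothetical; date, country, totalUsers, and eventCount are taken to be valid Data API v1 dimension and metric names. Following the naming rule for reports, this one should end up in a table named ga_traffic_by_country.

```yaml
reports:
  - name: traffic_by_country        # hypothetical report name
    dimensions:                     # at most 9 dimensions per report
      - date
      - country
    metrics:
      - name: totalUsers            # plain metric, requested by name
      - name: events_per_user       # alias for the derived metric below
        expression: "eventCount / totalUsers"
    keep_empty_rows: false
```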