

The CloudQuery Azure source plugin extracts information from many Microsoft Azure services and loads it into any supported CloudQuery destination.



Price per 1M rows: starting from $15

Monthly free quota: 1M rows

Set up process #

1. Download CLI and log in

brew install cloudquery/tap/cloudquery

See installation options

2. Create source and destination configs

Plugin configuration

3. Run the sync

cloudquery sync azure.yml postgresql.yml

CloudQuery sync

Overview #

The CloudQuery Azure source plugin extracts information from many Microsoft Azure services and loads it into any supported CloudQuery destination (e.g. PostgreSQL, BigQuery, Snowflake, and more).

Authentication #

The Azure plugin uses DefaultAzureCredential to authenticate.
DefaultAzureCredential will attempt to authenticate via different mechanisms in order, stopping when one succeeds. The order is described in detail in the Azure SDK documentation.
To get started quickly with the Azure plugin, we recommend either using a service principal and exporting environment variables, or using az login. Using az login is strongly discouraged in production: it spawns a new Azure CLI process each time an authentication token is needed, which causes memory and performance issues.
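Before running a sync, it can help to check which of the two quick-start paths the current shell is set up for. The sketch below is a hypothetical helper (the function name cq_azure_auth_hint and its output strings are not part of CloudQuery or the Azure CLI). It only inspects the three service-principal variables described in the next section and does not contact Azure, so it cannot tell whether the credentials are valid.

```shell
# Hypothetical helper: reports whether the service-principal environment
# variables are all set in this shell. It does not validate the values.
cq_azure_auth_hint() {
  missing=""
  [ -n "${AZURE_TENANT_ID:-}" ] || missing="$missing AZURE_TENANT_ID"
  [ -n "${AZURE_CLIENT_ID:-}" ] || missing="$missing AZURE_CLIENT_ID"
  [ -n "${AZURE_CLIENT_SECRET:-}" ] || missing="$missing AZURE_CLIENT_SECRET"

  if [ -z "$missing" ]; then
    # All three variables are present: environment-based auth can be attempted.
    echo "service-principal-env"
  else
    # At least one is unset: DefaultAzureCredential would have to fall
    # through to a later mechanism, such as the az login session.
    echo "fallback (unset:$missing)"
  fi
}

cq_azure_auth_hint
```

If the helper prints "fallback" on a production host, that is a hint the sync may be relying on az login, which the paragraph above discourages.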

Authentication with Environment Variables #

You will need to create a service principal for the plugin to use:
Creating a service principal
First, install the Azure CLI (az).
Then, log in with the Azure CLI:
az login
Then, create the service principal the plugin will use to access your cloud deployment. WARNING: The output of az ad sp create-for-rbac contains credentials that you must protect, so handle it with appropriate care. This example uses bash; the commands for CMD and PowerShell are similar.
az account set --subscription $SUBSCRIPTION_ID
az provider register --namespace 'Microsoft.Security'

# Create a service-principal for the plugin
az ad sp create-for-rbac --name cloudquery-sp --scopes /subscriptions/$SUBSCRIPTION_ID --role Reader
You can choose any name you'd like for your service principal; cloudquery-sp is an example. If the service principal doesn't exist, the command creates a new one; otherwise it updates the existing one.
The output of az ad sp create-for-rbac should look like this:
{
  "appId": "YOUR AZURE_CLIENT_ID",
  "displayName": "cloudquery-sp",
  "password": "YOUR AZURE_CLIENT_SECRET",
  "tenant": "YOUR AZURE_TENANT_ID"
}
Exporting environment variables
Next, export the environment variables that the plugin will use to sync your cloud configuration, copying the values from the output of az ad sp create-for-rbac. The mapping below applies on Linux; exporting in CMD and PowerShell is similar.
  • AZURE_TENANT_ID is tenant in the JSON.
  • AZURE_CLIENT_ID is appId in the JSON.
  • AZURE_CLIENT_SECRET is password in the JSON.
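
The mapping above can be sketched as three bash export commands. The angle-bracket values are placeholders, not real credentials; replace each one with the corresponding field from your own az ad sp create-for-rbac output.

```shell
# Placeholders only: substitute the values from the JSON output of
# `az ad sp create-for-rbac` before running a sync.
export AZURE_TENANT_ID="<tenant from the JSON>"
export AZURE_CLIENT_ID="<appId from the JSON>"
export AZURE_CLIENT_SECRET="<password from the JSON>"
```

Because AZURE_CLIENT_SECRET is a credential, avoid committing these lines to version control or leaving them in shared shell history.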

Authentication with az login #

First, install the Azure CLI (az). Then, log in with the Azure CLI:
az login
You are now authenticated!

Query Examples #

Find all MySQL servers #

SELECT * FROM azure_mysql_servers;

Find storage accounts that are allowing non-HTTPS traffic #

SELECT * FROM azure_storage_accounts WHERE enable_https_traffic_only = false;

Find all expired Key Vault keys #

SELECT * FROM azure_keyvault_vault_keys WHERE attributes_expires <= extract(epoch from now()) * 1000;

List the Memory and vCPUs of all available Azure Compute VM types #

SELECT
  vm.name,
  vcpus.capability_value AS "vCPUs",
  memory.capability_value AS "Memory"
FROM
  azure_compute_skus vm
  LEFT JOIN LATERAL (
    SELECT (caps ->> 'value') AS capability_value
    FROM jsonb_array_elements(vm.capabilities) caps
    WHERE (caps ->> 'name') = 'vCPUs'
  ) vcpus ON true
  LEFT JOIN LATERAL (
    SELECT (caps ->> 'value') AS capability_value
    FROM jsonb_array_elements(vm.capabilities) caps
    WHERE (caps ->> 'name') = 'MemoryGB'
  ) memory ON true
WHERE
  vm.resource_type = 'virtualMachines'
ORDER BY vm.name;
| name                      | vCPUs | Memory |
| Basic_A0                  | 1     | 0.75   |
| Basic_A1                  | 1     | 1.75   |
| Basic_A2                  | 2     | 3.5    |
| Basic_A3                  | 4     | 7      |
| Basic_A4                  | 8     | 14     |
| Standard_A0               | 1     | 0.75   |
| Standard_A1               | 1     | 1.75   |
| Standard_A1_v2            | 1     | 2      |
... (truncated)

Configuration #

CloudQuery Azure Source Plugin Configuration Reference

Example #

This example connects a single Azure subscription to a single destination. The (top level) source spec section is described in the Source Spec Reference.
kind: source
spec:
  # Source spec section
  name: "azure"
  path: "cloudquery/azure"
  registry: "cloudquery"
  version: "v14.2.0"
  destinations: ["postgresql"]
  tables: ["azure_compute_virtual_machines"]
  # Learn more about the configuration options at
  spec:
    # Optional parameters
    # subscriptions: []
    # cloud_name: ""
    # concurrency: 50000
    # discovery_concurrency: 400
    # skip_subscriptions: []
    # normalize_ids: false
    # oidc_token: ""
    # retry_options:
    #   max_retries: 3
    #   try_timeout_seconds: 0
    #   retry_delay_seconds: 4
    #   max_retry_delay_seconds: 60

Azure Spec #

This is the (nested) spec used by the Azure source plugin.
  • subscriptions ([]string) (default: empty. Will use all visible subscriptions)
    Specify which subscriptions to sync data from.
  • cloud_name (string) (default: empty)
    The name of the cloud environment to use. Possible values are AzureCloud, AzureChinaCloud, AzureUSGovernment. See the Azure CLI documentation for more information.
  • concurrency (int) (default: 50000)
    The best-effort maximum number of goroutines to use. Lower this number to reduce memory usage.
  • discovery_concurrency (int) (default: 400)
    During initialization the Azure source plugin discovers all resource groups and enabled resource providers per subscription, to be used later on during the sync process. The plugin runs the discovery process in parallel. This setting controls the maximum number of concurrent requests to the Azure API during discovery. Only accounts with many subscriptions should require modifying this setting, to either lower it to avoid network errors, or to increase it to speed up the discovery process.
  • skip_subscriptions ([]string) (default: empty)
    A list of subscription IDs that CloudQuery will skip when syncing. This is useful when CloudQuery discovers the list of subscription IDs automatically and you want to exclude some of them entirely.
  • normalize_ids (bool) (default: false)
    Enabling this setting forces all id column values to lowercase. This is useful for avoiding case-sensitivity and uniqueness issues with the id primary keys.
  • oidc_token (string) (default: empty)
    An OIDC token can be used to authenticate with Azure instead of AZURE_CLIENT_SECRET. This is useful for Azure AD workload identity federation. When using this option, the AZURE_CLIENT_ID and AZURE_TENANT_ID environment variables must be set.
  • retry_options (RetryOptions) (default: empty)
    Retry options to pass to the Azure Go SDK. See the retry_options section below for details.

retry_options #

  • max_retries (integer) (default: 3)
Described in the Azure Go SDK.
  • try_timeout_seconds (integer) (default: 0)
Disabled by default. Described in the Azure Go SDK.
  • retry_delay_seconds (integer) (default: 4)
Described in the Azure Go SDK.
  • max_retry_delay_seconds (integer) (default: 60)
Described in the Azure Go SDK.
  • status_codes ([]integer) (default: null)
Described in the Azure Go SDK.
The default of null uses the default status codes. An empty value disables retries for HTTP status codes.
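
To show how the nested options fit together, the following sketch writes a source config that limits the sync to one subscription and tunes retries. It reuses the plugin version and table from the example above. The subscription ID is a made-up placeholder, and the concurrency, retry values, and status codes are illustrative choices rather than recommendations.

```shell
# Write a source config to azure.yml in the current directory.
# Note: this overwrites any existing file with that name.
cat > azure.yml <<'EOF'
kind: source
spec:
  name: "azure"
  path: "cloudquery/azure"
  registry: "cloudquery"
  version: "v14.2.0"
  destinations: ["postgresql"]
  tables: ["azure_compute_virtual_machines"]
  spec:
    # Placeholder subscription ID; replace with your own.
    subscriptions: ["00000000-0000-0000-0000-000000000000"]
    # Lower than the 50000 default to reduce memory usage.
    concurrency: 10000
    retry_options:
      max_retries: 5
      retry_delay_seconds: 4
      max_retry_delay_seconds: 120
      # Illustrative list; omit this key to keep the default status codes.
      status_codes: [429, 500, 502, 503, 504]
EOF

# Then run the sync as in the set up process:
# cloudquery sync azure.yml postgresql.yml
```

The heredoc delimiter is quoted ('EOF') so the shell writes the file contents literally, without expanding anything inside it.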

© 2024 CloudQuery, Inc. All rights reserved.