azblob
Official

Azure Blob Storage

This destination plugin lets you sync data from a CloudQuery source to remote Azure Blob Storage in various formats such as CSV, JSON and Parquet.

Publisher: cloudquery
Repository: github.com
Latest version: v4.4.3
Type: Destination
Price: Free

Overview #

Azure Blob Storage Destination Plugin

This destination plugin lets you sync data from a CloudQuery source to remote Azure Blob Storage in various formats such as CSV, JSON and Parquet.

Authentication #

The plugin needs to be authenticated with your Azure account in order to upload files to your storage account.
You can either authenticate with az login (when running locally), or by using a "service principal" and exporting environment variables (appropriate for automated deployments).
You can find out more about authentication with Azure at Azure's documentation for the Go SDK.

Example #

This example configures an Azure Blob Storage destination that creates CSV files in https://cqdestinationazblob.blob.core.windows.net/test/path/to/files.
The (top level) spec section is described in the Destination Spec Reference.
kind: destination
spec:
  name: "azblob"
  path: "cloudquery/azblob"
  registry: "cloudquery"
  version: "v4.4.3"
  spec:
    storage_account: "cqdestinationazblob"
    container: "test"
    path: "path/to/files"

    format: "csv" # options: parquet, json, csv
    format_spec:
      # CSV specific parameters:
      # delimiter: ","
      # skip_header: false
      # Parquet specific parameters:
      # version: "v2Latest"
      # root_repetition: "repeated"
      # max_row_group_length: 134217728 # 128 * 1024 * 1024

    # Optional parameters
    # compression: "" # options: gzip
    # no_rotate: false
    # batch_size: 10000
    # batch_size_bytes: 52428800 # 50 MiB
    # batch_timeout: 30s
The Azure Blob destination utilizes batching, and supports batch_size, batch_size_bytes and batch_timeout options (see below).

Azure Blob Spec #

This is the (nested) spec used by the Azure Blob Storage destination plugin.
  • storage_account (string) (required)
    Storage account to sync the files to.
  • container (string) (required)
    Storage container inside the storage account to sync the files to.
  • path (string) (required)
    Path to where the files will be uploaded in the above container.
  • no_rotate (boolean) (optional) (default: false)
    If set to true, the plugin will write to one file per table. Otherwise, for every batch a new file will be created with a different .<UUID> suffix.
  • format (string) (required)
    Format of the output file. Supported values are csv, json and parquet.
  • format_spec (format_spec) (optional)
    Optional parameters to change the format of the file.
  • compression (string) (optional) (default: empty)
    Compression algorithm to use. Supported values are empty or gzip. Not supported for parquet format.
  • batch_size (integer) (optional) (default: 10000)
    Number of records to write before starting a new object.
  • batch_size_bytes (integer) (optional) (default: 52428800 (50 MiB))
    Number of bytes (as Arrow buffer size) to write before starting a new object.
  • batch_timeout (duration) (optional) (default: 30s (30 seconds))
    Maximum interval between batch writes.
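
As an illustration of how these options combine, here is a sketch of a nested spec that writes gzip-compressed JSON and starts a new object more often than the defaults. The account, container and path values are the placeholders from the example above, and the batch values are arbitrary choices for illustration, not recommendations.

```yaml
# Nested spec only; it sits under the top-level destination spec.
spec:
  storage_account: "cqdestinationazblob" # placeholder from the example above
  container: "test"
  path: "path/to/files"
  format: "json"
  compression: "gzip" # not supported when format is parquet
  batch_size: 5000 # illustrative; default is 10000
  batch_size_bytes: 10485760 # 10 MiB, illustrative; default is 52428800 (50 MiB)
  batch_timeout: 10s # illustrative; default is 30s
```

Because no_rotate is left at its default of false, every batch is written to a new file with its own .<UUID> suffix.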

format_spec #

CSV
  • delimiter (string) (optional) (default: ,)
    Delimiter to use in the CSV file.
  • skip_header (boolean) (optional) (default: false)
    If set to true, the CSV file will not contain a header row as the first row.
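
For instance, a fragment of the nested spec that produces semicolon-separated files without a header row might look like the sketch below. The delimiter choice is only an example.

```yaml
# Fragment of the nested spec; other required fields are omitted here.
format: "csv"
format_spec:
  delimiter: ";" # default is ","
  skip_header: true # default is false, which writes a header row first
```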
JSON
Reserved for future use.
Parquet
  • version (string) (optional) (default: v2Latest)
    Parquet format version to use. Supported values are v1.0, v2.4, v2.6 and v2Latest. v2Latest is an alias for the latest version available in the Parquet library which is currently v2.6.
    Useful when the reader consuming the Parquet files does not support the latest version.
  • root_repetition (string) (optional) (default: repeated)
    Repetition option to use for the root node. Supported values are undefined, required, optional and repeated.
    Some Parquet readers require a specific root repetition option to be able to read the file. For example, importing Parquet files into Snowflake requires the root repetition to be undefined.
  • max_row_group_length (integer) (optional) (default: 134217728 (= 128 * 1024 * 1024))
    The maximum number of rows in a single row group. Use a lower number to reduce memory usage when reading the Parquet files, and a higher number to increase the efficiency of reading the Parquet files.
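
As a sketch of the Parquet options, the following nested spec targets a reader such as Snowflake by setting the root repetition to undefined. The storage values are the placeholders from the example above, and the row group length is an arbitrary lower value chosen for illustration.

```yaml
# Nested spec only; it sits under the top-level destination spec.
spec:
  storage_account: "cqdestinationazblob" # placeholder from the example above
  container: "test"
  path: "path/to/files"
  format: "parquet"
  format_spec:
    # version is omitted, so the default (v2Latest) applies
    root_repetition: "undefined" # default is "repeated"
    max_row_group_length: 1048576 # 1024 * 1024 rows, illustrative; default is 134217728
```

The compression option is left unset here because it is not supported for the parquet format.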


Licenses #

The following tools / packages are used in this plugin:
Name | License
github.com/Azure/azure-sdk-for-go/sdk/azcore | MIT
github.com/Azure/azure-sdk-for-go/sdk/azidentity | MIT
github.com/Azure/azure-sdk-for-go/sdk/internal | MIT
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob | MIT
github.com/AzureAD/microsoft-authentication-library-for-go/apps | MIT
github.com/JohnCGriffin/overflow | MIT
github.com/adrg/xdg | MIT
github.com/andybalholm/brotli | MIT
github.com/apache/arrow/go/v13 | Apache-2.0
github.com/apache/arrow-go/v18 | Apache-2.0
github.com/apache/thrift/lib/go/thrift | Apache-2.0
github.com/apapsch/go-jsonmerge/v2 | MIT
github.com/aws/aws-sdk-go-v2 | Apache-2.0
github.com/aws/aws-sdk-go-v2/config | Apache-2.0
github.com/aws/aws-sdk-go-v2/credentials | Apache-2.0
github.com/aws/aws-sdk-go-v2/feature/ec2/imds | Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/configsources | Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 | Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/ini | Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/sync/singleflight | BSD-3-Clause
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/licensemanager | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/marketplacemetering | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/sso | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/ssooidc | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/sts | Apache-2.0
github.com/aws/smithy-go | Apache-2.0
github.com/aws/smithy-go/internal/sync/singleflight | BSD-3-Clause
github.com/bahlo/generic-list-go | BSD-3-Clause
github.com/buger/jsonparser | MIT
github.com/cenkalti/backoff/v4 | MIT
github.com/cloudquery/cloudquery-api-go | MPL-2.0
github.com/cloudquery/codegen/jsonschema | MPL-2.0
github.com/cloudquery/plugin-pb-go | MPL-2.0
github.com/cloudquery/plugin-sdk/v2/internal/glob | MIT
github.com/cloudquery/plugin-sdk/v2/schema | MIT
github.com/cloudquery/plugin-sdk/v2/types | MPL-2.0
github.com/cloudquery/plugin-sdk/v4 | MPL-2.0
github.com/cloudquery/plugin-sdk/v4/glob | MIT
github.com/cloudquery/plugin-sdk/v4/scalar | MIT
github.com/davecgh/go-spew/spew | ISC
github.com/ghodss/yaml | MIT
github.com/go-logr/logr | Apache-2.0
github.com/go-logr/stdr | Apache-2.0
github.com/goccy/go-json | MIT
github.com/golang-jwt/jwt/v5 | MIT
github.com/golang/snappy | BSD-3-Clause
github.com/google/flatbuffers/go | Apache-2.0
github.com/google/uuid | BSD-3-Clause
github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors | Apache-2.0
github.com/grpc-ecosystem/grpc-gateway/v2 | BSD-3-Clause
github.com/hashicorp/go-cleanhttp | MPL-2.0
github.com/hashicorp/go-retryablehttp | MPL-2.0
github.com/huandu/xstrings | MIT
github.com/invopop/jsonschema | MIT
github.com/klauspost/compress | Apache-2.0
github.com/klauspost/compress/internal/snapref | BSD-3-Clause
github.com/klauspost/compress/zstd/internal/xxhash | MIT
github.com/klauspost/cpuid/v2 | MIT
github.com/kylelemons/godebug | Apache-2.0
github.com/mailru/easyjson | MIT
github.com/mattn/go-colorable | MIT
github.com/mattn/go-isatty | MIT
github.com/oapi-codegen/runtime | Apache-2.0
github.com/pierrec/lz4/v4 | BSD-3-Clause
github.com/pkg/browser | BSD-2-Clause
github.com/pmezard/go-difflib/difflib | BSD-3-Clause
github.com/rs/zerolog | MIT
github.com/santhosh-tekuri/jsonschema/v6 | Apache-2.0
github.com/spf13/cobra | Apache-2.0
github.com/spf13/pflag | BSD-3-Clause
github.com/stretchr/testify | MIT
github.com/thoas/go-funk | MIT
github.com/wk8/go-ordered-map/v2 | Apache-2.0
github.com/zeebo/xxh3 | BSD-2-Clause
go.opentelemetry.io/otel | Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp | Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp | Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace | Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp | Apache-2.0
go.opentelemetry.io/otel/log | Apache-2.0
go.opentelemetry.io/otel/metric | Apache-2.0
go.opentelemetry.io/otel/sdk | Apache-2.0
go.opentelemetry.io/otel/sdk/log | Apache-2.0
go.opentelemetry.io/otel/sdk/metric | Apache-2.0
go.opentelemetry.io/otel/trace | Apache-2.0
go.opentelemetry.io/proto/otlp | Apache-2.0
golang.org/x/crypto/pkcs12 | BSD-3-Clause
golang.org/x/exp | BSD-3-Clause
golang.org/x/net | BSD-3-Clause
golang.org/x/sync/errgroup | BSD-3-Clause
golang.org/x/sys | BSD-3-Clause
golang.org/x/text | BSD-3-Clause
golang.org/x/xerrors | BSD-3-Clause
google.golang.org/genproto/googleapis/api/httpbody | Apache-2.0
google.golang.org/genproto/googleapis/rpc/status | Apache-2.0
google.golang.org/grpc | Apache-2.0
google.golang.org/protobuf | BSD-3-Clause
gopkg.in/yaml.v2 | Apache-2.0
gopkg.in/yaml.v3 | MIT



© 2024 CloudQuery, Inc. All rights reserved.
