Official

Elasticsearch destination integration documentation

The Elasticsearch plugin syncs data from any CloudQuery source plugin(s) to an Elasticsearch cluster

Publisher

cloudquery

Repository

github.com

Latest version

v3.5.26

Type

Destination


Overview

Elasticsearch Destination Plugin

The Elasticsearch plugin syncs data from any CloudQuery source plugin(s) to an Elasticsearch cluster.

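Example config

The following config syncs data to an Elastic Cloud deployment. To sync to a self-hosted cluster instead, such as one running on localhost:9200, comment out cloud_id and api_key and uncomment addresses along with the credentials you need: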

kind: destination
spec:
  name: elasticsearch
  path: cloudquery/elasticsearch
  registry: cloudquery
  version: "v3.5.26"
  write_mode: "overwrite-delete-stale"
  spec:
    # Elastic Cloud configuration parameters
    cloud_id: "${ELASTICSEARCH_CLOUD_ID}"
    api_key: "${ELASTICSEARCH_API_KEY}"

    # Self-hosted Elasticsearch configuration parameters
    # addresses: ["http://localhost:9200"]
    # username: ""
    # password: ""
    # service_token: ""
    # certificate_fingerprint: ""
    # ca_cert: ""

    # Optional parameters
    # concurrency: 5 # default: number of CPUs
    # batch_size: 1000
    # batch_size_bytes: 5242880 # 5 MiB

The Elasticsearch destination batches writes and supports the batch_size and batch_size_bytes options.
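To illustrate how the two batching limits can interact, here is a minimal sketch in Python. It is not the plugin's implementation (the plugin is written in Go); the function name `batches` and the choice to measure item size as the length of its JSON encoding are assumptions made for illustration only.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def batches(
    items: Iterable[Dict[str, Any]],
    batch_size: int = 1000,
    batch_size_bytes: int = 5 * 1024 * 1024,  # 5 MiB, the documented default
) -> Iterator[List[Dict[str, Any]]]:
    """Group items so that no batch exceeds either limit (hypothetical helper)."""
    current: List[Dict[str, Any]] = []
    current_bytes = 0
    for item in items:
        # Assumption: size is approximated by the UTF-8 JSON encoding of the item.
        item_bytes = len(json.dumps(item).encode("utf-8"))
        would_overflow = (
            len(current) >= batch_size
            or current_bytes + item_bytes > batch_size_bytes
        )
        # Flush before adding an item that would push the batch past a limit.
        # A single oversized item still goes out alone in its own batch.
        if current and would_overflow:
            yield current
            current, current_bytes = [], 0
        current.append(item)
        current_bytes += item_bytes
    if current:
        yield current


rows = [{"id": i} for i in range(5)]
for batch in batches(rows, batch_size=2):
    print(len(batch), batch)
```

In this sketch a batch is flushed as soon as either limit would be exceeded, so whichever of batch_size and batch_size_bytes is reached first decides where the batch ends.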

It supports append, overwrite and overwrite-delete-stale write modes. The default write mode is overwrite-delete-stale.

Elasticsearch Spec

This is the spec used by the Elasticsearch destination plugin.

addresses ([]string) (optional) (default: ["http://localhost:9200"])
A list of Elasticsearch nodes to use. Mutually exclusive with cloud_id.
username (string) (optional)
Username for HTTP Basic Authentication.
password (string) (optional)
Password for HTTP Basic Authentication.
cloud_id (string) (optional) (example: MyDeployment:abcdefgh)
Endpoint for the Elasticsearch Service (https://elastic.co/cloud). Mutually exclusive with addresses.
api_key (string) (optional)
Base64-encoded token for authorization; if set, overrides username/password and service token.
service_token (string) (optional)
Service token for authorization; if set, overrides username/password.
certificate_fingerprint (string) (optional)
SHA256 hex fingerprint given by Elasticsearch on first launch.
ca_cert (string) (optional)
PEM-encoded certificate authorities. When set, an empty certificate pool will be created, and the certificates will be appended to it. See file variable substitution for how to read this value from a file.
concurrency (integer) (optional) (default: number of CPUs)
Number of concurrent worker goroutines to use for indexing.
batch_size (integer) (optional) (default: 1000)
Maximum number of items that may be grouped together to be written in a single write.
batch_size_bytes (integer) (optional) (default: 5242880 (5 MiB))
Maximum size of items that may be grouped together to be written in a single write.
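The spec above describes two rules that are easy to get wrong: addresses and cloud_id are mutually exclusive, and the authentication options override one another in a fixed order. The following Python sketch restates those rules as code. It is only an illustration under the assumption that the spec is available as a plain dictionary; `validate_spec` and `effective_auth` are hypothetical names, not functions provided by the plugin.

```python
from typing import Any, Dict


def validate_spec(spec: Dict[str, Any]) -> None:
    """Reject specs that set both connection options (hypothetical helper)."""
    if spec.get("addresses") and spec.get("cloud_id"):
        raise ValueError("addresses and cloud_id are mutually exclusive")


def effective_auth(spec: Dict[str, Any]) -> str:
    """Return which credential would take effect, per the documented order.

    api_key overrides service_token and username/password;
    service_token overrides username/password.
    """
    if spec.get("api_key"):
        return "api_key"
    if spec.get("service_token"):
        return "service_token"
    if spec.get("username") or spec.get("password"):
        return "basic"
    return "none"


spec = {
    "cloud_id": "MyDeployment:abcdefgh",
    "api_key": "example-key",      # placeholder value
    "username": "elastic",         # ignored because api_key is set
    "password": "example-secret",  # placeholder value
}
validate_spec(spec)
print(effective_auth(spec))
```

A check like this can be run against a config before a sync to confirm which credential is expected to be used.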

Index Template Creation

The Elasticsearch destination creates an index template for every table during the migration step. Using the generated index templates is recommended, because they automatically create indexes with the correct mappings for each table. To skip index template creation (or use your own templates), pass the --no-migrate option when running cloudquery sync.

Index Naming

Index names will be formatted according to the selected write mode:

append: indexes will be named using the format <table_name>-<YYYY-MM-DD>. In other words, a new index will be created every day the table is synced. Entries will never be overwritten.
overwrite: indexes will be named using the format <table_name>. Objects with duplicate primary keys will be overwritten.
overwrite-delete-stale: indexes will be named using the format <table_name>. Objects with duplicate primary keys will be overwritten, and any objects that are not present in the current sync will be deleted.

Index templates will also be created such that they match the index names generated by the selected write mode.
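The naming rules above can be summarized in a short Python sketch. This is not the plugin's code; `index_name` is a hypothetical helper, and the sketch assumes the date is supplied by the caller, since the section does not state which time zone the plugin uses when it formats the date.

```python
from datetime import date
from typing import Optional


def index_name(table_name: str, write_mode: str, sync_date: Optional[date] = None) -> str:
    """Return the index name for a table under the given write mode (illustrative)."""
    if write_mode == "append":
        # One index per table per day: <table_name>-<YYYY-MM-DD>
        day = sync_date or date.today()
        return f"{table_name}-{day.isoformat()}"
    if write_mode in ("overwrite", "overwrite-delete-stale"):
        # A single index per table: <table_name>
        return table_name
    raise ValueError(f"unknown write mode: {write_mode}")


print(index_name("aws_ec2_instances", "append", date(2024, 1, 15)))
print(index_name("aws_ec2_instances", "overwrite"))
```

With these inputs the append mode yields aws_ec2_instances-2024-01-15, while both overwrite modes yield aws_ec2_instances.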

Querying From Kibana

To query data from Kibana, create data views (previously known as "index patterns"). To query a specific table, use an index pattern in the format <table_name>-*. For example, for a table named aws_ec2_instances, create a data view with the index pattern aws_ec2_instances-*. Elasticsearch and Kibana can also query across all data: for the aws source plugin, for example, an index pattern of aws_* allows queries across all tables synced by that plugin.
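To check which indexes a data view pattern would cover before creating it, the wildcard matching can be approximated locally. The Python sketch below uses the standard library's fnmatch module as a stand-in for Elasticsearch's `*` wildcard; it does not talk to a cluster, and the index names listed are made-up examples.

```python
from fnmatch import fnmatchcase
from typing import List


def matching_indexes(pattern: str, index_names: List[str]) -> List[str]:
    """Return the index names matched by a wildcard pattern (local approximation)."""
    return [name for name in index_names if fnmatchcase(name, pattern)]


# Hypothetical index names, as they might look after syncing in append mode.
indexes = [
    "aws_ec2_instances-2024-01-15",
    "aws_ec2_instances-2024-01-16",
    "aws_s3_buckets-2024-01-15",
    "gcp_compute_instances-2024-01-15",
]

print(matching_indexes("aws_ec2_instances-*", indexes))  # one table, all days
print(matching_indexes("aws_*", indexes))                # every aws table
```

Note that a pattern ending in -* requires the hyphen to be present, so an index named exactly aws_ec2_instances would be matched by aws_* but not by aws_ec2_instances-*.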

Underlying library

We use the official go-elasticsearch package. It is tested against Elasticsearch 8.6.0. Please open an issue if you encounter any problems with this (or another) version.

Types

Elasticsearch Types

The Elasticsearch destination (v2.0.0 and later) supports most Apache Arrow types. The following table shows the supported types and how they are mapped to Elasticsearch field data types.

Arrow Column Type	Supported?	Elasticsearch Type
Binary	✅ Yes	`binary`
Boolean	✅ Yes	`boolean`
Date32	✅ Yes	`date` with format `yyyy-MM-dd`
Date64	✅ Yes	`date` with format `yyyy-MM-dd`
Decimal	✅ Yes	`text`
Dense Union	✅ Yes	`text`
Dictionary	✅ Yes	`text`
Duration[ms]	✅ Yes	`text`
Duration[ns]	✅ Yes	`text`
Duration[s]	✅ Yes	`text`
Duration[us]	✅ Yes	`text`
Fixed Size List	✅ Yes	Uses type from list elements
Float16	✅ Yes	`half_float`
Float32	✅ Yes	`float`
Float64	✅ Yes	`double`
Inet	✅ Yes	`text`
Int8	✅ Yes	`byte`
Int16	✅ Yes	`short`
Int32	✅ Yes	`integer`
Int64	✅ Yes	`long`
Interval[DayTime]	✅ Yes	`object`
Interval[MonthDayNano]	✅ Yes	`object`
Interval[Month]	✅ Yes	`object`
JSON	✅ Yes	`text`
Large Binary	✅ Yes	`byte`
Large List	✅ Yes	Uses type from list elements
Large String	✅ Yes	`text`
List	✅ Yes	Uses type from list elements
MAC	✅ Yes	`text`
Map	✅ Yes	`object` with `key` and `value` fields
String	✅ Yes	`text`
Struct	✅ Yes	`object`
Time32[s]	✅ Yes	`date` with format `HH:mm:ss`
Time32[ms]	✅ Yes	`date` with format `HH:mm:ss.SSS`
Time64[us]	✅ Yes	`text`
Time64[ns]	✅ Yes	`text`
Timestamp[s]	✅ Yes	`date` with format `2006-01-02T15:04:05Z`
Timestamp[ms]	✅ Yes	`date` with format `2006-01-02T15:04:05.999Z`
Timestamp[us]	✅ Yes	`date` with format `2006-01-02T15:04:05.999999Z`
Timestamp[ns]	✅ Yes	`date_nanos` with format `2006-01-02T15:04:05.999999999Z`
UUID	✅ Yes	`text`
Uint8	✅ Yes	`unsigned_long`
Uint16	✅ Yes	`unsigned_long`
Uint32	✅ Yes	`unsigned_long`
Uint64	✅ Yes	`unsigned_long`
Union	✅ Yes	`text`
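As a rough illustration of how the table translates into an Elasticsearch mapping, the Python sketch below builds the properties section for a few column types. It covers only a subset of the rows above, the column names are invented, and `properties_for` is a hypothetical helper rather than anything the plugin exposes; the plugin generates its own index templates during migration.

```python
from typing import Dict, List, Tuple

# Subset of the type table above, written as Elasticsearch field definitions.
ARROW_TO_ES: Dict[str, Dict[str, str]] = {
    "Boolean": {"type": "boolean"},
    "Date32": {"type": "date", "format": "yyyy-MM-dd"},
    "Float64": {"type": "double"},
    "Int64": {"type": "long"},
    "String": {"type": "text"},
    "Uint64": {"type": "unsigned_long"},
    "UUID": {"type": "text"},
}


def properties_for(columns: List[Tuple[str, str]]) -> Dict[str, Dict[str, str]]:
    """Build a mapping 'properties' dict from (column name, Arrow type) pairs."""
    properties: Dict[str, Dict[str, str]] = {}
    for name, arrow_type in columns:
        if arrow_type not in ARROW_TO_ES:
            raise KeyError(f"type not covered by this sketch: {arrow_type}")
        # Copy so callers cannot mutate the shared lookup table.
        properties[name] = dict(ARROW_TO_ES[arrow_type])
    return properties


# Hypothetical columns for illustration.
columns = [
    ("instance_id", "String"),
    ("launch_date", "Date32"),
    ("cpu_count", "Int64"),
    ("is_running", "Boolean"),
]
print(properties_for(columns))
```

Comparing output like this against the index template the plugin generates is one way to confirm which field types a table's columns end up with.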

Licenses

The following tools / packages are used in this plugin:

Name	License
github.com/adrg/xdg	MIT
github.com/apache/arrow/go/v13	Apache-2.0
github.com/apache/arrow-go/v18	Apache-2.0
github.com/apapsch/go-jsonmerge/v2	MIT
github.com/aws/aws-sdk-go-v2	Apache-2.0
github.com/aws/aws-sdk-go-v2/config	Apache-2.0
github.com/aws/aws-sdk-go-v2/credentials	Apache-2.0
github.com/aws/aws-sdk-go-v2/feature/ec2/imds	Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/configsources	Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2	Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/ini	Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/sync/singleflight	BSD-3-Clause
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding	Apache-2.0
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url	Apache-2.0
github.com/aws/aws-sdk-go-v2/service/licensemanager	Apache-2.0
github.com/aws/aws-sdk-go-v2/service/marketplacemetering	Apache-2.0
github.com/aws/aws-sdk-go-v2/service/sso	Apache-2.0
github.com/aws/aws-sdk-go-v2/service/ssooidc	Apache-2.0
github.com/aws/aws-sdk-go-v2/service/sts	Apache-2.0
github.com/aws/smithy-go	Apache-2.0
github.com/aws/smithy-go/internal/sync/singleflight	BSD-3-Clause
github.com/bahlo/generic-list-go	BSD-3-Clause
github.com/buger/jsonparser	MIT
github.com/cenkalti/backoff/v4	MIT
github.com/cloudquery/cloudquery-api-go	MPL-2.0
github.com/cloudquery/plugin-pb-go	MPL-2.0
github.com/cloudquery/plugin-sdk/v2/internal/glob	MIT
github.com/cloudquery/plugin-sdk/v2/schema	MIT
github.com/cloudquery/plugin-sdk/v2/types	MPL-2.0
github.com/cloudquery/plugin-sdk/v4	MPL-2.0
github.com/cloudquery/plugin-sdk/v4/glob	MIT
github.com/cloudquery/plugin-sdk/v4/scalar	MIT
github.com/davecgh/go-spew/spew	ISC
github.com/elastic/elastic-transport-go/v8/elastictransport	Apache-2.0
github.com/elastic/go-elasticsearch/v8	Apache-2.0
github.com/elastic/go-elasticsearch/v8/typedapi/types	Apache-2.0
github.com/elastic/go-elasticsearch/v8/typedapi/types/enums/licensestatus	Apache-2.0
github.com/elastic/go-elasticsearch/v8/typedapi/types/enums/licensetype	Apache-2.0
github.com/ghodss/yaml	MIT
github.com/go-logr/logr	Apache-2.0
github.com/go-logr/stdr	Apache-2.0
github.com/goccy/go-json	MIT
github.com/google/flatbuffers/go	Apache-2.0
github.com/google/uuid	BSD-3-Clause
github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors	Apache-2.0
github.com/grpc-ecosystem/grpc-gateway/v2	BSD-3-Clause
github.com/hashicorp/go-cleanhttp	MPL-2.0
github.com/hashicorp/go-retryablehttp	MPL-2.0
github.com/huandu/xstrings	MIT
github.com/invopop/jsonschema	MIT
github.com/klauspost/compress	Apache-2.0
github.com/klauspost/compress/internal/snapref	BSD-3-Clause
github.com/klauspost/compress/zstd/internal/xxhash	MIT
github.com/mailru/easyjson	MIT
github.com/mattn/go-colorable	MIT
github.com/mattn/go-isatty	MIT
github.com/oapi-codegen/runtime	Apache-2.0
github.com/pierrec/lz4/v4	BSD-3-Clause
github.com/pmezard/go-difflib/difflib	BSD-3-Clause
github.com/rs/zerolog	MIT
github.com/santhosh-tekuri/jsonschema/v6	Apache-2.0
github.com/segmentio/fasthash/fnv1a	MIT
github.com/spf13/cobra	Apache-2.0
github.com/spf13/pflag	BSD-3-Clause
github.com/stretchr/testify	MIT
github.com/thoas/go-funk	MIT
github.com/wk8/go-ordered-map/v2	Apache-2.0
github.com/zeebo/xxh3	BSD-2-Clause
go.opentelemetry.io/otel	Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp	Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp	Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace	Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp	Apache-2.0
go.opentelemetry.io/otel/log	Apache-2.0
go.opentelemetry.io/otel/metric	Apache-2.0
go.opentelemetry.io/otel/sdk	Apache-2.0
go.opentelemetry.io/otel/sdk/log	Apache-2.0
go.opentelemetry.io/otel/sdk/metric	Apache-2.0
go.opentelemetry.io/otel/trace	Apache-2.0
go.opentelemetry.io/proto/otlp	Apache-2.0
golang.org/x/exp	BSD-3-Clause
golang.org/x/net	BSD-3-Clause
golang.org/x/sync/errgroup	BSD-3-Clause
golang.org/x/sys	BSD-3-Clause
golang.org/x/text	BSD-3-Clause
golang.org/x/xerrors	BSD-3-Clause
google.golang.org/genproto/googleapis/api/httpbody	Apache-2.0
google.golang.org/genproto/googleapis/rpc/status	Apache-2.0
google.golang.org/grpc	Apache-2.0
google.golang.org/protobuf	BSD-3-Clause
gopkg.in/yaml.v2	Apache-2.0
gopkg.in/yaml.v3	MIT
