ClickHouse
Overview #
ClickHouse destination plugin
This destination plugin lets you sync data from a CloudQuery source to a ClickHouse database.
It supports the append write mode only. The write mode must be set explicitly through the write_mode option.
Supported database versions: >= 22.1.2
Configuration #
Example #
kind: destination
spec:
  name: "clickhouse"
  path: "cloudquery/clickhouse"
  registry: "cloudquery"
  version: "v6.0.3"
  write_mode: "append"
  # Learn more about the configuration options at https://cql.ink/clickhouse_destination
  spec:
    connection_string: "clickhouse://${CH_USER}:${CH_PASSWORD}@localhost:9000/${CH_DATABASE}"
    # Optional parameters
    # cluster: ""
    # ca_cert: ""
    # engine:
    #   name: MergeTree
    #   parameters: []
    #
    # batch_size: 10000
    # batch_size_bytes: 5242880 # 5 MiB
    # batch_timeout: 20s
This example configures a ClickHouse instance located at localhost:9000.
It expects the CH_USER, CH_PASSWORD and CH_DATABASE environment variables to be set.
The (top level) spec section is described in the Destination Spec Reference.
ClickHouse spec #
This is the (nested) spec used by the ClickHouse destination plugin.
- connection_string (string) (required)
  Connection string to connect to the database. See SDK documentation for more details.
  Example connection string: "clickhouse://username:password@host1:9000,host2:9000/database?dial_timeout=200ms&max_execution_time=60"
- cluster (string) (optional) (default: not used)
  Cluster name to be used for distributed DDL. If the value is empty, DDL operations will affect only the server the plugin is connected to.
- ca_cert (string) (optional) (default: not used)
  PEM-encoded certificate authorities. When set, a certificate pool will be created by appending the certificates to the system pool.
  See file variable substitution for how to read this value from a file.
- engine (optional, engine)
  Engine to be used for tables. Only the *MergeTree family is supported at the moment.
- batch_size (integer) (optional) (default: 10000)
  Maximum number of items that may be grouped together to be written in a single write.
- batch_size_bytes (integer) (optional) (default: 5242880 (= 5 MiB))
  Maximum size of items that may be grouped together to be written in a single write.
- batch_timeout (duration) (optional) (default: 20s)
  Maximum interval between batch writes.
- partition (optional, partitioning) (default: no partitioning)
  Partitioning strategy to be used for tables (i.e. the PARTITION BY clause in CREATE TABLE statements).
- order (optional, ordering) (default: use existing primary key)
  Ordering strategy to be used for tables (i.e. the ORDER BY clause in CREATE TABLE statements).
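As an illustration of how the optional settings above fit into the nested spec, here is a minimal sketch. The cluster name my_cluster and the batch values are hypothetical and chosen only for demonstration; they are not recommendations.

```yaml
kind: destination
spec:
  name: "clickhouse"
  path: "cloudquery/clickhouse"
  registry: "cloudquery"
  version: "v6.0.3"
  write_mode: "append"
  spec:
    connection_string: "clickhouse://${CH_USER}:${CH_PASSWORD}@localhost:9000/${CH_DATABASE}"
    # Hypothetical cluster name: with this set, DDL is issued as distributed DDL
    # instead of affecting only the connected server.
    cluster: "my_cluster"
    # Illustrative values only: smaller batches, flushed more often than the defaults.
    batch_size: 5000
    batch_size_bytes: 2097152 # 2 MiB
    batch_timeout: 10s
```

A batch is written once any of the three limits is reached, so lowering batch_timeout mainly affects tables that receive rows slowly.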
ClickHouse table engine
This option lets you specify a custom table engine.
- name (string) (required)
  Name of the table engine. Only the *MergeTree family is supported at the moment.
- parameters (array of parameters) (optional) (default: empty)
  Engine parameters. Currently, no restrictions are imposed on the parameter types.
kind: destination
spec:
  name: "clickhouse"
  path: "cloudquery/clickhouse"
  registry: "cloudquery"
  version: "v6.0.3"
  write_mode: "append"
  spec:
    connection_string: "clickhouse://${CH_USER}:${CH_PASSWORD}@localhost:9000/${CH_DATABASE}"
    engine:
      name: ReplicatedMergeTree
      parameters:
        - "/clickhouse/tables/{shard}/{database}/{table}"
        - "{replica}"
Partitioning
This option lets you specify a partitioning strategy for tables. It is an array of objects.
Each object has the following fields:
- tables (array of strings) (optional) (default: ["*"])
  List of glob patterns to match table names against. Follows the same rules as the top-level spec tables option.
- skip_tables (array of strings) (optional) (default: empty)
  List of glob patterns to skip matching table names against. Follows the same rules as the top-level spec skip_tables option.
- partition_by (string) (required)
  Partitioning strategy to use, e.g. toYYYYMM(_cq_sync_time). The string is passed as is after the "PARTITION BY" clause, with no validation or quoting. An unset partition_by is not valid.
If a table matches both a pattern in tables and a pattern in skip_tables, the table will be skipped.
Partition strategy table patterns should be disjoint sets: if a table matches two partition strategies, an error will be raised at runtime.
Example:
partition:
  - tables: ["*"]
    skip_tables: ["special_partition_table", "non_partitioned_table"]
    partition_by: "toYYYYMM(_cq_sync_time)"
  - tables: ["special_partition_table"]
    partition_by: "toYYYYMMDD(_cq_sync_time)"
Ordering
This option lets you specify custom ORDER BY clauses for tables or groups of tables. It is an array of objects.
Each object has the following fields:
- tables (array of strings) (optional) (default: ["*"])
  List of glob patterns to match table names against. Follows the same rules as the top-level spec tables option.
- skip_tables (array of strings) (optional) (default: empty)
  List of glob patterns to skip matching table names against. Follows the same rules as the top-level spec skip_tables option.
- order_by (array of strings) (required)
  Sort key to use. The strings are passed as is after the "ORDER BY" clause, with no validation or quoting.
If a table matches both a pattern in tables and a pattern in skip_tables, the table will be skipped.
Ordering strategy table patterns should be disjoint sets: if a table matches two ordering strategies, an error will be raised at runtime.
Example:
order:
  - tables: ["aws_ec2_instances"]
    order_by:
      - "`account_id`"
      - "`region`"
      - "toYYYYMM(`_cq_sync_time`) DESC"
      - "`_cq_id`"
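Since partition and order are both options of the nested ClickHouse spec, they can be combined in one destination configuration. The following is a minimal sketch under that assumption; the table and column names are reused from the examples above and are illustrative only.

```yaml
kind: destination
spec:
  name: "clickhouse"
  path: "cloudquery/clickhouse"
  registry: "cloudquery"
  version: "v6.0.3"
  write_mode: "append"
  spec:
    connection_string: "clickhouse://${CH_USER}:${CH_PASSWORD}@localhost:9000/${CH_DATABASE}"
    # Monthly partitions for every table (a single strategy, so patterns cannot overlap).
    partition:
      - tables: ["*"]
        partition_by: "toYYYYMM(_cq_sync_time)"
    # Custom sort key for one table; other tables keep their existing primary key.
    order:
      - tables: ["aws_ec2_instances"]
        order_by:
          - "`account_id`"
          - "`region`"
          - "`_cq_id`"
```

Because partition_by and order_by strings are passed to ClickHouse without validation or quoting, it is worth checking them against a test database before running a full sync.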
Connecting to ClickHouse Cloud #
To connect to ClickHouse Cloud, set the secure=true parameter, use default as the username, and use port 9440.
Use a connection string similar to:
connection_string: "clickhouse://default:${CH_PASSWORD}@<your-server-id>.<region>.<provider>.clickhouse.cloud:9440/${CH_DATABASE}?secure=true"
See Quick Start: Using the ClickHouse Client for more details.
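For context, the connection string above would sit in a complete destination configuration roughly as follows. This is a sketch that reuses the plugin version and environment variables from the earlier examples; the host placeholders must be replaced with the values for your own ClickHouse Cloud service.

```yaml
kind: destination
spec:
  name: "clickhouse"
  path: "cloudquery/clickhouse"
  registry: "cloudquery"
  version: "v6.0.3"
  write_mode: "append"
  spec:
    # Placeholders in angle brackets are not filled in here; substitute your service's hostname parts.
    connection_string: "clickhouse://default:${CH_PASSWORD}@<your-server-id>.<region>.<provider>.clickhouse.cloud:9440/${CH_DATABASE}?secure=true"
```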
Verbose logging for debug
The ClickHouse destination can be run in debug mode.
To enable it, pass the debug=true option in connection_string.
See SDK documentation for more details.
Note: This uses the SDK's built-in logging and might output data and sensitive information to logs.
Do not use it in a production environment.
kind: destination
spec:
  name: "clickhouse"
  path: "cloudquery/clickhouse"
  registry: "cloudquery"
  version: "v6.0.3"
  write_mode: "append"
  spec:
    connection_string: "clickhouse://${CH_USER}:${CH_PASSWORD}@localhost:9000/${CH_DATABASE}?debug=true"
Types #
Apache Arrow type conversion #
The ClickHouse destination plugin supports most Apache Arrow types.
It uses the same approach as documented in the ClickHouse reference.
The following table shows the supported types and how they are mapped to ClickHouse data types.
Arrow Column Type | ClickHouse Type |
---|---|
Binary | String |
Binary View | String |
Boolean | Bool |
Date32 | Date32 |
Date64 | DateTime |
Decimal128 (Decimal) | Decimal |
Decimal256 | Decimal |
Fixed Size Binary | FixedString |
Fixed Size List | Array |
Float16 | Float32 |
Float32 | Float32 |
Float64 | Float64 |
Int8 | Int8 |
Int16 | Int16 |
Int32 | Int32 |
Int64 | Int64 |
Large Binary | String |
Large List | Array |
Large String | String |
List | Array |
Map | Map |
String | String |
String View | String |
Struct | Tuple |
Time32 | DateTime64 |
Time64 | DateTime64 |
Timestamp | DateTime64 |
UUID (CloudQuery extension) | UUID |
Uint8 | UInt8 |
Uint16 | UInt16 |
Uint32 | UInt32 |
Uint64 | UInt64 |
Licenses #
The following tools / packages are used in this plugin:
Name | License |
---|---|
github.com/ClickHouse/ch-go | Apache-2.0 |
github.com/ClickHouse/clickhouse-go/v2 | Apache-2.0 |
github.com/adrg/xdg | MIT |
github.com/andybalholm/brotli | MIT |
github.com/apache/arrow/go/v13 | Apache-2.0 |
github.com/apache/arrow-go/v18 | Apache-2.0 |
github.com/apapsch/go-jsonmerge/v2 | MIT |
github.com/aws/aws-sdk-go-v2 | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/config | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/credentials | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/feature/ec2/imds | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/internal/configsources | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/internal/ini | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/internal/sync/singleflight | BSD-3-Clause |
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/service/licensemanager | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/service/marketplacemetering | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/service/sso | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/service/ssooidc | Apache-2.0 |
github.com/aws/aws-sdk-go-v2/service/sts | Apache-2.0 |
github.com/aws/smithy-go | Apache-2.0 |
github.com/aws/smithy-go/internal/sync/singleflight | BSD-3-Clause |
github.com/bahlo/generic-list-go | BSD-3-Clause |
github.com/buger/jsonparser | MIT |
github.com/cenkalti/backoff/v4 | MIT |
github.com/cloudquery/cloudquery-api-go | MPL-2.0 |
github.com/cloudquery/plugin-pb-go | MPL-2.0 |
github.com/cloudquery/plugin-sdk/v2/internal/glob | MIT |
github.com/cloudquery/plugin-sdk/v2/schema | MIT |
github.com/cloudquery/plugin-sdk/v2/types | MPL-2.0 |
github.com/cloudquery/plugin-sdk/v4 | MPL-2.0 |
github.com/cloudquery/plugin-sdk/v4/glob | MIT |
github.com/cloudquery/plugin-sdk/v4/scalar | MIT |
github.com/davecgh/go-spew/spew | ISC |
github.com/ghodss/yaml | MIT |
github.com/go-faster/city | MIT |
github.com/go-faster/errors | BSD-3-Clause |
github.com/go-logr/logr | Apache-2.0 |
github.com/go-logr/stdr | Apache-2.0 |
github.com/goccy/go-json | MIT |
github.com/google/flatbuffers/go | Apache-2.0 |
github.com/google/uuid | BSD-3-Clause |
github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors | Apache-2.0 |
github.com/grpc-ecosystem/grpc-gateway/v2 | BSD-3-Clause |
github.com/hashicorp/go-cleanhttp | MPL-2.0 |
github.com/hashicorp/go-retryablehttp | MPL-2.0 |
github.com/huandu/xstrings | MIT |
github.com/invopop/jsonschema | MIT |
github.com/klauspost/compress | Apache-2.0 |
github.com/klauspost/compress/internal/snapref | BSD-3-Clause |
github.com/klauspost/compress/zstd/internal/xxhash | MIT |
github.com/mailru/easyjson | MIT |
github.com/mattn/go-colorable | MIT |
github.com/mattn/go-isatty | MIT |
github.com/oapi-codegen/runtime | Apache-2.0 |
github.com/paulmach/orb | MIT |
github.com/pierrec/lz4/v4 | BSD-3-Clause |
github.com/pkg/errors | BSD-2-Clause |
github.com/pmezard/go-difflib/difflib | BSD-3-Clause |
github.com/rs/zerolog | MIT |
github.com/santhosh-tekuri/jsonschema/v6 | Apache-2.0 |
github.com/segmentio/asm | MIT |
github.com/shopspring/decimal | MIT |
github.com/spf13/cobra | Apache-2.0 |
github.com/spf13/pflag | BSD-3-Clause |
github.com/stretchr/testify | MIT |
github.com/thoas/go-funk | MIT |
github.com/wk8/go-ordered-map/v2 | Apache-2.0 |
github.com/zeebo/xxh3 | BSD-2-Clause |
go.opentelemetry.io/otel | Apache-2.0 |
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp | Apache-2.0 |
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp | Apache-2.0 |
go.opentelemetry.io/otel/exporters/otlp/otlptrace | Apache-2.0 |
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp | Apache-2.0 |
go.opentelemetry.io/otel/log | Apache-2.0 |
go.opentelemetry.io/otel/metric | Apache-2.0 |
go.opentelemetry.io/otel/sdk | Apache-2.0 |
go.opentelemetry.io/otel/sdk/log | Apache-2.0 |
go.opentelemetry.io/otel/sdk/metric | Apache-2.0 |
go.opentelemetry.io/otel/trace | Apache-2.0 |
go.opentelemetry.io/proto/otlp | Apache-2.0 |
golang.org/x/exp | BSD-3-Clause |
golang.org/x/net | BSD-3-Clause |
golang.org/x/sync/errgroup | BSD-3-Clause |
golang.org/x/sys | BSD-3-Clause |
golang.org/x/text | BSD-3-Clause |
golang.org/x/xerrors | BSD-3-Clause |
google.golang.org/genproto/googleapis/api/httpbody | Apache-2.0 |
google.golang.org/genproto/googleapis/rpc/status | Apache-2.0 |
google.golang.org/grpc | Apache-2.0 |
google.golang.org/protobuf | BSD-3-Clause |
gopkg.in/yaml.v2 | Apache-2.0 |
gopkg.in/yaml.v3 | MIT |