ClickHouse

This plugin is in preview.

Sync from ClickHouse to any destination

Publisher: cloudquery
Latest version: v1.0.14
Type: Source
Price: Free while in preview

Set up process #

1. Download CLI and login

brew install cloudquery/tap/cloudquery

See installation options

2. Create source and destination configs

Plugin configuration

3. Run the sync

cloudquery sync clickhouse.yml postgresql.yml

CloudQuery sync

Overview #

ClickHouse source plugin

The CloudQuery ClickHouse plugin syncs your ClickHouse database to any of the supported CloudQuery destinations (e.g. PostgreSQL, BigQuery, Snowflake).
Supported database versions: >= 24.2

Configuration #

Example #

kind: source
spec:
  name: "clickhouse"
  path: "cloudquery/clickhouse"
  registry: "cloudquery"
  version: "v1.0.14"
  destinations: ["postgresql"]
  tables: ["*"]
  spec:
    connection_string: "clickhouse://${CH_USER}:${CH_PASSWORD}@localhost:9000/${CH_DATABASE}"
    # Optional parameters
    # ca_cert: ""
    # rows_per_record: 500
    # concurrency: 100
This example configures a sync from a ClickHouse instance at localhost:9000. It expects the CH_USER, CH_PASSWORD, and CH_DATABASE environment variables to be set. The (top level) spec section is described in the Source Spec Reference.
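
The source config above lists "postgresql" under destinations, and the sync command in the set-up steps takes a postgresql.yml file. A minimal sketch of such a destination config follows, under stated assumptions: the version value is a placeholder to be replaced with a real release of the PostgreSQL destination plugin, and PG_CONNECTION_STRING is a hypothetical environment variable name chosen for this illustration.

```yaml
kind: destination
spec:
  # Must match the entry in the source config's "destinations" list.
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  # Placeholder: replace with an actual release of the destination plugin.
  version: "vX.Y.Z"
  spec:
    # Hypothetical variable name; set it to your PostgreSQL connection string.
    connection_string: "${PG_CONNECTION_STRING}"
```

The destination plugin's own documentation is the authority on its nested spec; only the name needs to line up with the source config shown here.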

ClickHouse spec #

This is the (nested) spec used by the ClickHouse source plugin.
  • connection_string (string) (required)
    Connection string to connect to the database. See SDK documentation for more details.
  • ca_cert (string) (optional) (default: not used)
    PEM-encoded certificate authorities. When set, a certificate pool will be created by appending the certificates to the system pool.
    See file variable substitution for how to read this value from a file.
  • rows_per_record (integer) (optional) (default: 500)
    Number of rows to pack into a single Apache Arrow record that is sent over the wire during sync.
  • concurrency (integer) (optional) (default: 100)
    Number of tables to sync concurrently. Decrease or increase this number based on your database size and available resources.
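
To make the options above concrete, here is a sketch of a nested spec that sets all four of them. It rests on assumptions: ./ca.pem is a hypothetical path, the ${file:...} form is taken from CloudQuery's file variable substitution and should be checked against that documentation, and the rows_per_record and concurrency values are illustrative rather than recommended.

```yaml
spec:
  # Required: user, password, and database come from environment variables.
  connection_string: "clickhouse://${CH_USER}:${CH_PASSWORD}@localhost:9000/${CH_DATABASE}"
  # Assumption: ./ca.pem is a hypothetical path to a PEM-encoded CA bundle.
  ca_cert: "${file:./ca.pem}"
  # Illustrative values only; the documented defaults are 500 and 100.
  rows_per_record: 250
  concurrency: 20
```

This block goes under the top-level spec of the source config, in place of the nested spec shown in the example.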

Connecting to ClickHouse Cloud #

To connect to ClickHouse Cloud, set the username to default, set the port to 9440, and add the secure=true parameter. Use a connection string similar to:
connection_string: "clickhouse://default:${CH_PASSWORD}@<your-server-id>.<region>.<provider>.clickhouse.cloud:9440/${CH_DATABASE}?secure=true"
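
For context, this is how that connection string might sit inside a complete source config. It is a sketch that reuses the top-level fields from the example earlier on this page; the parts in angle brackets are placeholders for your own service, and CH_PASSWORD and CH_DATABASE are expected to be set as environment variables.

```yaml
kind: source
spec:
  name: "clickhouse"
  path: "cloudquery/clickhouse"
  registry: "cloudquery"
  version: "v1.0.14"
  destinations: ["postgresql"]
  tables: ["*"]
  spec:
    # Replace the angle-bracket placeholders with your ClickHouse Cloud host.
    # Port 9440 and secure=true are the settings described above.
    connection_string: "clickhouse://default:${CH_PASSWORD}@<your-server-id>.<region>.<provider>.clickhouse.cloud:9440/${CH_DATABASE}?secure=true"
```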

Verbose logging for debug #

The ClickHouse source can be run in debug mode. To enable it, add the debug=true option to connection_string. See SDK documentation for more details.
Note: This uses the SDK's built-in logging and might write data and sensitive information to the logs. Do not use it in a production environment.
kind: source
spec:
  name: "clickhouse"
  path: "cloudquery/clickhouse"
  registry: "cloudquery"
  version: "v1.0.14"
  tables: ["*"]
  spec:
    connection_string: "clickhouse://${CH_USER}:${CH_PASSWORD}@localhost:9000/${CH_DATABASE}?debug=true"


Types #

Apache Arrow type conversion #

The ClickHouse source plugin supports most ClickHouse data types. It uses the same approach as documented in the ClickHouse reference. The following table shows the supported types and how they are mapped to Apache Arrow types.
ClickHouse Type    Arrow Column Type
Bool               Boolean
UInt8              Uint8
UInt16             Uint16
UInt32             Uint32
UInt64             Uint64
Int8               Int8
Int16              Int16
Int32              Int32
Int64              Int64
Float32            Float32
Float64            Float64
String             String
FixedString        Fixed Size Binary
Date32             Date32
DateTime           Timestamp
DateTime64         Timestamp
Decimal            Decimal128 (Decimal) or Decimal256 based on precision
Array              List
Map                Map
Tuple              Struct
UUID               UUID (CloudQuery extension)