New
Join our webinar! Building a customizable and extensible cloud asset inventory at scale
Back to source list
postgresql
Official
Premium

PostgreSQL

The CloudQuery PostgreSQL plugin syncs your PostgreSQL database to any of the supported CloudQuery destinations

Publisher

cloudquery

Latest version

v6.10.4

Type

Source

Platforms
Date Published

Price per 1M rows

Starting from $17

monthly free quota

1M rows

Overview #

The CloudQuery PostgreSQL plugin syncs your PostgreSQL database to any of the supported CloudQuery destinations (e.g. PostgreSQL, BigQuery, Snowflake, and more). This plugin also supports CDC via PostgreSQL logical replication, which enables keeping your PostgreSQL up to date with any destination by subscribing to changes.

Authentication #

In order to fetch information from PostgreSQL, cloudquery needs to be authenticated. Credentials can be inserted in the connection_string field.

Configuration #

This example configures a PostgreSQL source, located at localhost:5432. The (top level) spec section is described in the Source Spec Reference.
kind: source
spec:
  name: "postgresql"
  registry: "cloudquery"
  path: "cloudquery/postgresql"
  version: "v6.10.4"
  tables: ["*"]
  destinations: ["postgresql"]
  # Learn more about the configuration options at https://cql.ink/postgresql_source
  spec:
    connection_string: "${POSTGRESQL_CONNECTION_STRING}" # set the environment variable in a format like postgres://postgres:pass@localhost:5432/postgres?sslmode=disable
    # Optional parameters:
    # cdc_id: "postgresql" # Set to a unique string per source to enable Change Data Capture mode (logical replication, or CDC)
    # pgx_log_level: error
    # rows_per_record: 500

PostgreSQL Spec #

This is the (nested) spec used by the PostgreSQL source Plugin.
  • connection_string (string) (required)
    Connection string to connect to the database. This can be a URL or a DSN, as per pgxpool
    • "postgres://jack:secret@localhost:5432/mydb?sslmode=prefer" connect with tcp and prefer TLS
    • "postgres://jack:secret@localhost:5432/mydb?sslmode=disable&application_name=pgxtest&search_path=myschema&connect_timeout=5" be explicit with all options
    • "postgres://localhost:5432/mydb?sslmode=disable" connect with os username cloudquery is being run as
    • "postgres:///mydb?host=/tmp" connect over unix socket
    • "dbname=mydb" unix domain socket, just specifying the db name - useful if you want to use peer authentication
    • "user=jack password=jack\\'ssooper\\\\secret host=localhost port=5432 dbname=mydb sslmode=disable" DSN with escaped backslash and single quote
  • pgx_log_level (string) (optional) (default: error)
    Available: error, warn, info, debug, trace. Defines what pgx call events should be logged.
  • cdc_id (string) (optional)
    If set to a non-empty string the source plugin will start syncing CDC via PostgreSQL logical replication in real-time. The value should be unique across all sources.
  • rows_per_record (integer) (optional) (default: 500)
    Amount of rows to be packed into a single Apache Arrow record to be sent over the wire during sync (or initial sync in the CDC mode).
  • destination_table_name (string) (optional) (default: {{TABLE}})
    The destination table name to write the data to. (Preview Feature)
    Supports the following placeholder variables:
    • {{TABLE}} will be replaced with the table name
    • {{UUID}} will be replaced with a random UUID in raw format to uniquely identify each table (*)
    • {{YEAR}} will be replaced with the current year in YYYY format (*)
    • {{MONTH}} will be replaced with the current month in MM format (*)
    • {{DAY}} will be replaced with the current day in DD format (*)
    • {{HOUR}} will be replaced with the current hour in HH format (*)
    • {{MINUTE}} will be replaced with the current minute in mm format (*)
    At least one of {{TABLE}} or {{UUID}} is required. Dynamic placeholders (denoted with *) are not allowed in CDC mode (when cdc_id is set)
    Note that timestamps are in UTC and will be the time sync started.


CDC #

PostgreSQL CDC (Logical Replication) Configuration

Overview #

Change Data Capture (CDC) is a set of software design patterns used to determine and track the data that has changed so that action can be taken using the changed data.
In PostgreSQL CDC can be implemented using Logical Replication.
Logical Replication is a feature of PostgreSQL that allows you to stream changes from a database to another database, file or custom handler. It is also used to keep a copy of a database up to date with the original database in an efficient manner (Internally it is implemented by the so called Write-Ahead Log).
In this document we won't go into details of how Logical Replication works internally, but we will show you how to enable it in number of environments and how to configure CloudQuery PostgreSQL source plugin that can stream the changes to any of CloudQuery supported destinations.
The PostgreSQL source plugin streams changes directly to any CloudQuery destination without the need for additional infrastructure (e.g. Kafka, RabbitMQ, etc). This means the setup is much easier.

PSQL Test #

To test that the current PostgreSQL instance supports Logical Replication we can run the following command in PSQL:
SHOW wal_level;
or the following SQL query:
SELECT setting FROM pg_settings WHERE name='wal_level'
The default is replication but for CDC to work we need to set it to logical (This can only be done on the database startup).

Docker Setup #

If you are running it locally for testing purposes you can use the following command to docker to enable logical replication:
docker run -p 5432:5432 -e POSTGRES_PASSWORD=pass -d postgres:11 -c "wal_level=logical"

Self-hosted #

Change the wal_level to logical in the postgresql.conf file and restart the database.

RDS #

You will need to create a custom RDS parameter group, associate it with your RDS instance, set rds.logical_replication to 1 and restart the database.
See full resolution on AWS documentation

AWS Aurora serverless (V2) #

AWS Aurora serverless V1 doesn't support Logical Replication, but V2 does.
Similar to RDS, you will need to create a custom parameter group, associate it with your Aurora serverless instance, set rds.logical_replication to 1 and restart the database.
See full resolution on AWS documentation

GCP Cloud SQL (PostgreSQL) #

You will need to set the cloudsql.logical_deconging configuration to on. See full documentation on GCP documentation

Azure DB (PostgreSQL) #

Please follow the Official Azure documentation on how to enable logical replication via the CLI or Console (UI).

Current Limitations #

The following event types are not supported:
  • DDL changes (e.g. CREATE TABLE, ALTER TABLE, etc)
  • TRUNCATE statements (e.g. TRUNCATE TABLE table_name)
  • DELETE statements (e.g. DELETE FROM table_name)


Licenses #

The following tools / packages are used in this plugin:
NameLicense
github.com/adrg/xdgMIT
github.com/apache/arrow/go/v13Apache-2.0
github.com/apache/arrow-go/v18Apache-2.0
github.com/apapsch/go-jsonmerge/v2MIT
github.com/aws/aws-sdk-go-v2Apache-2.0
github.com/aws/aws-sdk-go-v2/configApache-2.0
github.com/aws/aws-sdk-go-v2/credentialsApache-2.0
github.com/aws/aws-sdk-go-v2/feature/ec2/imdsApache-2.0
github.com/aws/aws-sdk-go-v2/internal/configsourcesApache-2.0
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/iniApache-2.0
github.com/aws/aws-sdk-go-v2/internal/sync/singleflightBSD-3-Clause
github.com/aws/aws-sdk-go-v2/service/internal/accept-encodingApache-2.0
github.com/aws/aws-sdk-go-v2/service/internal/presigned-urlApache-2.0
github.com/aws/aws-sdk-go-v2/service/licensemanagerApache-2.0
github.com/aws/aws-sdk-go-v2/service/marketplacemeteringApache-2.0
github.com/aws/aws-sdk-go-v2/service/ssoApache-2.0
github.com/aws/aws-sdk-go-v2/service/ssooidcApache-2.0
github.com/aws/aws-sdk-go-v2/service/stsApache-2.0
github.com/aws/smithy-goApache-2.0
github.com/aws/smithy-go/internal/sync/singleflightBSD-3-Clause
github.com/bahlo/generic-list-goBSD-3-Clause
github.com/buger/jsonparserMIT
github.com/cenkalti/backoff/v4MIT
github.com/cloudquery/cloudquery-api-goMPL-2.0
github.com/cloudquery/plugin-pb-goMPL-2.0
github.com/cloudquery/plugin-sdk/v2/internal/globMIT
github.com/cloudquery/plugin-sdk/v2/schemaMIT
github.com/cloudquery/plugin-sdk/v2/typesMPL-2.0
github.com/cloudquery/plugin-sdk/v4MPL-2.0
github.com/cloudquery/plugin-sdk/v4/globMIT
github.com/cloudquery/plugin-sdk/v4/scalarMIT
github.com/davecgh/go-spew/spewISC
github.com/ghodss/yamlMIT
github.com/go-logr/logrApache-2.0
github.com/go-logr/stdrApache-2.0
github.com/goccy/go-jsonMIT
github.com/google/flatbuffers/goApache-2.0
github.com/google/uuidBSD-3-Clause
github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptorsApache-2.0
github.com/grpc-ecosystem/grpc-gateway/v2BSD-3-Clause
github.com/hashicorp/go-cleanhttpMPL-2.0
github.com/hashicorp/go-retryablehttpMPL-2.0
github.com/invopop/jsonschemaMIT
github.com/jackc/pgioMIT
github.com/jackc/pglogreplMIT
github.com/jackc/pgpassfileMIT
github.com/jackc/pgservicefileMIT
github.com/jackc/pgx-zerologMIT
github.com/jackc/pgx/v5MIT
github.com/jackc/puddle/v2MIT
github.com/klauspost/compressApache-2.0
github.com/klauspost/compress/internal/snaprefBSD-3-Clause
github.com/klauspost/compress/zstd/internal/xxhashMIT
github.com/mailru/easyjsonMIT
github.com/mattn/go-colorableMIT
github.com/mattn/go-isattyMIT
github.com/oapi-codegen/runtimeApache-2.0
github.com/pierrec/lz4/v4BSD-3-Clause
github.com/pmezard/go-difflib/difflibBSD-3-Clause
github.com/rs/zerologMIT
github.com/santhosh-tekuri/jsonschema/v6Apache-2.0
github.com/spf13/cobraApache-2.0
github.com/spf13/pflagBSD-3-Clause
github.com/stretchr/testifyMIT
github.com/thoas/go-funkMIT
github.com/wk8/go-ordered-map/v2Apache-2.0
github.com/zeebo/xxh3BSD-2-Clause
go.opentelemetry.io/otelApache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttpApache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttpApache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlptraceApache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttpApache-2.0
go.opentelemetry.io/otel/logApache-2.0
go.opentelemetry.io/otel/metricApache-2.0
go.opentelemetry.io/otel/sdkApache-2.0
go.opentelemetry.io/otel/sdk/logApache-2.0
go.opentelemetry.io/otel/sdk/metricApache-2.0
go.opentelemetry.io/otel/traceApache-2.0
go.opentelemetry.io/proto/otlpApache-2.0
golang.org/x/crypto/pbkdf2BSD-3-Clause
golang.org/x/expBSD-3-Clause
golang.org/x/netBSD-3-Clause
golang.org/x/syncBSD-3-Clause
golang.org/x/sysBSD-3-Clause
golang.org/x/textBSD-3-Clause
golang.org/x/xerrorsBSD-3-Clause
google.golang.org/genproto/googleapis/api/httpbodyApache-2.0
google.golang.org/genproto/googleapis/rpc/statusApache-2.0
google.golang.org/grpcApache-2.0
google.golang.org/protobufBSD-3-Clause
gopkg.in/yaml.v2Apache-2.0
gopkg.in/yaml.v3MIT


Join our mailing list

Subscribe to our newsletter to make sure you don't miss any updates.

Legal

© 2024 CloudQuery, Inc. All rights reserved.

We use tracking cookies to understand how you use the product and help us improve it. Please accept cookies to help us improve. You can always opt out later via the link in the footer.