Gremlin (official plugin)

Publisher: cloudquery
Repository: github.com
Latest version: v2.5.13
Type: Destination
Price: Free

Overview

Gremlin Destination Plugin

This destination plugin lets you sync data from any CloudQuery source to a Gremlin compatible graph database such as AWS Neptune.
Supported (tested) database versions (the plugin uses the official Go driver):
  • Gremlin Server >= 3.6.2
  • AWS Neptune >= 1.2
As a side note, graph databases can be quite useful for networking use cases, visualization, red teams, blue teams, and more.

Configuration

Example

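This example configures a Gremlin destination located at ws://localhost:8182. The optional parameters are commented out; when basic authentication is used, the username and password can be read from environment variables instead of being written into the file.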
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.5.13"
  spec:
    endpoint: "ws://localhost:8182"
    # Optional parameters
    # auth_mode: none
    # username: ""
    # password: ""
    # aws_region: ""
    # aws_neptune_host: ""
    # max_retries: 5
    # max_concurrent_connections: 5 # default: number of CPUs
    # batch_size: 200
    # batch_size_bytes: 4194304 # 4 MiB
The (top level) spec section is described in the Destination Spec Reference.
The Gremlin destination writes data in batches and supports the batch_size and batch_size_bytes options.
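
As an illustrative variant of the example above, the following sketch enables basic authentication and sets the batching limits explicitly. It assumes CloudQuery's ${VAR} environment-variable substitution in configuration files. The variable names GREMLIN_USERNAME and GREMLIN_PASSWORD and the batch values are hypothetical, chosen only for illustration.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.5.13"
  spec:
    endpoint: "ws://localhost:8182"
    # basic uses static credentials (see auth_mode in the Plugin Spec)
    auth_mode: basic
    # Hypothetical variable names, expanded from the environment (assumption)
    username: "${GREMLIN_USERNAME}"
    password: "${GREMLIN_PASSWORD}"
    # Illustrative values, smaller than the defaults of 200 records and 4 MiB
    batch_size: 100
    batch_size_bytes: 2097152 # 2 MiB
```

Keeping credentials in the environment avoids committing them alongside the rest of the configuration.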

Connecting to AWS Neptune

For AWS Neptune, you don't need to specify any credentials if IAM authentication is not enabled. Keep auth_mode at none.
If IAM authentication is enabled, you need to set auth_mode to aws and aws_region to the region of the database. The plugin will use the default AWS credentials chain to authenticate.
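
A minimal sketch of the IAM case described above, using only options from the Plugin Spec. The endpoint and region are placeholders, not real resources.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.5.13"
  spec:
    # Placeholder endpoint in the format shown in the Plugin Spec
    endpoint: "wss://your-endpoint.cluster-id.your-region.neptune.amazonaws.com"
    # Use AWS IAM authentication via the default AWS credentials chain
    auth_mode: aws
    # Region of the Neptune database (placeholder value)
    aws_region: "us-east-1"
```

If IAM authentication is not enabled on the cluster, the same configuration works with auth_mode left at none and aws_region omitted.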

Plugin Spec

This is the (nested) spec used by the Gremlin destination Plugin.
  • endpoint (string) (required)
    Endpoint for the database. Supported schemes are wss:// and ws://; the default port is 8182. Examples:
    • "localhost" (defaults to wss://localhost:8182)
    • "ws://localhost:8182"
    • "wss://your-endpoint.cluster-id.your-region.neptune.amazonaws.com"
  • insecure (boolean) (optional)
    Whether to skip TLS verification. Defaults to false. Set this to true on macOS when connecting to an AWS Neptune endpoint.
  • auth_mode (string) (optional) (default: none)
    Authentication mode to use. Supported values are none, basic, and aws: basic uses static credentials, and aws uses AWS IAM authentication.
  • username (string) (optional)
    Username to connect to the database.
  • password (string) (optional)
    Password to connect to the database.
  • aws_region (string) (required when auth_mode is aws)
    AWS region to use for AWS IAM authentication. Example: us-east-1.
  • aws_neptune_host (string) (optional, used when auth_mode is aws)
    AWS Neptune host header to use with AWS IAM authentication. Set it when auth_mode is aws and you are not connecting to Neptune directly. Example: my-neptune.cluster.us-east-1.neptune.amazonaws.com
  • max_retries (integer) (optional) (default: 5)
    Number of times each batch is retried on ConcurrentModificationException before giving up. Retries use exponential backoff.
  • max_concurrent_connections (integer) (optional) (default: number of CPUs)
    Maximum number of concurrent connections to the database.
  • complete_types (boolean) (optional) (default: false)
    Whether to use all Gremlin-supported types or just a basic set. Keep this set to false for Amazon Neptune compatibility.
  • batch_size (integer) (optional) (default: 200)
    Number of records to batch together before sending to the database.
  • batch_size_bytes (integer) (optional) (default: 4194304 (4 MiB))
    Number of bytes (as Arrow buffer size) to batch together before sending to the database.
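
To show how several of these options combine, here is a hedged sketch for reaching Neptune indirectly with IAM authentication. The local endpoint (a hypothetical tunnel or proxy listening on localhost:8182) and the region are assumptions. The host value is the example from the aws_neptune_host description.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.5.13"
  spec:
    # Hypothetical local end of a tunnel or proxy in front of Neptune
    endpoint: "wss://localhost:8182"
    auth_mode: aws
    aws_region: "us-east-1"
    # Host header used for IAM authentication, since the endpoint is not Neptune itself
    aws_neptune_host: "my-neptune.cluster.us-east-1.neptune.amazonaws.com"
    # Assumption: uncomment only if TLS verification fails in this setup
    # insecure: true
    # Default shown explicitly: retries per batch on ConcurrentModificationException
    max_retries: 5
```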


Types

Gremlin Types

The Gremlin destination (v2.0.0 and later) supports most Apache Arrow types. The following table shows the supported types and how they are mapped to Gremlin data types.
| Arrow Column Type | Supported? | Gremlin Type |
|---|---|---|
| Binary | ✅ Yes | Bytes |
| Boolean | ✅ Yes | Boolean |
| Date32 | ✅ Yes | String |
| Date64 | ✅ Yes | String |
| Decimal | ✅ Yes | String |
| Dense Union | ✅ Yes | String |
| Dictionary | ✅ Yes | String |
| Duration[ms] | ✅ Yes | String |
| Duration[ns] | ✅ Yes | String |
| Duration[s] | ✅ Yes | String |
| Duration[us] | ✅ Yes | String |
| Fixed Size List | ✅ Yes | String |
| Float16 | ✅ Yes | String |
| Float32 | ✅ Yes | Float |
| Float64 | ✅ Yes | Float |
| Inet | ✅ Yes | String |
| Int8 | ✅ Yes | Integer |
| Int16 | ✅ Yes | Integer |
| Int32 | ✅ Yes | Integer |
| Int64 | ✅ Yes | Integer |
| Interval[DayTime] | ✅ Yes | String |
| Interval[MonthDayNano] | ✅ Yes | String |
| Interval[Month] | ✅ Yes | String |
| JSON | ✅ Yes | String |
| Large Binary | ✅ Yes | Bytes |
| Large List | ✅ Yes | String |
| Large String | ✅ Yes | String |
| List | ✅ Yes | String or List † |
| MAC | ✅ Yes | String |
| Map | ✅ Yes | String |
| String | ✅ Yes | String |
| Struct | ✅ Yes | String |
| Timestamp[ms] | ✅ Yes | String * |
| Timestamp[ns] | ✅ Yes | String * |
| Timestamp[s] | ✅ Yes | String * |
| Timestamp[us] | ✅ Yes | String * |
| UUID | ✅ Yes | String |
| Uint8 | ✅ Yes | String |
| Uint16 | ✅ Yes | Integer |
| Uint32 | ✅ Yes | Integer |
| Uint64 | ✅ Yes | Integer |
| Union | ✅ Yes | String |

String-persisted data types are encoded according to the Arrow String Representation specification.

Notes

* Timestamps are converted to strings in the format yyyy-MM-dd HH:mm:ss.SSSSSSSSS in the UTC timezone (e.g. 2021-01-01 00:00:00.000000000). The _cq_sync_time column is persisted as a native Timestamp type.
† List types are persisted as-is only if the complete_types option is enabled. Otherwise, they are converted to strings.
NUL bytes are stripped from strings.
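
As a sketch of the List note above, the following configuration turns on complete_types for a local Gremlin Server. The endpoint is the one from the earlier example and is only a placeholder.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.5.13"
  spec:
    endpoint: "ws://localhost:8182"
    # Persist List columns as-is instead of converting them to strings.
    # Leave this false for Amazon Neptune (see complete_types in the Plugin Spec).
    complete_types: true
```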