Gremlin
Overview #
Gremlin Destination Plugin
This destination plugin lets you sync data from any CloudQuery source to a Gremlin compatible graph database such as AWS Neptune.
Supported (tested) database versions (the plugin uses the official Go driver):
- Gremlin Server >= 3.6.2
- AWS Neptune >= 1.2
As a side note, graph databases can be quite useful for networking use cases, visualization, red teams, blue teams, and more.
Configuration #
Example #
This example configures a Gremlin destination located at `ws://localhost:8182`. The optional `username` and `password` settings are shown commented out; when they are needed, they can be read from environment variables.

    kind: destination
    spec:
      name: "gremlin"
      path: "cloudquery/gremlin"
      registry: "cloudquery"
      version: "v2.5.23"
      spec:
        endpoint: "ws://localhost:8182"
        # Optional parameters
        # auth_mode: none
        # username: ""
        # password: ""
        # aws_region: ""
        # aws_neptune_host: ""
        # max_retries: 5
        # max_concurrent_connections: 5 # default: number of CPUs
        # batch_size: 200
        # batch_size_bytes: 4194304 # 4 MiB

The (top level) spec section is described in the Destination Spec Reference.
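To make the environment-variable handling concrete, here is a minimal sketch of the same destination with basic authentication. It assumes CloudQuery's `${VAR}` substitution syntax for configuration files. The variable names `GREMLIN_USERNAME` and `GREMLIN_PASSWORD` are hypothetical; use whatever names you export before running the sync.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.5.23"
  spec:
    endpoint: "ws://localhost:8182"
    # basic = static credentials
    auth_mode: basic
    # Hypothetical variable names, expanded from the environment when the file is loaded
    username: "${GREMLIN_USERNAME}"
    password: "${GREMLIN_PASSWORD}"
```

Keeping the credentials out of the file this way means the same configuration can be committed to version control without exposing secrets.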
The Gremlin destination utilizes batching, and supports `batch_size` and `batch_size_bytes`.
Connecting to AWS Neptune #
For AWS Neptune, you don't need to specify any credentials if IAM authentication is not enabled. Keep `auth_mode` at `none`.
If IAM authentication is enabled, you need to set `auth_mode` to `aws` and `aws_region` to the region of the database. The plugin will use the default AWS credentials chain to authenticate.
Plugin Spec #
This is the (nested) spec used by the Gremlin destination Plugin.
- `endpoint` (`string`) (required)
  Endpoint for the database. Supported schemes are `wss://` and `ws://`, the default port is `8182`.
  - `"localhost"` (defaults to `wss://localhost:8182`)
  - `"ws://localhost:8182"`
  - `"wss://your-endpoint.cluster-id.your-region.neptune.amazonaws.com"`
- `insecure` (`boolean`) (optional)
  Whether to skip TLS verification. Defaults to `false`. This should be set on a macOS environment when connecting to an AWS Neptune endpoint.
- `auth_mode` (`string`) (optional) (default: `none`)
  Authentication mode to use. `basic` uses static credentials, `aws` uses AWS IAM authentication. Supported values are `none`, `basic` or `aws`.
- `username` (`string`) (optional)
  Username to connect to the database.
- `password` (`string`) (optional)
  Password to connect to the database.
- `aws_region` (`string`) (required when `auth_mode` is `aws`)
  AWS region to use for AWS IAM authentication. Example: `us-east-1`.
- `aws_neptune_host` (`string`) (optional, used when `auth_mode` is `aws`)
  AWS Neptune host header to use with AWS IAM authentication. Use if you're not accessing Neptune directly, when `auth_mode` is `aws`. Example: `my-neptune.cluster.us-east-1.neptune.amazonaws.com`
- `max_retries` (`integer`) (optional) (default: `5`)
  Number of retries on `ConcurrentModificationException` before giving up for each batch. Retries are exponentially backed off.
- `max_concurrent_connections` (`integer`) (optional) (default: number of CPUs)
  Maximum number of concurrent connections to the database.
- `complete_types` (`boolean`) (optional) (default: `false`)
  Whether to use all Gremlin-supported types or just a basic set. Should remain `false` for Amazon Neptune compatibility.
- `batch_size` (`integer`) (optional) (default: `200`)
  Number of records to batch together before sending to the database.
- `batch_size_bytes` (`integer`) (optional) (default: `4194304` (4 MiB))
  Number of bytes (as Arrow buffer size) to batch together before sending to the database.
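As an illustration of how the IAM-related options fit together, the following sketch configures an AWS Neptune destination with `auth_mode: aws`. The endpoint is the placeholder from the `endpoint` examples, and `us-east-1` and the host name are the sample values from the option descriptions; replace them with your own. The commented-out lines apply only in the situations noted, and the tunnel or proxy scenario is an assumption about what "not accessing Neptune directly" typically means.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.5.23"
  spec:
    # Placeholder endpoint; substitute your cluster's address
    endpoint: "wss://your-endpoint.cluster-id.your-region.neptune.amazonaws.com"
    # Authenticate through the default AWS credentials chain
    auth_mode: aws
    aws_region: "us-east-1" # region of the Neptune database
    # Only when Neptune is reached indirectly (assumed here: a tunnel or proxy):
    # aws_neptune_host: "my-neptune.cluster.us-east-1.neptune.amazonaws.com"
    # Only on macOS, per the insecure option description:
    # insecure: true
    # Leave at the default for Neptune compatibility:
    # complete_types: false
```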
Types #
Gremlin Types
The Gremlin destination (`v2.0.0` and later) supports most Apache Arrow types. The following table shows the supported types and how they are mapped to Gremlin data types.

Arrow Column Type | Supported? | Gremlin Type |
---|---|---|
Binary | ✅ Yes | Bytes |
Boolean | ✅ Yes | Boolean |
Date32 | ✅ Yes | String |
Date64 | ✅ Yes | String |
Decimal | ✅ Yes | String |
Dense Union | ✅ Yes | String |
Dictionary | ✅ Yes | String |
Duration[ms] | ✅ Yes | String |
Duration[ns] | ✅ Yes | String |
Duration[s] | ✅ Yes | String |
Duration[us] | ✅ Yes | String |
Fixed Size List | ✅ Yes | String |
Float16 | ✅ Yes | String |
Float32 | ✅ Yes | Float |
Float64 | ✅ Yes | Float |
Inet | ✅ Yes | String |
Int8 | ✅ Yes | Integer |
Int16 | ✅ Yes | Integer |
Int32 | ✅ Yes | Integer |
Int64 | ✅ Yes | Integer |
Interval[DayTime] | ✅ Yes | String |
Interval[MonthDayNano] | ✅ Yes | String |
Interval[Month] | ✅ Yes | String |
JSON | ✅ Yes | String |
Large Binary | ✅ Yes | Bytes |
Large List | ✅ Yes | String |
Large String | ✅ Yes | String |
List | ✅ Yes | String or List † |
MAC | ✅ Yes | String |
Map | ✅ Yes | String |
String | ✅ Yes | String |
Struct | ✅ Yes | String |
Timestamp[ms] | ✅ Yes | String * |
Timestamp[ns] | ✅ Yes | String |
Timestamp[s] | ✅ Yes | String |
Timestamp[us] | ✅ Yes | String |
UUID | ✅ Yes | String |
Uint8 | ✅ Yes | String |
Uint16 | ✅ Yes | Integer |
Uint32 | ✅ Yes | Integer |
Uint64 | ✅ Yes | Integer |
Union | ✅ Yes | String |
String-persisted data types are encoded according to the Arrow String Representation specification.
Notes #
* Timestamps are converted to strings in the format `yyyy-MM-dd HH:mm:ss.SSSSSSSSS` (UTC timezone) (e.g. `2021-01-01 00:00:00.000000000`). The `_cq_sync_time` column is persisted in the native `Timestamp` type.
† List types are persisted as-is only if the `complete_types` option is enabled. Otherwise, they are converted to strings.
`NUL` bytes are stripped from strings.
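For a self-hosted Gremlin Server, where Amazon Neptune compatibility is not a concern, List columns can be kept as native lists by enabling `complete_types`. The fragment below is a minimal sketch of that setting. It assumes a local server at `ws://localhost:8182`; whether your particular server accepts the full type set is something to verify before relying on it.

```yaml
kind: destination
spec:
  name: "gremlin"
  path: "cloudquery/gremlin"
  registry: "cloudquery"
  version: "v2.5.23"
  spec:
    endpoint: "ws://localhost:8182"
    # Persist List columns as Gremlin lists instead of strings.
    # Keep this false (the default) when the target is Amazon Neptune.
    complete_types: true
```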