File

The CloudQuery File plugin syncs Parquet files to any supported CloudQuery destination.

Publisher: cloudquery
Latest version: v1.6.16
Type: Source

Overview

The CloudQuery File plugin reads Parquet files and loads them into any supported CloudQuery destination (for example PostgreSQL, BigQuery, or Snowflake). Example configuration:
kind: source
spec:
  name: file
  path: cloudquery/file
  registry: cloudquery
  version: "v1.1.1" # pin the release you use; the latest version listed on this page is v1.6.16
  tables: ["*"]
  destinations: ["postgresql"]

  spec:
    files_dir: "/path/to/files-to-sync" # required. Path to the directory with files to sync
    # concurrency: 50 # optional. Number of files to sync in parallel. Default: 50

File spec

This is the nested spec (the inner spec block) used by the File source plugin.
  • files_dir (string) (required)
    Path to the directory containing the files to sync. Only files with the .parquet extension are synced.
  • concurrency (integer) (optional) (default: 50)
    Number of files to sync in parallel. A negative value removes the limit.
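
As an illustration of the two options above, the following sketch sets both fields explicitly. The directory path and the concurrency value are hypothetical and chosen only for the example; the other fields are copied from the overview configuration.

```yaml
kind: source
spec:
  name: file
  path: cloudquery/file
  registry: cloudquery
  version: "v1.1.1"
  tables: ["*"]
  destinations: ["postgresql"]

  spec:
    # Hypothetical directory; only files ending in .parquet inside it are synced.
    files_dir: "/data/parquet-exports"
    # Illustrative value: sync at most 8 files in parallel instead of the default 50.
    concurrency: 8
    # Alternative: a negative value removes the limit.
    # concurrency: -1
```

A lower value may be worth trying when the destination or the local disk is the bottleneck. This is a general tuning assumption, not a documented recommendation.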

Example with AWS Cost and Usage Reports

AWS Cost and Usage Reports can be stored in S3 as Parquet files. Because files_dir points to a local directory, the report files need to be available locally before the sync runs. The following example syncs these files, together with AWS infrastructure data, to a PostgreSQL database. To learn more about visualizing AWS Cost and Usage Reports, visit our dashboards page.
kind: source
spec:
  name: file
  path: cloudquery/file
  registry: cloudquery
  version: "v1.1.1" # pin the release you use; the latest version listed on this page is v1.6.16
  destinations: [postgresql]
  tables: ["*"]
  spec:
    files_dir: "/path/to/cost_and_usage_reports" # Update this value to the local directory with your AWS Cost and Usage Reports
---
kind: source
spec:
  name: aws
  path: cloudquery/aws
  registry: cloudquery
  version: "v32.19.0"
  destinations: [postgresql]
  tables: ["*"]
  skip_tables:
    - aws_ec2_vpc_endpoint_services
    - aws_cloudtrail_events
    - aws_docdb_cluster_parameter_groups
    - aws_docdb_engine_versions
    - aws_ec2_instance_types
    - aws_elasticache_engine_versions
    - aws_elasticache_parameter_groups
    - aws_elasticache_reserved_cache_nodes_offerings
    - aws_elasticache_service_updates
    - aws_iam_group_last_accessed_details
    - aws_iam_policy_last_accessed_details
    - aws_iam_role_last_accessed_details
    - aws_iam_user_last_accessed_details
    - aws_neptune_cluster_parameter_groups
    - aws_neptune_db_parameter_groups
    - aws_rds_cluster_parameter_groups
    - aws_rds_db_parameter_groups
    - aws_rds_engine_versions
    - aws_servicequotas_services
---
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  registry: cloudquery
  version: "v8.8.7"
  spec:
    connection_string: "postgresql://postgres:pass@localhost:5432/postgres" # example credentials; replace with your own
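
The connection string above embeds a password in the configuration file. A possible alternative, sketched here, relies on CloudQuery's environment variable substitution in configuration files. The variable name PG_CONNECTION_STRING is an assumption made for this example and must be set in the environment before the sync runs.

```yaml
kind: destination
spec:
  name: postgresql
  path: cloudquery/postgresql
  registry: cloudquery
  version: "v8.8.7"
  spec:
    # Assumed variable name; export it beforehand, for example:
    #   PG_CONNECTION_STRING=postgresql://postgres:pass@localhost:5432/postgres
    connection_string: "${PG_CONNECTION_STRING}"
```

This keeps credentials out of version control while leaving the rest of the destination spec unchanged.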


