CloudQuery

awscur
Official
Premium

AWS Cost Usage Reports

The CloudQuery AWS CUR (Cost Usage Reports) source plugin reads cost report Parquet files and loads them into any supported CloudQuery destination.

Publisher: cloudquery
Latest version: v1.0.3
Type: Source

Overview

The CloudQuery AWS CUR (Cost Usage Reports) source plugin reads cost report Parquet files and loads them into any supported CloudQuery destination (e.g. PostgreSQL, BigQuery, or Snowflake).

Authentication

The plugin needs to be authenticated with your account(s) in order to read from your S3 bucket.
The plugin requires s3:GetObject and s3:ListBucket permissions on the bucket and objects that you are trying to sync.
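As an illustration of the two permissions above, the following minimal sketch builds an IAM policy document that grants only s3:ListBucket on a bucket and s3:GetObject on its objects. The function name build_read_policy and the bucket name my-cur-bucket are hypothetical; adapt the resources to your own bucket before attaching the policy.

```python
import json


def build_read_policy(bucket: str) -> dict:
    """Build a read-only policy for one bucket (illustrative helper, not part of the plugin)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # s3:ListBucket is granted on the bucket ARN itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # s3:GetObject is granted on the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-cur-bucket" is a placeholder bucket name.
    print(json.dumps(build_read_policy("my-cur-bucket"), indent=2))
```

The printed JSON can be pasted into an IAM policy; the two statements are kept separate because the actions apply to different resource types (bucket versus object).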
There are multiple ways to authenticate with AWS, and the plugin respects the AWS credential provider chain. CloudQuery tries the following sources, in order, when authenticating:
  • The AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables.
  • The credentials and config files in ~/.aws (the credentials file takes priority).
  • AWS SSO credentials, obtained through aws sso.
  • IAM roles for AWS compute resources (including EC2 instances, Fargate, and ECS containers).
The AWS documentation on authentication covers each of these methods in more detail.

Environment Variables

CloudQuery can use the credentials from the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables (AWS_SESSION_TOKEN is only needed for temporary credentials). For information on obtaining credentials, see the AWS guide.
To export the environment variables (on Linux/macOS; Windows is similar):
export AWS_ACCESS_KEY_ID='{Your AWS Access Key ID}'
export AWS_SECRET_ACCESS_KEY='{Your AWS secret access key}'
export AWS_SESSION_TOKEN='{Your AWS session token}'

Shared Configuration Files

The plugin can use credentials from your credentials and config files in the .aws directory in your home folder. The contents of these files are practically interchangeable, but CloudQuery will prioritize credentials in the credentials file.
For information about obtaining credentials, see the AWS guide.
Here are example contents for a credentials file:
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
You can also specify credentials for a different profile, and instruct CloudQuery to use the credentials from this profile instead of the default one.
For example:
[myprofile]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
Then, you can either export the AWS_PROFILE environment variable (on Linux/macOS; Windows is similar):
export AWS_PROFILE=myprofile
or set the desired profile in the local_profile field of the plugin spec:
local_profile: myprofile
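Before setting local_profile, it can help to confirm which profile names a credentials file actually defines. The sketch below is one way to do that with Python's standard configparser module; the helper name list_profiles and the sample file contents are illustrative only, and the plugin itself does not use this code.

```python
import configparser


def list_profiles(credentials_text: str) -> list[str]:
    """Return the profile names defined in credentials-file text (illustrative helper)."""
    # Interpolation is disabled so that "%" characters in values are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    return parser.sections()


if __name__ == "__main__":
    # Sample contents with placeholder values, mirroring the layout of ~/.aws/credentials.
    sample = """
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY

[myprofile]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
"""
    for name in list_profiles(sample):
        print(name)
```

Any name it prints is a valid value for AWS_PROFILE or local_profile.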

IAM Roles for AWS Compute Resources

The plugin can use IAM roles for AWS compute resources (including EC2 instances, Fargate, and ECS containers). If your compute resources have IAM roles attached, the plugin uses those roles automatically. For more information on configuring IAM roles, see the AWS documentation.

User Credentials with MFA

To use IAM user credentials with MFA, run the STS get-session-token command with the IAM user's long-term security credentials (access key ID and secret access key). For more information, see the AWS STS documentation.
aws sts get-session-token --serial-number <YOUR_MFA_SERIAL_NUMBER> --token-code <YOUR_MFA_TOKEN_CODE> --duration-seconds 3600
Then export the temporary credentials returned by the command as environment variables:
export AWS_ACCESS_KEY_ID=<YOUR_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_ACCESS_KEY>
export AWS_SESSION_TOKEN=<YOUR_SESSION_TOKEN>
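Copying the three values by hand is error-prone, so the following sketch converts the JSON printed by aws sts get-session-token into the export lines shown above. It relies on the command's JSON output containing a Credentials object with AccessKeyId, SecretAccessKey, and SessionToken fields. The helper name exports_from_sts and the sample values are hypothetical.

```python
import json
import shlex


def exports_from_sts(output: str) -> list[str]:
    """Turn get-session-token JSON output into shell export lines (illustrative helper)."""
    creds = json.loads(output)["Credentials"]
    mapping = {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }
    # shlex.quote keeps the values safe to paste into a POSIX shell.
    return [f"export {name}={shlex.quote(value)}" for name, value in mapping.items()]


if __name__ == "__main__":
    # Placeholder output; real values come from the aws sts get-session-token command.
    sample = json.dumps({
        "Credentials": {
            "AccessKeyId": "EXAMPLEKEYID",
            "SecretAccessKey": "EXAMPLESECRET",
            "SessionToken": "EXAMPLETOKEN",
            "Expiration": "2025-01-01T00:00:00+00:00",
        }
    })
    print("\n".join(exports_from_sts(sample)))
```

The temporary credentials stop working at the Expiration time reported by the command, so the export step has to be repeated for each new session.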

Assuming a Role

If you need to assume a role (e.g. for cross-account access), configure the role_to_assume field in the spec.
role_to_assume:
  arn: arn:aws:iam::123456789012:role/YourRole # required
  session_name: YourSessionName # optional
  external_id: YourExternalId # optional

Configuration

To configure CloudQuery to extract Cost Usage Reports data, create a .yml file in your CloudQuery configuration directory with the following configuration.
kind: source
spec:
  name: awscur
  path: cloudquery/awscur
  version: "v1.0.3"
  tables: ["*"]
  destinations: ["postgresql"]

  spec:
    bucket: "<BUCKET_NAME>"
    region: "<REGION>"
    reports:
      - path: "<PATH_PREFIX_1>"
        name: "my-report-v1"
Based on this configuration, the plugin syncs all Parquet files under the defined path prefix into a table named after the report name, my-report-v1 (sanitized as needed to form a valid table name).
The plugin supports both legacy and 2.0 cost usage report formats, as long as those are synced as separate reports.

Incremental Syncing

The AWS CUR plugin supports incremental syncing: only new files are fetched from S3 and loaded into your destination. The plugin records the time of the last sync and compares it against each file's last-modified date, fetching only files modified since then. This approach assumes that S3 files are immutable, that is, existing files are not rewritten in place. To enable incremental syncing, set backend_options in the spec (as shown below). For details, see the Managing Incremental Tables section.

Configuration

kind: source
spec:
  name: awscur
  path: cloudquery/awscur
  registry: cloudquery
  version: "v1.0.3"
  tables: ["*"]
  destinations: ["postgresql"]

  backend_options:
    table_name: "cq_state_awscur"
    connection: "@@plugins.postgresql.connection"

  spec:
    bucket: "<BUCKET_NAME>"
    region: "<REGION>"
    reports:
      - path: "<PATH_PREFIX_1>"
        name: "my-report-v1"

    # Optional parameters
    # rows_per_record: 500
    # concurrency: 50
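The last-modified comparison that incremental syncing relies on can be sketched as follows. This is a simplified illustration of the idea, not the plugin's actual implementation; the S3Object type, the select_new_objects function, and the object keys are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class S3Object:
    """Minimal stand-in for one entry of an S3 listing (hypothetical type)."""
    key: str
    last_modified: datetime


def select_new_objects(
    objects: Iterable[S3Object], last_sync: Optional[datetime]
) -> list[S3Object]:
    """Keep only objects modified after the previous sync; keep everything on a first sync."""
    if last_sync is None:
        return list(objects)
    return [obj for obj in objects if obj.last_modified > last_sync]


if __name__ == "__main__":
    listing = [
        S3Object("cur/2025-01/part-0.parquet", datetime(2025, 1, 31, tzinfo=timezone.utc)),
        S3Object("cur/2025-02/part-0.parquet", datetime(2025, 2, 28, tzinfo=timezone.utc)),
    ]
    last_sync = datetime(2025, 2, 1, tzinfo=timezone.utc)
    for obj in select_new_objects(listing, last_sync):
        print(obj.key)  # prints only cur/2025-02/part-0.parquet
```

Under this scheme, a file that is rewritten after a sync gets a newer last-modified date and is fetched again, which is why the approach depends on files being immutable.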

Spec

This is the (nested) spec used by the AWS CUR source plugin.
  • bucket (string) (required)
    The name of the S3 bucket that contains cost usage report files.
  • region (string) (required)
    The AWS region of the S3 bucket.
  • reports ([]Report) (required)
    A list of reports to sync.
  • local_profile (string) (optional) (default: uses the current credentials)
    The name of the local profile to use for authentication.
    For example, with the following credentials file:
    [default]
    aws_access_key_id=xxxx
    aws_secret_access_key=xxxx
    
    [user1]
    aws_access_key_id=xxxx
    aws_secret_access_key=xxxx
    local_profile should be set to either default or user1.
  • rows_per_record (integer) (optional) (default: 500)
    Number of rows to pack into a single Apache Arrow record sent over the wire during sync.
  • concurrency (integer) (optional) (default: 50)
    Number of objects to sync in parallel. Negative values mean no limit.
  • role_to_assume (RoleToAssume) (optional)
    If specified, the plugin assumes this role to access the S3 bucket.

Report

  • path (string) (required)
    The path prefix that will limit the files to sync for this report.
  • name (string) (optional) (default: path prefix value)
    The table name to use for the report. The name is sanitized to ensure it is a valid table name.

RoleToAssume

  • arn (string) (required)
    The ARN of the role to assume.
  • session_name (string) (optional)
    The session name to use when assuming the role.
  • external_id (string) (optional)
    The external ID to use when assuming the role.
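The required fields and defaults listed in the Spec, Report, and RoleToAssume sections can be summarized in a small validation sketch. It is an illustration only: the function validate_spec is hypothetical, it operates on an already-parsed nested spec, and it does not reproduce the plugin's own validation or its table name sanitization.

```python
def validate_spec(spec: dict) -> dict:
    """Check documented required fields and fill documented defaults (illustrative helper)."""
    errors = []
    for field in ("bucket", "region"):
        if not spec.get(field):
            errors.append(f"{field} is required")
    reports = spec.get("reports") or []
    if not reports:
        errors.append("reports is required")
    for i, report in enumerate(reports):
        if not report.get("path"):
            errors.append(f"reports[{i}].path is required")
    role = spec.get("role_to_assume")
    if role is not None and not role.get("arn"):
        errors.append("role_to_assume.arn is required")
    if errors:
        raise ValueError("; ".join(errors))
    # Documented defaults: rows_per_record 500, concurrency 50, report name = path prefix.
    result = {"rows_per_record": 500, "concurrency": 50, **spec}
    result["reports"] = [{"name": r["path"], **r} for r in reports]
    return result


if __name__ == "__main__":
    # Placeholder values for illustration.
    print(validate_spec({
        "bucket": "my-cur-bucket",
        "region": "us-east-1",
        "reports": [{"path": "cur/my-report"}],
    }))
```

Running a check like this before a sync surfaces a missing bucket, region, or report path early, before the configuration is passed to CloudQuery.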


Licenses

The following tools / packages are used in this plugin:
Name | License
github.com/adrg/xdg | MIT
github.com/andybalholm/brotli | MIT
github.com/apache/arrow-go/v18 | Apache-2.0
github.com/apache/arrow/go/v13 | Apache-2.0
github.com/apache/thrift/lib/go/thrift | Apache-2.0
github.com/apapsch/go-jsonmerge/v2 | MIT
github.com/aws/aws-sdk-go-v2 | Apache-2.0
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream | Apache-2.0
github.com/aws/aws-sdk-go-v2/config | Apache-2.0
github.com/aws/aws-sdk-go-v2/credentials | Apache-2.0
github.com/aws/aws-sdk-go-v2/feature/ec2/imds | Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/configsources | Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 | Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/ini | Apache-2.0
github.com/aws/aws-sdk-go-v2/internal/sync/singleflight | BSD-3-Clause
github.com/aws/aws-sdk-go-v2/internal/v4a | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/internal/checksum | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/internal/s3shared | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/licensemanager | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/marketplacemetering | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/s3 | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/sso | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/ssooidc | Apache-2.0
github.com/aws/aws-sdk-go-v2/service/sts | Apache-2.0
github.com/aws/smithy-go | Apache-2.0
github.com/aws/smithy-go/internal/sync/singleflight | BSD-3-Clause
github.com/cenkalti/backoff/v4 | MIT
github.com/cloudquery/cloudquery-api-go | MPL-2.0
github.com/cloudquery/plugin-pb-go | MPL-2.0
github.com/cloudquery/plugin-sdk/v2/internal/glob | MIT
github.com/cloudquery/plugin-sdk/v2/schema | MIT
github.com/cloudquery/plugin-sdk/v2/types | MPL-2.0
github.com/cloudquery/plugin-sdk/v4 | MPL-2.0
github.com/cloudquery/plugin-sdk/v4/glob | MIT
github.com/cloudquery/plugin-sdk/v4/scalar | MIT
github.com/davecgh/go-spew/spew | ISC
github.com/ghodss/yaml | MIT
github.com/go-logr/logr | Apache-2.0
github.com/go-logr/stdr | Apache-2.0
github.com/goccy/go-json | MIT
github.com/golang/snappy | BSD-3-Clause
github.com/google/flatbuffers/go | Apache-2.0
github.com/google/uuid | BSD-3-Clause
github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors | Apache-2.0
github.com/grpc-ecosystem/grpc-gateway/v2 | BSD-3-Clause
github.com/hashicorp/go-cleanhttp | MPL-2.0
github.com/hashicorp/go-retryablehttp | MPL-2.0
github.com/klauspost/compress | Apache-2.0
github.com/klauspost/compress/internal/snapref | BSD-3-Clause
github.com/klauspost/compress/zstd/internal/xxhash | MIT
github.com/klauspost/cpuid/v2 | MIT
github.com/mattn/go-colorable | MIT
github.com/mattn/go-isatty | MIT
github.com/oapi-codegen/runtime | Apache-2.0
github.com/pierrec/lz4/v4 | BSD-3-Clause
github.com/pmezard/go-difflib/difflib | BSD-3-Clause
github.com/rs/zerolog | MIT
github.com/santhosh-tekuri/jsonschema/v6 | Apache-2.0
github.com/spf13/cobra | Apache-2.0
github.com/spf13/pflag | BSD-3-Clause
github.com/stretchr/testify | MIT
github.com/thoas/go-funk | MIT
github.com/zeebo/xxh3 | BSD-2-Clause
go.opentelemetry.io/auto/sdk | Apache-2.0
go.opentelemetry.io/otel | Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp | Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp | Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace | Apache-2.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp | Apache-2.0
go.opentelemetry.io/otel/log | Apache-2.0
go.opentelemetry.io/otel/metric | Apache-2.0
go.opentelemetry.io/otel/sdk | Apache-2.0
go.opentelemetry.io/otel/sdk/log | Apache-2.0
go.opentelemetry.io/otel/sdk/metric | Apache-2.0
go.opentelemetry.io/otel/trace | Apache-2.0
go.opentelemetry.io/proto/otlp | Apache-2.0
golang.org/x/exp | BSD-3-Clause
golang.org/x/net | BSD-3-Clause
golang.org/x/sync/errgroup | BSD-3-Clause
golang.org/x/sys | BSD-3-Clause
golang.org/x/text | BSD-3-Clause
golang.org/x/xerrors | BSD-3-Clause
google.golang.org/genproto/googleapis/api/httpbody | Apache-2.0
google.golang.org/genproto/googleapis/rpc/status | Apache-2.0
google.golang.org/grpc | Apache-2.0
google.golang.org/protobuf | BSD-3-Clause
gopkg.in/yaml.v2 | Apache-2.0
gopkg.in/yaml.v3 | MIT



© 2025 CloudQuery, Inc. All rights reserved.