
AWS Provider

The AWS provider extends CloudQuery with the ability to fetch information about AWS cloud resources and store it in a PostgreSQL database.

Category: Public Cloud
Version: v0.9.1
License: MPL-2.0
Published at: Thu Jan 13 2022

The CloudQuery AWS provider extracts your AWS cloud asset configuration, transforms it, and loads it into PostgreSQL.

This provider also supports additional capabilities:

Install

cloudquery init aws

Authentication

To authenticate CloudQuery with your AWS account you can use any of the following options (see full documentation at AWS SDK V2):

  • Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_PROFILE
  • Shared configuration files (via aws configure).
    • The SDK defaults to the credentials file in the .aws folder in your home directory.
    • The SDK defaults to the config file in the .aws folder in your home directory.
    • The SDK can use SSO credentials stored in the ~/.aws/ directory.
  • The IAM role associated with AWS compute resources (EC2 instances, Fargate and ECS containers, and Lambda functions).
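
As a rough illustration of the options above, the following shell function reports which credential source the current environment appears to provide. It is only a sketch: the function name `guess_aws_credential_source` is hypothetical, it inspects nothing but environment variables and the default file locations, and it does not reproduce the SDK's full resolution logic.

```shell
# Hypothetical helper: guess which of the sources listed above would supply credentials.
# Simplified illustration only; the AWS SDK performs the real resolution.
guess_aws_credential_source() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    # Static keys (optionally with AWS_SESSION_TOKEN) set in the environment.
    echo "environment variables"
  elif [ -n "${AWS_PROFILE:-}" ]; then
    # A named profile from the shared configuration files.
    echo "shared configuration files (profile: ${AWS_PROFILE})"
  elif [ -f "${HOME}/.aws/credentials" ] || [ -f "${HOME}/.aws/config" ]; then
    # No profile selected, but the shared files exist.
    echo "shared configuration files (default profile)"
  else
    # Nothing local was found; only a role attached to the compute resource remains.
    echo "IAM role of the compute resource, if any"
  fi
}
```

Calling `guess_aws_credential_source` before a fetch can help explain why CloudQuery is using a different identity than expected.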

Configuration

The following configuration section can be automatically generated by cloudquery init aws:

provider "aws" {
  configuration {
    // Optional. if you want to assume role to multiple account and fetch data from them
    //accounts "<YOUR ID>" {
    // Optional. Role ARN we want to assume when accessing this account
    // role_arn = <YOUR_ROLE_ARN>
    // }
    // Optional. by default assumes all regions
    // regions = ["us-east-1", "us-west-2"]
    // Optional. Enable AWS SDK debug logging.
    aws_debug = false
    // The maximum number of times that a request will be retried for failures. Defaults to 5 retry attempts.
    // max_retries = 5
    // The maximum back off delay between attempts. The backoff delays exponentially with a jitter based on the number of attempts. Defaults to 60 seconds.
    // max_backoff = 30
  }
  resources = ["*"]
}

By default, CloudQuery will fetch all configuration from all resources in all regions in the default account. You can change this behavior with the following arguments:

Arguments

  • accounts (Optional) - Specify multiple accounts to fetch data from concurrently and then query across accounts. The default configured account must be able to AssumeRole into the specified accounts.
  • regions (Optional) - Limit fetching to specific regions.
  • max_retries (Optional) - The maximum number of times that a request will be retried for failures. Defaults to 5 retry attempts.
  • max_backoff (Optional) - The maximum back off delay between attempts. The backoff delays exponentially with a jitter based on the number of attempts. Defaults to 60 seconds.
  • aws_debug (Optional) - This will print very verbose/debug output from AWS SDK. Defaults to false.
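
To make the interplay of max_retries and max_backoff concrete, the sketch below prints an upper bound on the wait before each retry. It assumes the bound starts at 1 second and doubles per attempt until it reaches max_backoff; that starting value and doubling factor are illustrative assumptions, not taken from the provider or SDK source, and the function name `backoff_ceilings` is hypothetical. With jitter, the delay actually used would be some random value at or below each bound.

```shell
# Illustrative only: print an upper bound on the wait before each retry,
# assuming the bound doubles per attempt (1s, 2s, 4s, ...) and is capped at max_backoff.
backoff_ceilings() {
  max_retries=$1   # number of retries, e.g. 5 (the documented default)
  max_backoff=$2   # cap in seconds, e.g. 60 (the documented default)
  attempt=1
  ceiling=1
  while [ "$attempt" -le "$max_retries" ]; do
    # Never let the bound exceed the configured maximum.
    if [ "$ceiling" -gt "$max_backoff" ]; then
      ceiling=$max_backoff
    fi
    echo "retry ${attempt}: wait up to ${ceiling}s"
    ceiling=$((ceiling * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, `backoff_ceilings 7 30` shows the bound growing until it is held at 30 seconds for the last retries, which is the effect of lowering max_backoff.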

Assume Role

CloudQuery can fetch from multiple accounts in parallel by using AssumeRole. You will need to use credentials that can AssumeRole into all of the other specified accounts. The following is an example configuration:

provider "aws" {
  configuration {
    // Optional. if you want to assume role to multiple accounts and fetch data from them
    accounts "<AccountID_1>" {
      // Optional. Role ARN we want to assume when accessing this account
      role_arn = <YOUR_ROLE_ARN_1>
    }
    accounts "<AccountID_2>" {
      // Optional. Role ARN we want to assume when accessing this account
      role_arn = <YOUR_ROLE_ARN_2>
    }
  }
  resources = ["*"]
}
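
When many accounts share the same role name, the accounts blocks can be generated rather than typed by hand. The sketch below is one way to do that, under stated assumptions: the helper name `print_account_blocks` and the role name `cloudquery-ro` are hypothetical, the ARN follows the standard IAM form `arn:aws:iam::<account-id>:role/<role-name>`, and writing role_arn as a quoted string is an assumption about the config syntax.

```shell
# Hypothetical helper: print one accounts block per account ID,
# each assuming a role with the same name in that account.
# Usage: print_account_blocks <role-name> <account-id>...
print_account_blocks() {
  role_name=$1
  shift
  for account_id in "$@"; do
    printf 'accounts "%s" {\n' "$account_id"
    # Standard IAM role ARN built from the account ID and role name.
    printf '  role_arn = "arn:aws:iam::%s:role/%s"\n' "$account_id" "$role_name"
    printf '}\n'
  done
}
```

Running `print_account_blocks cloudquery-ro 111111111111 222222222222` prints two blocks that can be pasted inside the configuration block shown above.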

Assume Role with MFA

In order to assume role with MFA, you need to request temporary credentials using STS "get-session-token".

aws sts get-session-token --serial-number <YOUR_MFA_SERIAL_NUMBER> --token-code <YOUR_MFA_TOKEN_CODE> --duration-seconds 3600

Then export the temporary credentials as environment variables:

export AWS_ACCESS_KEY_ID=<YOUR_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_ACCESS_KEY>
export AWS_SESSION_TOKEN=<YOUR_SESSION_TOKEN>
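
The two steps above can be combined so the credentials do not have to be copied by hand. The following is a minimal sketch, not an official workflow: the helper name `export_session_credentials` is hypothetical, and it expects the three values on a single whitespace-separated line, which is what the AWS CLI's global `--query` and `--output text` options produce for the query shown in the comment.

```shell
# Hypothetical helper: export temporary credentials from one line of
# "<access-key-id> <secret-access-key> <session-token>" separated by whitespace.
export_session_credentials() {
  # Split the single argument into fields; anything beyond three lands in "extra".
  read -r key secret token extra <<EOF
$1
EOF
  if [ -z "$token" ] || [ -n "$extra" ]; then
    echo "expected exactly 3 fields" >&2
    return 1
  fi
  export AWS_ACCESS_KEY_ID="$key" AWS_SECRET_ACCESS_KEY="$secret" AWS_SESSION_TOKEN="$token"
}

# Example (not run here): request the credentials as a single text line, then export them.
# creds=$(aws sts get-session-token \
#   --serial-number <YOUR_MFA_SERIAL_NUMBER> --token-code <YOUR_MFA_TOKEN_CODE> \
#   --duration-seconds 3600 \
#   --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' --output text)
# export_session_credentials "$creds"
```

Because the function uses export, it must be defined and called in the same shell session that later runs CloudQuery.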

Query Examples

Find all public facing load balancers

SELECT * FROM aws_elbv2_load_balancers WHERE scheme = 'internet-facing';

Find all unencrypted RDS clusters

SELECT * FROM aws_rds_clusters WHERE storage_encrypted IS FALSE;

Find all unencrypted buckets

SELECT * FROM aws_rds_clusters WHERE storage_encrypted IS FALSE;

Building the Provider:

make build

Running Provider locally:

  1. Clone repository to local machine

  2. [Optional] Start a local database:

    make pg-start
  3. [Optional] Configure the config.hcl

    make os=Linux arch=arm64 install
    ./cloudquery init aws
  4. Start the provider in a different tab/session

    make run
  5. Run a CloudQuery fetch using the locally built provider

    make fetch

More information can be found in the CloudQuery documentation