AWS Provider

The AWS Provider extends CloudQuery with the ability to fetch information about AWS cloud resources and store it in a PostgreSQL database.

$ cloudquery init aws
Category: Public Cloud
Version: v0.11.5
License: MPL-2.0
Published at: Wed May 11 2022

The CloudQuery AWS provider extracts your AWS cloud asset configuration, transforms it, and loads it into PostgreSQL.

This provider also supports additional capabilities:

Install

cloudquery init aws

Authentication

CloudQuery needs to be authenticated with your AWS account in order to fetch information about your cloud setup. CloudQuery requires only read permissions (we will never make any changes to your cloud setup), so, following the principle of least privilege, it's recommended to grant it read-only permissions.

There are multiple ways to authenticate with AWS, and CloudQuery respects the AWS credential provider chain. This means that CloudQuery tries the following sources, in order, when attempting to authenticate:

  • The AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN environment variables.
  • The credentials and config files in ~/.aws (the credentials file takes priority).
  • IAM roles for AWS compute resources (including EC2 instances, Fargate and ECS containers).

You can read more about AWS authentication in the AWS documentation.
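As a troubleshooting aid, the order described above can be sketched as a small shell function that reports which source would be tried first on the current machine. This is only an illustration: `detect_aws_credential_source` is a hypothetical helper, not part of CloudQuery or the AWS CLI, and it ignores named profiles and the actual contents of the files.

```shell
# Sketch only: report the first credential source that appears to be available,
# following the order listed above. Not CloudQuery code.
detect_aws_credential_source() {
  local aws_dir="${HOME}/.aws"
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "environment"
  elif [ -f "${aws_dir}/credentials" ]; then
    echo "credentials-file"
  elif [ -f "${aws_dir}/config" ]; then
    echo "config-file"
  else
    # Nothing found locally; an IAM role attached to the compute resource
    # would be the remaining option.
    echo "iam-role"
  fi
}

detect_aws_credential_source
```

If the output is not what you expect, check for leftover environment variables first, since they take priority over the files in ~/.aws.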

Environment Variables

CloudQuery can use the credentials from the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables (AWS_SESSION_TOKEN can be optional for some accounts). For information on obtaining credentials, see the AWS guide.

To export the environment variables (On Linux/Mac - similar for Windows):

export AWS_ACCESS_KEY_ID={Your AWS Access Key ID}
export AWS_SECRET_ACCESS_KEY={Your AWS secret access key}
export AWS_SESSION_TOKEN={Your AWS session token}

Shared Configuration files

CloudQuery can use credentials from your credentials and config files in the .aws directory in your home folder. The contents of these files are practically interchangeable, but CloudQuery will prioritize credentials in the credentials file.

For information about obtaining credentials, see the AWS guide.

Here are example contents for a credentials file:

[default]
aws_access_key_id = <YOUR_ACCESS_KEY_ID>
aws_secret_access_key = <YOUR_SECRET_ACCESS_KEY>

You can also specify credentials for a different profile, and instruct CloudQuery to use the credentials from this profile instead of the default one.

For example:

[myprofile]
aws_access_key_id = <YOUR_ACCESS_KEY_ID>
aws_secret_access_key = <YOUR_SECRET_ACCESS_KEY>

Then, you can either export the AWS_PROFILE environment variable (On Linux/Mac, similar for Windows):

export AWS_PROFILE=myprofile

or, configure your desired profile in the local_profile field of your CloudQuery config.hcl:

provider "aws" {
  configuration {
    accounts "<account_alias>" {
      local_profile = "myprofile"
    }
    ...
  }
  ...
}
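Before pointing CloudQuery at a named profile, it can help to confirm that the profile is actually defined. The following is a minimal sketch; `profile_defined` is a hypothetical helper, not part of CloudQuery or the AWS CLI. It relies on the fact that the credentials file uses `[name]` section headers while the config file uses `[profile name]` (except for `[default]`).

```shell
# Sketch only: succeed if the named profile has a section header in either
# the shared credentials file or the shared config file.
profile_defined() {
  local profile="$1"
  local credentials_file="${2:-$HOME/.aws/credentials}"
  local config_file="${3:-$HOME/.aws/config}"
  # The config file prefixes named profiles with "profile ", except for default.
  local config_header="[profile ${profile}]"
  if [ "$profile" = "default" ]; then
    config_header="[default]"
  fi
  grep -qsxF -- "[${profile}]" "$credentials_file" ||
    grep -qsxF -- "$config_header" "$config_file"
}

# Only export AWS_PROFILE when the profile exists.
if profile_defined myprofile; then
  export AWS_PROFILE=myprofile
fi
```

The check matches whole lines only, so a header with trailing whitespace or a comment on the same line would not be detected.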

IAM Roles for AWS Compute Resources

CloudQuery can use IAM roles for AWS compute resources (including EC2 instances, Fargate and ECS containers). If you configured your AWS compute resources with IAM roles, CloudQuery will use these roles automatically, so you don't need to specify additional credentials manually. For more information on configuring IAM roles, see the AWS documentation.

Configuration

The following configuration section can be automatically generated by cloudquery init aws:

provider "aws" {
  configuration {
    // Optional. By default all regions are fetched. You can also state all regions explicitly by including the `*` character as the only argument in the array.
    // regions = ["us-east-1", "us-west-2"]

    // Optional. Use if you want to assume a role in multiple accounts and fetch data from them.
    // accounts "<YOUR ID>" {
    //   // Optional. Role ARN we want to assume when accessing this account
    //   role_arn = "<YOUR_ROLE_ARN>"
    //   // Optional. Override provider configs for a specific account
    //   regions = ["us-east-1", "us-east-2"]
    // }

    // Optional. Enable AWS SDK debug logging.
    // aws_debug = false
    // The maximum number of times that a request will be retried for failures. Defaults to 10 retry attempts.
    // max_retries = 10
    // The maximum back off delay between attempts. The backoff delays exponentially with a jitter based on the number of attempts. Defaults to 30 seconds.
    // max_backoff = 30
  }
  resources = ["*"]
}

By default, CloudQuery will fetch all configuration from all supported resources in all commercial regions in the default account. You can change this behavior with the following arguments:

Arguments for AWS Provider block

  • accounts (Optional, Repeated) - Specify multiple accounts to fetch data from them concurrently and then query across accounts. The default configured account should be able to AssumeRole into the specified accounts. You can have multiple accounts blocks.
  • regions (Optional) - Limit fetching to specific regions. You can specify all regions by using the * character as the only argument in the array.
  • max_retries (Optional) - The maximum number of times that a request will be retried for failures. Defaults to 10 retry attempts.
  • max_backoff (Optional) - The maximum back off delay between attempts. The backoff delays exponentially with a jitter based on the number of attempts. Defaults to 30 seconds.
  • aws_debug (Optional) - This will print very verbose/debug output from AWS SDK. Defaults to false.
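To give a feel for how max_retries and max_backoff interact, here is a rough sketch of exponential backoff with jitter. It assumes "full jitter" (a random delay between 0 and an exponentially growing ceiling, capped at max_backoff) and whole seconds; the AWS SDK's actual formula may differ, and `backoff_delay` is a hypothetical helper, not part of CloudQuery.

```shell
# Sketch only: pick a delay in whole seconds for a given retry attempt.
backoff_delay() {
  local attempt="$1"      # 1-based retry attempt
  local max_backoff="$2"  # cap in seconds, e.g. 30
  local ceiling="$max_backoff"
  # Ceiling doubles with each attempt until it reaches the cap.
  if [ "$attempt" -lt 30 ] && [ $(( 1 << attempt )) -lt "$max_backoff" ]; then
    ceiling=$(( 1 << attempt ))
  fi
  # Jitter: a random value between 0 and the ceiling (RANDOM is a bash built-in).
  echo $(( RANDOM % (ceiling + 1) ))
}

# Show the delay that would be used for the first few attempts with max_backoff = 30.
for attempt in 1 2 3 4 5 6; do
  echo "attempt ${attempt}: would wait $(backoff_delay "$attempt" 30)s"
done
```

With these assumptions the ceiling reaches the 30 second cap from the fifth attempt onward, so raising max_retries beyond that adds attempts but not longer waits.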

Multi Account Configuration

AWS Organizations:

CloudQuery supports discovery of AWS accounts via AWS Organizations. This means that as accounts are added to or removed from your organization, CloudQuery picks up the change without any configuration changes.

Prerequisites for using AWS Org functionality:

  1. Have a role (or user) in an Admin account with the following access:
    • organizations:ListAccounts
    • organizations:ListAccountsForParent
    • organizations:ListChildren
  2. Have a role in each child account that has a trust policy with a single principal. The default role name is OrganizationAccountAccessRole. More information, including how to create the role if it doesn't already exist in your account, can be found in the AWS Organizations documentation.

Using AWS Organization:

  1. Specify the member role name:

    org {
      member_role_name = "OrganizationAccountAccessRole"
    }

  2. Get credentials that have the necessary organizations permissions, in one of the following ways:

    1. Source credentials from the default credential provider chain:

        org {
          member_role_name = "OrganizationAccountAccessRole"
        }

    2. Source credentials from a named profile in the shared configuration or credentials file:

        org {
          member_role_name = "OrganizationAccountAccessRole"
          admin_account "admin" {
            local_profile = "<Named-Profile>"
          }
        }

    3. Assume a role in the admin account using credentials in the shared configuration or credentials file:

        org {
          member_role_name = "OrganizationAccountAccessRole"
          admin_account "admin" {
            local_profile = "<Named-Profile>"
            role_arn = "arn:aws:iam::<ACCOUNT_ID>:role/<ROLE_NAME>"
            // Optional. Specify the name of the session
            // role_session_name = ""
            // Optional. Specify the ExternalID if required for trust policy
            // external_id = ""
          }
        }

  3. Optional. If the trust policy configured for the member accounts requires different credentials than you configured in the previous step, you can specify the credentials to use in the member_trusted_principal block:

    org {
      member_role_name = "OrganizationAccountAccessRole"
      admin_account "admin" {
        local_profile = "<Named-Profile>"
      }
      member_trusted_principal "trusted" {
      }
      organization_units = ["ou-<ID-1>","ou-<ID-2>"]
    }

  4. Optional. To fetch from specific Organizational Units only, add them to the organization_units list:

    org {
      member_role_name = "OrganizationAccountAccessRole"
      admin_account "admin" {
        local_profile = "<Named-Profile>"
      }
      organization_units = ["ou-<ID-1>","ou-<ID-2>"]
    }

Note: If you specify an OU, CloudQuery will not traverse child OUs.

Arguments for Org block:

  • organization_units (Optional) - List of Organizational Units that CloudQuery should use to source accounts from
  • admin_account (Optional) - Configuration on how to grab credentials from an Admin account
  • member_trusted_principal (Optional) - Configuration on how to specify the principal to use in order to assume a role in the member accounts
  • member_role_name (Required) - Role name that CloudQuery should use to assume a role in the member account from the admin account. Note: This is not a full ARN, it is just the name
  • member_role_session_name (Optional) - Override the default Session name.
  • member_external_id (Optional) - Specify an ExternalID for use in the trust policy
  • member_regions (Optional) - Limit fetching resources within this specific account to only these regions. This will override any regions specified in the provider block. You can specify all regions by using the * character as the only argument in the array

Multi Account: Specific Accounts

CloudQuery can fetch from multiple accounts in parallel by using AssumeRole. You will need to use credentials that can AssumeRole into all other specified accounts. The following is an example configuration:

provider "aws" {
  configuration {
    // Optional. Use if you want to assume a role in multiple accounts and fetch data from them
    accounts "<AccountID_Alias_1>" {
      // Optional. Role ARN we want to assume when accessing this account
      role_arn = "<YOUR_ROLE_ARN_1>"
      // Optional. Local Profile is the named profile in your shared configuration file (usually `~/.aws/config`) that you want to use for this specific account
      local_profile = "<NAMED_PROFILE>"
      // Optional. Specify the Role Session name
      role_session_name = ""
    }
    accounts "<AccountID_Alias_2>" {
      // Optional. Role ARN we want to assume when accessing this account
      role_arn = "<YOUR_ROLE_ARN_2>"
    }
  }
  resources = ["*"]
}

Arguments for Accounts block:

  • role_arn (Optional) - The role that CloudQuery will use to perform the fetch
  • local_profile (Optional) - Local Profile is the named profile in your shared configuration file (usually ~/.aws/config) that you want to use for the account
  • external_id (Optional) - The unique identifier used by non-AWS entities to assume a role in an AWS account
  • role_session_name (Optional) - Override the default Session name.
  • regions (Optional) - Limit fetching resources within this specific account to only these regions. This will override any regions specified in the provider block. You can specify all regions by using the * character as the only argument in the array

Assume Role with MFA

In order to assume role with MFA, you need to request temporary credentials using STS "get-session-token".

aws sts get-session-token --serial-number <YOUR_MFA_SERIAL_NUMBER> --token-code <YOUR_MFA_TOKEN_CODE> --duration-seconds 3600

Then export the temporary credentials as environment variables:

export AWS_ACCESS_KEY_ID=<YOUR_ACCESS_KEY_ID>
export AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_ACCESS_KEY>
export AWS_SESSION_TOKEN=<YOUR_SESSION_TOKEN>
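Copying the three values by hand is error-prone, so the two steps above can be combined. The sketch below assumes you ask the AWS CLI for text output restricted to the three credential fields (using its `--query` and `--output text` options), which yields a single whitespace-separated line. `load_sts_credentials` is a hypothetical helper name, not part of CloudQuery or the AWS CLI.

```shell
# Sketch only: split one line of "access-key-id secret-access-key session-token"
# into the three environment variables and export them.
load_sts_credentials() {
  read -r AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN <<EOF
$1
EOF
  export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
}

# Example call (requires the AWS CLI and a valid MFA code):
# load_sts_credentials "$(aws sts get-session-token \
#   --serial-number <YOUR_MFA_SERIAL_NUMBER> \
#   --token-code <YOUR_MFA_TOKEN_CODE> \
#   --duration-seconds 3600 \
#   --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
#   --output text)"
```

The credentials only live in the current shell session and expire after the requested duration, so the call has to be repeated once they lapse.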

Query Examples

Find all public-facing load balancers

SELECT * FROM aws_elbv2_load_balancers WHERE scheme = 'internet-facing';

Find all unencrypted RDS clusters

SELECT * FROM aws_rds_clusters WHERE storage_encrypted IS FALSE;

Find all S3 buckets that can be made public

SELECT arn, region
FROM aws_s3_buckets
WHERE block_public_acls IS NOT TRUE
   OR block_public_policy IS NOT TRUE
   OR ignore_public_acls IS NOT TRUE
   OR restrict_public_buckets IS NOT TRUE;

Building the Provider:

make build

Running Provider locally:

  1. Clone the repository to your local machine

  2. [Optional] Start a local database:

    make pg-start
  3. [Optional] Configure the config.hcl

    make os=Linux arch=arm64 install
    ./cloudquery init aws
  4. Start the provider in a different tab/session

    make run
  5. Run a CloudQuery fetch using the locally built provider

    make fetch

More information can be found in the CloudQuery documentation.