AWS Components

Granular Explanations of Virtual Appliance Components

The GRAX Virtual Appliance is a fully automated way to run the GRAX data platform inside an AWS account that you control. This allows you to control VPC ingress/egress and Elastic and S3 API access to guarantee the security and chain of custody of your sensitive data.

Customers have some flexibility about which parts of this service they can customise during the deployment of the GRAX Virtual Appliance.

  • Domain name and certificate
  • ALB
  • VPC / security groups

During a deployment of the GRAX Virtual Appliance you also have the option to "tune" parameters to support larger data volumes or utilize reserved instances:

  • Instance size
  • Disk size
  • Bucket name

The GRAX Virtual Appliance will not support customisation of:

  • AMI
  • Instance user data / scripts
  • KMS
  • IAM policies

These components are configured to be secure out of the box, and GRAX manages automated updates to them for additional security, architecture and cost improvements.

To fit into your existing AWS practices, the GRAX Virtual Appliance allows you to "extend" AWS in conventional ways like:

  • Create IAM users for access to the S3 buckets
  • Add S3 bucket replication policies
  • Add CloudWatch log forwarding, and metrics monitoring and alerting
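As an illustration of the first item, a read-only policy for an additional IAM user might be built as below. This is a minimal sketch, not a policy that GRAX ships: the bucket name `example-grax-bucket` and the helper `read_only_bucket_policy` are hypothetical, and the two actions chosen are an assumption about what a read-only consumer would need. It describes a separate policy for your own IAM user and does not modify the GRAX-managed IAM policies.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document granting read-only access to one bucket.

    Hypothetical helper for illustration; bucket_name is supplied by you.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "example-grax-bucket" is a placeholder, not a real GRAX bucket name.
    print(json.dumps(read_only_bucket_policy("example-grax-bucket"), indent=2))
```

The resulting JSON document can be attached to an IAM user through whichever tooling you already use for IAM.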

GRAX - Detailed Network Diagram

Shown here is the network diagram of the standard GRAX components that make up a GRAX Runtime.
All network communications are completed via HTTPS with a public certificate on external traffic and a self-signed certificate for communication behind the ALB.

[Figure: GRAX Detailed Network Diagram, AWS virtual appliance network]

The GRAX Virtual Appliance does not allow public access to any storage or logical resources by default. The only publicly accessible component in a default install is the application's load balancer, which serves requests over HTTPS. The default web application firewall denies all traffic from outside Salesforce that attempts to access data, while allowing some public traffic to obfuscated, non-data endpoints that are needed to serve requests for embedded GRAX functionality.

At the foundation of a new GRAX deployment is the network configuration; this is how GRAX ensures that data, and access to that data, is restricted to only the entities that require it. The GRAX infrastructure is designed to be operated inside a VPC, and this is the base primitive on which the rest of the network services are configured. Inside this isolated network space, we deploy both public and private subnets across two different availability zones within the AWS region chosen at runtime deployment.

GRAX supports deployment of the service into most AWS Regions. Contact your GRAX representative if you have questions about a particular deployment region.

GRAX - AWS Services Deployed

As you can see from the diagram above, GRAX can be described as an API that fetches data into storage. As you would expect, the main components for this are Compute, Data Storage and Search.
The following is the list of AWS resources that a GRAX Virtual Appliance requires when deployed in the default state (shown above). Customers can choose to configure some of the resources below during provisioning. Please contact your GRAX Support Representative for further information.

S3

A single S3 bucket is provisioned by GRAX templates. It is required by the GRAX Virtual Appliance that this templated bucket is used as the object storage. No bring-your-own-bucket options are currently available for this product.

This object store is the place where your customer data is held at rest. The S3 resource is configured with delete protection and encryption at rest. Data within the bucket does not have an expiration policy set. This is a resource that will grow over time as you continue GRAX usage, so the size and cost of this resource will depend on the size of your Salesforce data set as well as the frequency of operations configured in the GRAX Application.

By default, bucket encryption is managed using an Amazon S3-managed encryption key. If you wish to use KMS keys for this, please contact your GRAX representative.
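For orientation, the difference between the two encryption modes can be sketched as the default-encryption rule that S3 stores for a bucket. This is an illustrative sketch only: the helper `bucket_encryption_config` and the key identifier in the usage line are hypothetical, and the bucket itself is provisioned by the GRAX templates, so any switch to KMS should still go through your GRAX representative.

```python
from typing import Optional


def bucket_encryption_config(kms_key_id: Optional[str] = None) -> dict:
    """Return an S3 default-encryption configuration.

    With no key id this describes S3-managed keys (SSE-S3);
    with a key id it describes KMS-based encryption (SSE-KMS).
    """
    if kms_key_id is None:
        # S3-managed key: the default described in this section.
        default = {"SSEAlgorithm": "AES256"}
    else:
        # Customer-selected KMS key; kms_key_id is a placeholder value.
        default = {"SSEAlgorithm": "aws:kms", "KMSMasterKeyID": kms_key_id}
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": default}]}


if __name__ == "__main__":
    print(bucket_encryption_config())
    print(bucket_encryption_config("example-key-id"))
```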

S3 is also used by the managed Aurora and Elasticsearch services to keep data snapshots on platform. These buckets are managed by Amazon and are mandatory resources in this configuration.

Compute

The compute runtime is an EC2 instance that is managed by an Auto Scaling group. This Auto Scaling group defaults to a maximum of 1 instance. At least one EC2 instance is required for GRAX to operate.

The EC2 instance is launched using the latest Amazon Linux 2 AMI. This AMI is mandatory and not configurable.

When configuring the GRAX Virtual Appliance you have options for specifying the size of the EC2 resource. GRAX recommends an m5.xlarge instance type for a standard production or full-copy sandbox app, an r5.xlarge for a small or partial sandbox app, or an m5.4xlarge when servicing a very large org. This is configured with the InstanceType parameter in the CloudFormation template, which defaults to m5.xlarge. Your GRAX Support Engineer will be able to offer advice that is relevant to your particular use case.

An EBS volume is attached as storage and is encrypted using Amazon EBS Encryption with default keys. In the current version of the GRAX Virtual Appliance there is not an option to include your own key for this service.

The attached EBS resource is treated as temporary storage and has the potential to hold customer data from the CRM for short periods of time. GRAX recommends a 500 GB gp2 EBS volume for standard production or full-copy sandbox backups, and a 150 GB gp2 volume for a small sandbox backup. This is managed with the VolumeSize parameter in the CloudFormation template, which defaults to 500. The volume type defaults to gp2 and is not configurable.
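The sizing recommendations above can be collected into CloudFormation parameter overrides. The following is a minimal sketch: InstanceType and VolumeSize are the parameter names given in this section, while the profile labels and the helper `stack_parameters` are hypothetical. No volume size is stated for the very large org case, so the sketch leaves VolumeSize unset there and the template default applies.

```python
from typing import Dict, List, Optional, Tuple

# Recommendations as stated in this section.
# The dictionary keys are hypothetical labels, not GRAX terminology.
RECOMMENDED: Dict[str, Tuple[str, Optional[int]]] = {
    "production": ("m5.xlarge", 500),
    "full_copy_sandbox": ("m5.xlarge", 500),
    "small_sandbox": ("r5.xlarge", 150),
    "very_large_org": ("m5.4xlarge", None),  # no volume size stated
}


def stack_parameters(profile: str) -> List[Dict[str, str]]:
    """Return CloudFormation-style parameter overrides for a profile."""
    instance_type, volume_size = RECOMMENDED[profile]
    params = [{"ParameterKey": "InstanceType", "ParameterValue": instance_type}]
    if volume_size is not None:
        # CloudFormation parameter values are passed as strings.
        params.append(
            {"ParameterKey": "VolumeSize", "ParameterValue": str(volume_size)}
        )
    return params


if __name__ == "__main__":
    for name in RECOMMENDED:
        print(name, stack_parameters(name))
```

The list uses the ParameterKey/ParameterValue shape that CloudFormation accepts for stack parameters, so it can be handed to whichever deployment tooling you use.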

Postgres

GRAX creates an RDS database cluster using the Aurora PostgreSQL 10.11 engine. This database is used to store metadata about GRAX jobs, summary job results and other system related tasks.

No data from CRM backups is stored in this service.

Given the nature of the data stored in this database (simple configuration and job data for the operation of the service), there is no encryption enabled for this service. GRAX recommends a db.t3.medium RDS instance type for all types of deployments (production, full-copy, partial, etc.).

Elasticsearch

GRAX comes ready-built with a full text search service enabled. The deployment uses the AWS Elasticsearch service and this service will contain a copy of the latest version of each record for each Salesforce object that is backed up. If a backed up record is deleted by an archive process, the latest version is maintained in this search index but flagged as a deleted record.

The service is configured to require HTTPS and node to node encryption and uses encryption at rest. The encryption key is generated and managed as an AWS Managed key.

GRAX recommends the c5.large.elasticsearch instance type, a 750 GB gp2 EBS volume, 2 data nodes, and 3 dedicated master nodes in the cluster. Work with your GRAX Support Engineer if you would like advice tailored to your specific Salesforce deployments.

This service contains customer data and is mandatory for the deployment of the GRAX Virtual Appliance.

Route 53

An AWS Route 53 domain record is created and referenced by the GRAX Salesforce managed package to access the GRAX API or to load UI elements. This Route 53 record must be created manually before the ALB is created.

This component is used within a standard GRAX Virtual Appliance deployment but is optional if deploying using a custom networking setup that circumvents Route 53.

IAM

GRAX templates create the following IAM resources:

  • S3 User and Policy
  • Elasticsearch User, Policy and ServiceLinkedRole
  • Instance Profile and Role
  • LogsPolicy
  • Secrets Manager Policy

These IAM resources are the minimum set of resources and are mandatory for the operation of the GRAX Virtual Appliance.

Secrets Manager

Secrets are created for the API, Postgres, Elasticsearch, and S3 services and stored within the AWS Secrets Manager service.

The values that GRAX saves are automatically generated values of internal importance and should never require changing by the customer. GRAX does not currently support automatic rotation of any of these secrets. Changing any value in Secrets Manager is not supported when using the GRAX Virtual Appliance.

After install, only the application instance role requires access to Secrets Manager.

The use of AWS Secrets Manager is mandatory for the operation of the GRAX Virtual Appliance.

CloudWatch

GRAX creates a log group and a single log stream that can be monitored for application logging. Redeployment of the GRAX instance will generate a new log stream, ending the previous stream.

The use of AWS CloudWatch is mandatory for the operation of the GRAX Virtual Appliance.

WAF

GRAX configures an AWS WAF v2 by default. This whitelists the Salesforce IP addresses required for the Apex callouts to GRAX. It also whitelists the UI element requests (configuration pages, lightning web components, etc.) as these calls occur from the user's browser.
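The allow-list behaviour described above can be modelled in a few lines. This is a conceptual sketch only, not the WAF rule set that GRAX deploys: the CIDR blocks are placeholders taken from the reserved documentation ranges rather than real Salesforce addresses, and the `/ui/` path prefix and the function `is_allowed` are hypothetical.

```python
import ipaddress

# Placeholder CIDR blocks from the reserved documentation ranges.
# These are NOT real Salesforce IP ranges.
ALLOWED_CIDRS = ["192.0.2.0/24", "198.51.100.0/24"]

# Hypothetical prefix standing in for UI element endpoints, which are
# requested from the user's browser and so cannot be limited by source IP.
PUBLIC_PATH_PREFIXES = ("/ui/",)


def is_allowed(source_ip: str, path: str) -> bool:
    """Return True if a request would pass this simplified allow-list."""
    if path.startswith(PUBLIC_PATH_PREFIXES):
        return True
    addr = ipaddress.ip_address(source_ip)
    # All other paths are reachable only from allow-listed networks.
    return any(addr in ipaddress.ip_network(cidr) for cidr in ALLOWED_CIDRS)


if __name__ == "__main__":
    print(is_allowed("192.0.2.10", "/api/data"))
    print(is_allowed("203.0.113.5", "/api/data"))
    print(is_allowed("203.0.113.5", "/ui/config"))
```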

This component is included by default with a standard GRAX Virtual Appliance deployment but is optional if deploying using a custom networking setup.

Application Load Balancer

The GRAX Virtual Appliance has a single Application Load Balancer provisioned with a target group that resolves to the EC2 service. There is an HTTPS only listener attached; the load balancer will not acknowledge HTTP requests. The WAF is also associated with this ALB.

This component is included by default with a standard GRAX Virtual Appliance deployment but is optional if deploying using a custom setup.

Internet Gateway

An AWS Internet Gateway is provisioned to give the GRAX Application a pathway to call out to the Salesforce API. It also allows for the transmission of telemetry data to the GRAX Control Plane, such as job status.

Virtual Private Cloud

An AWS VPC is provisioned as the dedicated network for the GRAX resources to operate within. The default configuration is shown in the above diagram; this can be configured if deploying using a custom setup.

Public Subnets

Two public subnets are configured to sit across different Availability Zones within the single region.
Subnets are configurable if deploying using a custom setup.

Private Subnets

Four private subnets are configured in different Availability Zones within the single region.
Subnets are configurable if deploying using a custom setup.
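To make the subnet layout concrete, the sketch below carves a VPC address range into two public and four private subnets spread over two Availability Zones. The VPC CIDR, the /20 subnet size, the zone names, and the function `plan_subnets` are all assumptions for illustration; the address ranges used by the GRAX templates are not stated here.

```python
import ipaddress
from itertools import cycle
from typing import Dict, List


def plan_subnets(
    vpc_cidr: str,
    zones: List[str],
    public_count: int = 2,
    private_count: int = 4,
    new_prefix: int = 20,
) -> List[Dict[str, str]]:
    """Assign consecutive subnet blocks to zones, public subnets first."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # consecutive blocks, in order
    plan = []
    for kind, count in (("public", public_count), ("private", private_count)):
        zone_iter = cycle(zones)  # alternate zones within each kind
        for _ in range(count):
            plan.append(
                {"kind": kind, "zone": next(zone_iter), "cidr": str(next(blocks))}
            )
    return plan


if __name__ == "__main__":
    # Hypothetical VPC range and zone names.
    for subnet in plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"]):
        print(subnet)
```

Alternating zones within each subnet kind keeps one public subnet in each zone and spreads the private subnets evenly across both.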

NAT Gateway

A NAT Gateway is provided for each of the public subnets. Each has an Elastic IP associated with it.