Storage Format Considerations

Moving Ongoing Backups and Historical Data to Tables

As part of the ongoing effort to optimize, improve, and simplify the customer experience, GRAX has released a proprietary internal data storage layer. Data records will no longer be stored in individual, human-readable JSON objects or CSVs. The new release instead moves record storage to a compressed, deduplicated, and write-optimized storage layer built by GRAX engineers and based on leading research and engineering in the big data market. This storage layer retains all versions of every record you've ever backed up with GRAX, including their relationships and restore lineage.

Benefits

  • Single source of truth for all record histories
  • Dataset compression to use less storage space
  • Faster object backups
  • All-new and improved Search and Restore
  • All future features will utilize this storage layer as their backbone

Setting Expectations

Migration Expectations

Your legacy backed-up and archived data will need to be migrated to the new format.

  • The migration will not impact your current running jobs or the performance of GRAX
  • The migration may cause a short-term spike in your storage volume and cost due to the migration's read/write activity
  • Once the migration is complete, storage volume and cost will return to their regular monthly levels
    • Natural growth of storage will continue due to consistent backup/archive jobs

Expectations for using the new storage format

The new storage configuration is turned on in coordination with the GRAX Support team and takes effect immediately for all future GRAX jobs:

  • All new backups will be written to the new storage format
  • All new archives will be written to the new storage format
  • All data will be available by ID for Search and Restore once the migration of the legacy data is complete

FAQs

Why did GRAX make this change?

Long-term data storage and accumulation have costs, not just for the bucket owner but for the backup system in general. Storing separate objects and CSVs was inefficient, difficult to organize, and had a low performance ceiling. Because each piece of data was stored separately, datasets were also difficult to compress or optimize. This change keeps costs low and performance high, and it clears the way for new features.
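The compression point above can be illustrated with a small, generic sketch. It is not GRAX's storage code: the records, field names, and the use of zlib are assumptions chosen only to show why many small, separately stored objects compress worse than the same data stored together.

```python
import json
import zlib

# Hypothetical records; the field names are illustrative, not GRAX's schema.
records = [
    {"Id": f"001{i:015d}", "Name": f"Account {i}", "Industry": "Technology", "IsDeleted": False}
    for i in range(1000)
]


def size_stored_separately(rows):
    """Total bytes when every record is compressed on its own."""
    return sum(len(zlib.compress(json.dumps(r).encode("utf-8"))) for r in rows)


def size_stored_together(rows):
    """Total bytes when all records are compressed as a single block."""
    blob = "\n".join(json.dumps(r) for r in rows).encode("utf-8")
    return len(zlib.compress(blob))


separate = size_stored_separately(records)
together = size_stored_together(records)
print(f"separately: {separate} bytes, together: {together} bytes")
```

When records share field names and similar values, the compressor can only exploit that repetition if it sees the records in the same block, which is why the combined size comes out smaller here.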

Does this mean that we can access our data even if Salesforce is down?

Backed-up data is available via the GRAX webapp interface to all users as long as Salesforce OAuth is functional, or to manually added local admin accounts if Salesforce OAuth is unavailable.

What are the technical differences between the old and the new format?

One of the core technical challenges of GRAX is that the Salesforce Data APIs generally return partial and unordered object data. GRAX needs to store this data in a way that enables you to find complete object data at any point in time.
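To make the challenge concrete, here is a minimal sketch of rebuilding a complete record from partial, unordered fragments. The fragment shape (timestamp, record ID, partial fields) and the function name state_at are hypothetical and are not taken from GRAX or the Salesforce APIs.

```python
from typing import Any

# Each fragment is (timestamp, record_id, partial_fields); a hypothetical shape.
Fragment = tuple[int, str, dict[str, Any]]


def state_at(fragments: list[Fragment], record_id: str, as_of: int) -> dict[str, Any]:
    """Merge every fragment for record_id with timestamp <= as_of, oldest first."""
    relevant = sorted(
        (f for f in fragments if f[1] == record_id and f[0] <= as_of),
        key=lambda f: f[0],
    )
    state: dict[str, Any] = {}
    for _, _, fields in relevant:
        state.update(fields)  # later fragments overwrite earlier values
    return state


# Fragments arrive out of order and each carries only some of the fields.
fragments = [
    (30, "001A", {"Phone": "555-0100"}),
    (10, "001A", {"Name": "Acme"}),
    (20, "001B", {"Name": "Globex"}),
    (40, "001A", {"Name": "Acme Corp"}),
]
print(state_at(fragments, "001A", as_of=35))  # {'Name': 'Acme', 'Phone': '555-0100'}
```

Note that this naive version scans and sorts every fragment on each lookup, which is the kind of cost that motivates a more structured layout.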

The old storage format would save "chunks" of comma-separated values (CSVs) of partial and unordered data. Because these files are unstructured, it is inefficient to find specific object data purely from storage. GRAX therefore maintained an additional index of all object data in ElasticSearch for point-in-time recovery. This added operational complexity and cost, made features only as stable as the underlying infrastructure, and introduced a significant delay in the backup pipeline.

The new format uses "key/value databases" of complete and ordered data. Because these files are highly structured, it is now efficient to find specific object data by ID and timestamp directly from storage. This lets GRAX use the data in storage for point-in-time recovery and eliminates the need for ElasticSearch entirely.

This new format unlocks new product capabilities and greatly simplifies architecture.
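The idea of finding object data by ID and timestamp in ordered key/value data can be sketched as follows. This is an in-memory toy under stated assumptions, not GRAX's implementation: the class VersionStore, its methods, and the integer timestamps are all hypothetical. Keys are (record ID, timestamp) pairs kept in sorted order, so a point-in-time lookup is a binary search rather than a scan.

```python
import bisect


class VersionStore:
    """Toy ordered key/value store keyed by (record_id, timestamp)."""

    def __init__(self):
        self._keys = []    # sorted list of (record_id, timestamp)
        self._values = []  # complete record snapshots, parallel to _keys

    def put(self, record_id, timestamp, record):
        """Insert a complete snapshot, keeping keys in sorted order."""
        key = (record_id, timestamp)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = record  # same key: overwrite
        else:
            self._keys.insert(i, key)
            self._values.insert(i, record)

    def get_as_of(self, record_id, timestamp):
        """Return the latest snapshot at or before timestamp, or None."""
        i = bisect.bisect_right(self._keys, (record_id, timestamp))
        if i == 0:
            return None
        found_id, _ = self._keys[i - 1]
        return self._values[i - 1] if found_id == record_id else None


store = VersionStore()
store.put("001A", 10, {"Name": "Acme"})
store.put("001A", 40, {"Name": "Acme Corp"})
store.put("001B", 20, {"Name": "Globex"})
print(store.get_as_of("001A", 35))  # {'Name': 'Acme'}
```

Because every stored value is a complete snapshot and the keys are ordered, no secondary search index is needed to answer "what did this record look like at time T?" in this sketch.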

Does it matter what storage I am currently using?

No. Whether you're using AWS S3 or Azure Blob Storage, we can help you migrate to the new storage format and take advantage of everything it has to offer.

How long does it take to migrate?

This depends on how much data you have backed up and archived over your tenure with GRAX. The GRAX Support team can provide a rough estimate once the migration has been started and has run for 24 hours. Keep in mind that the new format can be turned on immediately, so you can take advantage of the optimizations right away on newly backed-up data.

When do I need to migrate by?

We are working with all our customers to ensure that they are enabled on and taking advantage of the new storage format. Migrations are handled through a queue that we are working through, and we will let you know once your migration has started.

How do I get on the new storage format?

If you are not already working with your Customer Success Manager, please reach out to [email protected] to get started.

Have more questions or need additional help?

We're always here to help. Email [email protected] to get in touch with a technical expert.