Files

Let's take a look at the various ways you can interact with Salesforce attachments, files and other similar objects across the various GRAX features.

Introduction

Throughout this guide, the term 'Files' is used as a catch-all for all file-related Salesforce objects that GRAX supports. Salesforce itself uses a variety of different, and sometimes confusing, names such as Chatter Files, Content, and Attachment. For now, we'll stick with 'Files' to represent the overarching category, and any specific objects within this category will be called out.

📘

The complexity with Files is that not only does GRAX need to back up the data (meta information such as Name, CreatedDate, SystemModstamp, custom fields etc.) but the actual binary must be downloaded and uploaded to your storage provider.

GRAX currently supports 3 main "File" objects:

  • Attachment
  • Content - this is not an object name, but instead represents 3 specific objects that are typically dealt with together:
    • ContentDocument
    • ContentVersion
    • ContentDocumentLink
  • EventLogFile

Files: Backup

🚧

Version Notice

The information here applies to GRAX versions 3.50 and above. If you are still using a version below 3.50, please reach out to GRAX Support to upgrade or to understand how behavior may differ in those earlier versions.

Backing up Files with GRAX requires you to create a separate job that only includes one or more File-related objects. Use the Backup Type dropdown selector for this.

We select Files as the option in the Backup Type dropdown to display the list of possible objects.

📘

Where are the rest of my "Content" objects?

  • For Content objects, please use the provided ContentDocument object, which will allow you to filter on ContentDocument and will automatically back up the latest ContentVersion (represents the binary) as well as all related ContentDocumentLinks.

  • ContentNote is automatically included as part of your ContentDocument query. These are what Salesforce refers to as 'enhanced notes'. You may or may not have them enabled in your environment.

Best Practices

It is critical to read through the recommendations here before kicking off any File backups, as there are various Salesforce gotchas and considerations you will want to confirm.

❗️

File Backup Considerations and Recommendations

  • GRAX will use 1 REST API call per each File in order to download the binary.

  • File backups will be slower than data object backups, due to the fact that in addition to the "data" portion of the record, GRAX must individually download and upload each binary to storage.

  • Given the above considerations on API usage and speed, GRAX recommends limiting the number of File records you back up per job. Chunk your backup jobs into ranges on an indexed audit date field such as SystemModstamp. As a rule of thumb, start by running 1 job at a time that includes at most 500,000 records across the selected File objects. The right number may differ for each customer depending on API call availability, but 500,000 records is the recommended maximum per job.

  • Ensure all relevant GRAX users have the Query All Files Salesforce permission. Without it, only files related to the running user will be backed up, which could result in data integrity issues. Click here for more on GRAX permissions.
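The chunking recommendation above can be sketched in a few lines of Python. This is only an illustration, not a GRAX feature: the function name `plan_backup_jobs` and the `daily_counts` input (per-day record counts you would gather yourself, for example from a count query grouped by SystemModstamp day) are assumptions. The `api_calls` figure is a rough upper bound for binary downloads based on the "1 REST API call per File" rule; actual usage can be lower when GRAX already has the binary.

```python
from datetime import date

MAX_RECORDS_PER_JOB = 500_000  # recommended ceiling from the guidance above


def plan_backup_jobs(daily_counts, max_records=MAX_RECORDS_PER_JOB):
    """Group consecutive days into date ranges that stay under max_records.

    daily_counts: list of (day, record_count) tuples sorted by day
    (hypothetical input; you would collect these counts yourself).
    A single day that exceeds max_records still becomes its own job.
    """
    jobs = []
    current = None
    for day, count in daily_counts:
        # Close the current range if adding this day would exceed the limit.
        if current is not None and current["records"] + count > max_records:
            jobs.append(current)
            current = None
        if current is None:
            current = {"start": day, "end": day, "records": 0}
        current["end"] = day
        current["records"] += count
    if current is not None:
        jobs.append(current)
    for job in jobs:
        # Upper-bound estimate: one REST call per file binary downloaded.
        job["api_calls"] = job["records"]
    return jobs


example_counts = [
    (date(2023, 1, 1), 200_000),
    (date(2023, 1, 2), 250_000),
    (date(2023, 1, 3), 100_000),
    (date(2023, 1, 4), 450_000),
]
example_jobs = plan_backup_jobs(example_counts)
```

Each resulting range can then be used as the SystemModstamp start and end filter of a separate Files backup job, run one at a time.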

Summary Page

When clicking on the Summary Link for a Files type backup, you will notice some additional columns. The first three columns represent the same information as always. This will show how many records were provided by Salesforce versus how many were inserted/updated into GRAX. With Files, however, you need to remember that there is also an actual file binary that must be downloaded from Salesforce and uploaded to your specified storage provider. The last two columns represent this information. Additionally, you will see information about how many REST API calls were used.

Binaries Processed shows how many file binaries were successfully downloaded and subsequently uploaded to storage. Note that this number may not always exactly match the number that were queried from Salesforce as GRAX will only process binaries if it does not already have the latest version.

Unable to Process represents files that GRAX was not able to successfully process, likely due to either the file having been deleted out of Salesforce before GRAX could download, or some other issue such as a corrupted file or another error in the download/upload process that could not be resolved after GRAX retries.

🚧

When you select ContentDocument in a backup, as mentioned above, GRAX will automatically query all related ContentDocumentLinks as well as the latest ContentVersion. On the Summary page, you will only see a number for Binaries Processed in the row for ContentDocument.

In this example, you can also see the approximate Salesforce API usage per object. For the Attachment backup, note that it is roughly equal to how many Attachments were inserted in GRAX. Since most of the others already existed in GRAX, they were updates and GRAX did not have to expend an API call to re-download the binary.

Web Links in Content Libraries

When you upload files, ContentVersion is the object that contains the actual binary that GRAX will download and upload to your storage bucket. Salesforce determines file types: CSV, EXCEL, LINK, PDF, PNG, TEXT, WORD, ZIP, UNKNOWN. The filetype can be a LINK when you contribute a web link via the Libraries feature. In this case, there is nothing to actually download as it is a 0 byte file and Salesforce does not offer anything to download. It is simply a URL as the file's title. The implications of this:

  • You may see errors in the backup summary CSV with the error message Skipping file with 0 bytes. These web links and any other 0 byte files will indeed be skipped. The field information will be captured, but there is no binary to download so that piece is skipped.
  • GRAX does not currently support restoring these 0 byte files.
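To get a feel for which records fall into this category before running a backup, the check can be approximated as follows. This is a sketch under assumptions, not GRAX's implementation: `split_downloadable` is a hypothetical helper, and each record is assumed to be a dict carrying the standard ContentVersion fields FileType and ContentSize (size in bytes).

```python
def split_downloadable(content_versions):
    """Separate ContentVersion records that have a binary from those that do not.

    Records whose FileType is LINK, or whose ContentSize is 0 or missing,
    have nothing to download and are placed in the skipped list.
    """
    downloadable, skipped = [], []
    for record in content_versions:
        if record.get("FileType") == "LINK" or not record.get("ContentSize"):
            skipped.append(record)
        else:
            downloadable.append(record)
    return downloadable, skipped


sample_versions = [
    {"Title": "Price list", "FileType": "PDF", "ContentSize": 48213},
    {"Title": "https://example.com", "FileType": "LINK", "ContentSize": 0},
    {"Title": "empty.txt", "FileType": "TEXT", "ContentSize": 0},
]
to_download, to_skip = split_downloadable(sample_versions)
```

In this sample, only the PDF has a binary; the web link and the empty text file would be reported as skipped while their field data is still captured.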

Files: Archive

Much of the same information above applies whenever you include any File objects within a GRAX Archive. After all, GRAX does still need to back up all the objects first, and then can start deleting the data.

However, with Archives, given that there is a single root/parent object that must be selected, as well as children within the hierarchy where File objects could appear many times, let's take a look at some of the different scenarios.

Select File Objects within the Hierarchy

If you select a standard "non-file" root/parent object such as Account, there are now 2 key objects that show up in the hierarchy: Attachment and ContentDocumentLink. You won't see ContentDocument or ContentVersion in the hierarchy, because ContentDocumentLink in the hierarchy view serves as a proxy for the relevant 'content' objects.

🚧

Note on ContentDocumentLinks

Once GRAX knows all the ContentDocuments that are related to the parent object, it will additionally back up all related ContentDocumentLinks. Even though some of these ContentDocumentLinks may not be related to the selected parent object, GRAX backs them up because Salesforce automatically deletes all related ContentDocumentLinks when the ContentDocument is deleted.

Note: this may not be true in all GRAX versions, so if this is an important use case for you please confirm with GRAX Support.

Select ContentDocument as Root

Another option you have is to select ContentDocument as the root element if you are only interested in archiving Content and children, rather than an object such as Account along with children (that includes many objects).

🚧

GRAX recommends always using ContentDocument as the root element, even if your intention is to only archive ContentVersion, due to the complex relationship between these objects. GRAX will only back up the most recent ContentVersion, so be aware that you can lose previous versions.

Select Attachment as Root

If you select Attachment as the root object in a hierarchy process, you will notice that the hierarchy view is a bit different. The Attachment object doesn't have any children as you probably know, but GRAX will expose a list of objects to allow for more effective querying. So you could query all attachments that are linked to any of the objects that you check off. This can be a flexible tool, and would allow you to run an archive saying something like "give me all attachments modified in the past month that are linked to Accounts or Cases".
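For readers who think in SOQL, the "attachments modified in the past month that are linked to Accounts or Cases" example corresponds roughly to the query built below. This is an illustrative sketch only: `build_attachment_query` is a hypothetical helper, and the query GRAX actually generates may differ. It relies on the standard polymorphic `Parent.Type` filter on Attachment and the `LAST_N_DAYS` date literal.

```python
import re

# Object API names such as Account, Case, or My_Object__c.
_API_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def build_attachment_query(parent_types, modified_within_days):
    """Build a SOQL string for Attachments linked to the given parent objects."""
    if not parent_types:
        raise ValueError("at least one parent object type is required")
    for name in parent_types:
        if not _API_NAME.match(name):
            raise ValueError(f"invalid object API name: {name!r}")
    if int(modified_within_days) < 1:
        raise ValueError("modified_within_days must be a positive integer")
    quoted = ", ".join(f"'{name}'" for name in parent_types)
    return (
        "SELECT Id, Name, ParentId, SystemModstamp FROM Attachment "
        f"WHERE Parent.Type IN ({quoted}) "
        f"AND SystemModstamp = LAST_N_DAYS:{int(modified_within_days)}"
    )


query = build_attachment_query(["Account", "Case"], 30)
```

Checking off objects in the hierarchy view plays the role of the `Parent.Type` list here, and the date filter on the job plays the role of the `LAST_N_DAYS` clause.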

Files: Restore

Let's take a look at some different ways to restore Attachments and Content. Behind the scenes, Content represents a much more complex set of interconnected objects, so you will need to ensure the restore is done by an Admin who understands the relationships and the business use case.

Attachment Restore

Attachments can be restored in the same way as other objects, and the process is simpler because each Attachment can only relate to a single parent record.

File Restore

Okay, so attachments sound straightforward. Let’s try the same thing but for the files object. In the developer console, search for the Files or Content object. You’ll notice that it doesn’t exist. The files/content object is a virtualized object - it is created in the user interface on demand when the user needs it. But the data must exist somewhere - let’s look at the data structure that the files object references. There are 3 core objects:

  • ContentDocument - This is the core object that mimics the attachment object but it only stores the metadata about the document object
  • ContentDocumentLink - This object stores the link between the Content Document object itself and the Salesforce Record it is attached to
  • ContentVersion - This object stores the Base64 Content. Files can be updated, so you can store multiple versions of a document and retrieve the correct version while maintaining the history. This is very useful for documents you share with partners or customers such as Customer Support Resources or Marketing Assets. By default GRAX will back up the latest version.

After inspecting the objects above we can infer the following data model:
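The relationships between the three objects can also be sketched as plain Python dataclasses. The class and attribute names below mirror the standard Salesforce fields (ContentDocumentId, LinkedEntityId, IsLatest, VersionNumber), but the classes themselves are purely illustrative and the sample Ids are made-up placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentVersion:
    id: str
    content_document_id: str  # points back to the owning ContentDocument
    title: str
    version_number: str
    is_latest: bool


@dataclass
class ContentDocumentLink:
    id: str
    content_document_id: str  # the document being shared
    linked_entity_id: str     # the record, user, or library it is linked to


@dataclass
class ContentDocument:
    id: str
    title: str
    versions: List[ContentVersion] = field(default_factory=list)
    links: List[ContentDocumentLink] = field(default_factory=list)

    def latest_version(self) -> Optional[ContentVersion]:
        """Return the version flagged as latest, or None if there is none."""
        for version in self.versions:
            if version.is_latest:
                return version
        return None


doc = ContentDocument(id="doc-1", title="Onboarding guide")
doc.versions.append(ContentVersion("ver-1", "doc-1", "Onboarding guide", "1", False))
doc.versions.append(ContentVersion("ver-2", "doc-1", "Onboarding guide", "2", True))
doc.links.append(ContentDocumentLink("link-1", "doc-1", "account-1"))
doc.links.append(ContentDocumentLink("link-2", "doc-1", "case-1"))
```

One ContentDocument owns many ContentVersions (the binaries) and many ContentDocumentLinks (one per record it is shared with), which is why a single file can appear under several parent records.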

Restore via Search For Parent Record

Given the complex relationship structure, the easiest way to restore a "file" is to search for the parent record and ensure you are restoring children along with it. So I could search for a specific Account and restore children, ensuring that everything just works and the ContentVersion, ContentDocumentLink, and ContentDocument are created.

📘

Salesforce automatically creates the ContentDocument

You cannot manually create a ContentDocument via the API; Salesforce creates it automatically. Thus, in certain scenarios, you may notice an error in the restore logs for the ContentDocument object. As long as the ContentVersion and ContentDocumentLink succeeded and everything is linked to the parent record, you can rest assured the file is back in Salesforce.

Restore via Search for File

You can search for a particular ContentDocument or ContentVersion as well.

If you restore using the option to restore children, then the general sequence of events for the restore will be:

  • Restore ContentVersion
  • Salesforce auto-creates the ContentDocument
  • Create relevant ContentDocumentLinks (could point to users or other objects)
  • If one of the ContentDocumentLinks pointed to the File along with a Case, for example, attempt to insert the Case, and then we can create the ContentDocumentLink that links the Case with the ContentDocument.
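The sequence above can be written out as a small planning function to make the ordering explicit. This is a simplified sketch, not GRAX's restore algorithm: `plan_file_restore` and its inputs are hypothetical, and it glosses over details such as restored parent records receiving new Ids.

```python
def plan_file_restore(version_id, linked_entity_ids, existing_ids):
    """Return the ordered restore steps for one file.

    version_id: the archived ContentVersion to restore.
    linked_entity_ids: records the file's ContentDocumentLinks pointed to.
    existing_ids: records that currently exist in Salesforce.
    """
    steps = [
        ("insert", "ContentVersion", version_id),
        # Salesforce creates the ContentDocument itself; it is never inserted directly.
        ("auto-created", "ContentDocument", None),
    ]
    for entity_id in linked_entity_ids:
        if entity_id not in existing_ids:
            # The linked record (for example a Case) must exist before linking.
            steps.append(("insert", "linked record", entity_id))
        steps.append(("insert", "ContentDocumentLink", entity_id))
    return steps


restore_steps = plan_file_restore(
    version_id="ver-2",
    linked_entity_ids=["user-1", "case-1"],
    existing_ids={"user-1"},
)
```

Here the link to the existing user is created directly, while the missing Case is inserted first so that its ContentDocumentLink has something to point to.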

🚧

Fields on ContentVersion

The only field GRAX actively restores on ContentVersion is Title. Most other fields are auto-generated by Salesforce such as CreatedDate, LastModifiedDate, FileType etc. This means the Description field and any custom fields on ContentVersion are not restored.

Restore via Search for ContentDocumentLink

If you know a specific ContentDocumentLink you can individually restore this as well via the Search tab, but note this is not directly related to the ContentVersion, so we don't recommend this unless you already have the ContentVersion restored.

Restore via Lightning Component

We recommend using the GRAX Lightning Connect if you want to visualize ContentVersions that are related to a particular parent record. Even though this relationship is indirect, GRAX does the heavy lifting and will still allow you to view related "files" through the lightning component. When you restore, the algorithm will trace back through the relationships and create the ContentVersion and the ContentDocumentLink for the specific record you are viewing, so that the effect is that you have restored a file and related it to the current record.

Preview Attachments or Content Documents

  1. Click on GRAX tab, followed by the Search subtab.
  2. Select either the Attachment or ContentVersion object
  3. Optionally enter filter criteria and then click Retrieve
  4. You will notice a file preview icon in the action bar that will allow you to preview the record.

🚧

Content Type Warning

If your Attachment does not have a Content Type set, the GRAX preview functionality within the browser will not work (you would just need to download the attachment to view). This Salesforce help article will explain why Content Type may not get set properly by the browser.

You can add the Content Type field to your GRAX view to double check.

Notice the preview icon that you will see if you are searching on Attachments or ContentVersions.

📘

Only the Following File Types Can be Previewed

PNG: image/png
PDF: application/pdf
GIF: image/gif
TEXT: text/plain
JSON: text/json

You will need to download other file types to view.
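If you want to check a batch of records against this list yourself, a lookup like the one below is enough. It is a minimal sketch: `can_preview` is a hypothetical helper, and the normalization it applies (lowercasing and dropping parameters such as a charset) is an assumption rather than documented GRAX behavior.

```python
# Content types taken from the list above.
PREVIEWABLE_CONTENT_TYPES = {
    "image/png": "PNG",
    "application/pdf": "PDF",
    "image/gif": "GIF",
    "text/plain": "TEXT",
    "text/json": "JSON",
}


def can_preview(content_type):
    """Return True if the content type is on the previewable list.

    A missing or empty content type is never previewable, matching the
    Content Type warning earlier in this section.
    """
    if not content_type:
        return False
    # Assumption: ignore case and parameters such as "; charset=utf-8".
    base_type = content_type.split(";")[0].strip().lower()
    return base_type in PREVIEWABLE_CONTENT_TYPES
```

For example, `can_preview("application/pdf")` returns True, while `can_preview(None)` and `can_preview("application/zip")` return False, meaning those files would need to be downloaded to view.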