During the initial backup of Salesforce instances which include large numbers of old Tasks or Events, CopyStorm may experience repeated Salesforce timeout issues when running SOQL queries on these tables. This article explains approaches for solving these problems.
Once the initial backup of the Task and/or Event tables is complete, the Salesforce timeout issue goes away and the recommendations on this page are no longer needed.
Note that these recommendations apply to any Salesforce table, but this issue is rarely seen elsewhere.
Within the Salesforce infrastructure there are multiple levels of data storage; in this section we will refer to them as Tier-1, Tier-2, and Tier-3. Frequently used data (and most tables) tends to be kept in fast Tier-1 storage. Data in some tables (like Task) is migrated to slower storage as it ages and is no longer queried by users. Eventually most old Task records are migrated into the slowest storage tier, Tier-3, and become time-consuming to query. Once data has been migrated to Tier-3 storage, the only way to move it back to faster storage is to attempt to read it. When old data has been referenced enough times it is “promoted” into faster storage: even old Task records can migrate from slow Tier-3 storage back to Tier-1 storage if they are referenced frequently.
When CopyStorm first attempts to back up the Task table it runs a query like:
SELECT * FROM Task ORDER BY SystemModStamp ASC LIMIT 20000
If a large number of Tasks are in Tier-3 storage then this query will often time out. It may time out several times in a row as CopyStorm re-runs the query (by default up to 5 times).
The good news is that this problem tends to go away after the initial backup, because the Tasks that need to be backed up going forward are generally still in fast storage.
Note that each of the possible solutions on this page is a one-time task that rarely has to be repeated.
If you ask Salesforce, they can increase the amount of time a query can run before timing out. Once Salesforce makes the change, update the “Max Timeout (seconds)” option in the “Advanced” popup on CopyStorm’s Main tab before restarting your backup job.
This solution increases the number of times CopyStorm will attempt to re-run a query if it times out. This usually works because repeatedly running the same query tends to force more data to migrate to faster storage tiers. This parameter is called “Reconnect Delay” and is located in the Salesforce Network configuration section of the CopyStorm Configuration tab.
- Set the CopyStorm “Reconnect Delay” to a large value (20 or greater).
- Select just the Task table.
- Run CopyStorm.
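The idea behind this solution can be illustrated with a minimal Python sketch of a retry loop. This is not CopyStorm's implementation: `QueryTimeout`, `run_with_retries`, and `make_flaky_query` are hypothetical names, and the "query" is a simulation that times out a fixed number of times before succeeding, standing in for data that becomes faster to read after repeated attempts.

```python
import time


class QueryTimeout(Exception):
    """Hypothetical stand-in for a Salesforce query timeout."""


def run_with_retries(run_query, max_attempts=20, delay_seconds=0.0):
    """Call run_query() until it succeeds or max_attempts is exhausted.

    Returns (result, attempts_used). Re-raises QueryTimeout if the
    final attempt also times out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query(), attempt
        except QueryTimeout:
            if attempt == max_attempts:
                raise
            time.sleep(delay_seconds)  # optional pause before the next try


def make_flaky_query(timeouts_before_success):
    """Simulate a query that times out N times and then succeeds."""
    state = {"calls": 0}

    def query():
        state["calls"] += 1
        if state["calls"] <= timeouts_before_success:
            raise QueryTimeout("simulated timeout")
        return ["task-record"]

    return query
```

In this simulation, a query that times out 7 times fails when only 5 attempts are allowed but succeeds on the 8th attempt when 20 are allowed, which is why raising the retry limit can be enough to get past the initial backup.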
This solution requires knowing the date of the earliest Task in your Salesforce instance. Once this date is known, use the “Modified Since” and “Modified Thru” CopyStorm options to back up a selected slice of Tasks.
For example, if the earliest Task has a SystemModStamp of 1-Feb-2000:
- Go to the Configuration tab.
- Select the Maintenance configuration section.
- Set Modified Since to 2000-02-01T00:00:00.000Z
- Set Modified Thru to 2001-02-01T00:00:00.000Z
- Select just the Task table.
- Run CopyStorm. It will back up only Tasks modified between 1-Feb-2000 and 1-Feb-2001.
- If the query still times out, then shorten the timebox.
- Repeat these steps for each subsequent timebox until all Tasks have been backed up.
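To plan the timeboxes in advance, the following Python sketch generates consecutive “Modified Since” / “Modified Thru” pairs in the timestamp format used above. The helper names `fmt` and `timeboxes` are illustrative only and are not part of CopyStorm; the values still have to be entered in the Maintenance configuration section by hand.

```python
from datetime import datetime, timedelta, timezone


def fmt(ts):
    """Format a UTC datetime like 2000-02-01T00:00:00.000Z."""
    millis = ts.microsecond // 1000
    return ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


def timeboxes(start, end, days):
    """Yield (modified_since, modified_thru) pairs covering start..end.

    Each box is at most `days` long; the last box is clipped to `end`.
    """
    step = timedelta(days=days)
    since = start
    while since < end:
        thru = min(since + step, end)
        yield fmt(since), fmt(thru)
        since = thru


if __name__ == "__main__":
    first = datetime(2000, 2, 1, tzinfo=timezone.utc)
    last = datetime(2001, 2, 1, tzinfo=timezone.utc)
    # Split the first year into roughly six-month boxes.
    for since, thru in timeboxes(first, last, days=183):
        print(f"Modified Since = {since}   Modified Thru = {thru}")
```

Each box ends exactly where the next one begins, so no time range is skipped. If a box still times out, rerun the generator with a smaller `days` value for that range.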
This solution performs a backup of Tasks based on the sort order of Task.Id rather than Task.SystemModStamp.
Here is the process when applied to the Task table (but the same process will work for any table):
- Truncate the Task table.
- Select only the Task table on the Configuration tab “Select Tables to Copy” section.
- You can also type just “Task” into the input box rather than select it from the popup.
- In the Maintenance configuration section of the Configuration tab, check the “Ignore Timestamps” option.
- If you do not see this option then make sure that the “Advanced” Configuration Level is selected.
- Run CopyStorm. This will back up records based on their unique Salesforce Id.
- If Salesforce or the database fails during this procedure, restart CopyStorm. It will continue backing up by Salesforce Id starting from where it last stopped.
- Perform the following steps to get the CopyStorm database into a state where incremental timestamp-based updates will work going forward:
- Uncheck the “Ignore Timestamps” option. This will cause CopyStorm to perform subsequent backups based on record timestamps.
- Set the “Modified Since” parameter to LAST_WEEK (or any timestamp earlier than when you started this solution).
- Run CopyStorm. This will back up all Task modifications that occurred while performing the Id-based backup.
- Clear the “Modified Since” parameter.
After these steps are completed, CopyStorm will be able to perform subsequent backups using Salesforce timestamp data (i.e. CopyStorm backups will be incremental).
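The reason an Id-ordered backup can resume where it stopped can be shown with a short Python sketch. It is an assumption about the general technique (paging by "Id greater than the last Id copied"), not a description of CopyStorm's internals; `backup_by_id` and `make_fetch` are hypothetical names, and the Salesforce query is replaced by an in-memory list.

```python
def backup_by_id(fetch_page, last_id="", page_size=2000):
    """Copy records in ascending Id order, starting after last_id.

    Returns (copied_records, last_id_copied). Passing the returned
    last_id back in resumes the copy without repeating earlier records.
    """
    copied = []
    while True:
        page = fetch_page(last_id, page_size)
        if not page:
            return copied, last_id
        copied.extend(page)
        last_id = page[-1]["Id"]  # remember progress after each page


def make_fetch(records):
    """In-memory stand-in for a query of the form:
    WHERE Id > :last_id ORDER BY Id ASC LIMIT :page_size
    """
    ordered = sorted(records, key=lambda r: r["Id"])

    def fetch_page(last_id, limit):
        return [r for r in ordered if r["Id"] > last_id][:limit]

    return fetch_page
```

Because progress is a single Id rather than a timestamp, a restart only needs the last Id that was written to the database. Changes made to already-copied records during the run are not seen by this loop, which is why the timestamp-based catch-up run described above is still required afterwards.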