Where Are the Foreign Key Relationships?
The first time a programmer brings a CopyStorm database into ERwin (or another data modeling tool) one of the first questions asked is:
- Where are the relationships between tables?
The “database equivalent” question is:
- Why doesn’t CopyStorm generate FOREIGN KEY constraints?
The short answer to this question is:
The addition of Foreign Key constraints would dramatically increase the complexity and cost of extracting data from Salesforce without improving the performance of database queries.
In a database, the purpose of a Foreign Key constraint is to enforce referential integrity. A Foreign Key constraint does nothing to improve query performance — that is a job for indexes (CopyStorm does create the indexes!). If Foreign Key constraints were enabled on a CopyStorm database then several hugely complex problems start appearing:
- Records MUST be inserted/updated in an order that maintains referential integrity — 100% of the time.
- Bi-directional relationships must be backed up in two passes. This type of relationship occurs frequently in Salesforce.
Let’s look at one simple referential integrity example. Salesforce’s Account table contains a “Parent Account” field. If there are three accounts in Salesforce (A, B, and C), and A is the parent of B and B is the parent of C then the records have to be written to the database in the order:
- A then B then C
Any other order will cause a referential integrity error in the database. Add a few dozen tables and their cross relationships and the complexity of this problem explodes (along with the runtime)!
Let’s look at a simple bi-directional relationship where it is impossible to copy the records in the proper order.
- Each User record in Salesforce references a Profile record.
- Each Profile record in Salesforce references the User table via the Profile.ownerId field.
This is a problem because it makes it impossible to copy the “correct record first” due to the mutual dependency. Any restore approach (if Foreign Key constraints are enabled) would require an approach like:
- Copy the User record, leaving the ProfileId field null.
- Copy the Profile record.
- Update the User record’s ProfileId field with the id of the Profile record restored in the previous step.
Does this sound like a lot of data retrieval and graph theory to make a backup work? In practice the problem is much harder than these examples when the reality of hundreds of tables and their interrelationships are added.
How Does CopyStorm Help?
Though CopyStorm does not declare Foreign Key constraints, it does create indexes on all columns that are involved in foreign key relationships. This means that JOINs will run fast (even though ERwin will not display a nice picture of the relationships).
When you need to know the key relationships there are several good sources:
- Salesforce publishes ERDs of their data models.
- CopyStorm maintains tables of all relationships in several tables included in every CopyStorm backup.