How Does CS:Govern Work?
Conceptually, CS:Govern’s implementation of Transparent Data Encryption (TDE) is simple:
- CS:Govern dynamically generates database code to enforce and implement TDE based on protection rules provided by a CS:Govern administrative user.
- CS:Govern-generated database code is applied as database triggers to prevent bad actors from bypassing TDE rules.
Though conceptually simple, the details of CS:Govern’s internals are anything but simple. This article is intended to help a technical person understand the high-level aspects of CS:Govern without delving into internal complexities.
An important point to be made is that, throughout a field value’s lifecycle, AT NO POINT is that value stored in a CopyStorm table in an unencrypted state. If the CS:Govern field rules include that field, its value will NEVER be exposed in human-readable form in the database. This includes insert and update of field values, as well as the execution of CS:Govern Custodians.
CS:Govern on a Napkin
CS:Govern started out as a whiteboard discussion whose initial scope was proposed as a one-page diagram: CS:Govern on a Napkin. The following “napkin drawing” is a copy of the original and still accurately represents CS:Govern’s purpose and design.
CS:Govern Protection Rules
Since CS:Govern generates database protection code based on user-defined rules, a big part of CS:Govern is the protection rule configuration system.
CS:Govern contains the following rule configuration components:
Protection Component | Description |
---|---|
Access Category | An Access Category is a classification that can be applied to a field and is also known as a Compliance Category. Example Access Categories include: PII, HIPAA, etc. In short, an Access Category determines the type of access security to apply to a field. Though CS:Govern includes a variety of built-in access categories, a customer may add any number of additional categories. |
Masking Rule | A Masking Rule is a rule for transforming original field data into a masked value which will be shown to database users without privileges to see the original data. Though CS:Govern provides a rich set of masking rules, it is possible to add additional rules. |
Field Registry | The Field Registry associates Access Categories and a Masking Rule with Salesforce fields. When CS:Govern generates database code to protect a field, this is the primary registry it queries. |
Table Rule Registry | The Table Rule Registry is used to track table level database components generated by CS:Govern for a table. |
Key Registry | The Key Registry is CS:Govern’s built-in storage for encryption keys. It is the ONLY CS:Govern component where access should be highly restricted. In practice, no one but the CS:Govern administrator account needs access to this registry. |
Custodian Registry | The Custodian Registry contains a history of CS:Govern Custodian tasks launched on the database. |
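To make the relationship between a Masking Rule and the Field Registry more concrete, here is a minimal Python sketch. The names `mask_email`, `FieldRule`, and `FIELD_REGISTRY` are hypothetical and are not CS:Govern identifiers; the product keeps this configuration in database tables, and its built-in masking rules may behave differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


def mask_email(value: str) -> str:
    """Hypothetical masking rule: hide the local part of an email address."""
    local, sep, domain = value.partition("@")
    if not sep:
        # Not shaped like an email address: mask every character.
        return "*" * len(value)
    return "*" * len(local) + "@" + domain


@dataclass(frozen=True)
class FieldRule:
    """One registry entry: the field's Access Categories and its Masking Rule."""
    access_categories: FrozenSet[str]
    masking_rule: Callable[[str], str]


# Hypothetical in-memory registry keyed by "Table.Field".
FIELD_REGISTRY: Dict[str, FieldRule] = {
    "Contact.Email": FieldRule(frozenset({"PII"}), mask_email),
}

rule = FIELD_REGISTRY["Contact.Email"]
masked = rule.masking_rule("jane@example.com")
print(masked)  # ****@example.com
```

Keeping the categories and the masking rule in one entry mirrors the description above: a single lookup tells the code generator both who may see the field and what everyone else sees instead.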
Protection Rules ERD
The following ERD shows the primary CS:Govern tables used to define data protection rules. It is important to note:
- None of these tables need to be accessible by regular database users.
- The actual implementation varies slightly by target database type.
The key tables and their purpose are listed in the following table:
Key Table | Description |
---|---|
GuardianField | This table contains a record for each field which is protected by CS:Govern. The companion linking table, GuardianFieldAccessCategory, assigns access categories to a field. It is the intersection of a field’s access categories with the user’s access categories that determines if a user can see the unencrypted value in a field. |
GuardianUser | This table contains a record for each user and database role which has rights to access field data protected by CS:Govern. The companion linking table, GuardianUserAccessTable, assigns the permitted access categories for a user or role. |
GuardianTable | This table is used to keep track of CS:Govern-generated database code for a CopyStorm table. |
GuardianKey | The GuardianKey table is the default storage location for the keys used to encrypt/decrypt data. It is also the ONLY table to which normal database users should be denied access. All CS:Govern generated database code gains access to this table via defined rights declared on the generated code. |
Encrypted Data Storage
Though the technique for encrypted data storage varies slightly for each CS:Govern database type, the general approach is outlined in this section.
When CS:Govern is first configured to protect fields on a CopyStorm table, several operations happen:
- A table is created with a name in the format guardTableName. The purpose of this table is to store encrypted data and it is not accessible to normal database users.
- Example: If Contact fields are to be protected, then a table named guardContact will be created in the target database.
- Triggers are installed on the original CopyStorm table. These triggers intercept table insert/update/delete operations to enforce TDE.
- A CS:Govern Custodian is created to encrypt existing data in the CopyStorm table.
When CS:Govern rules are updated, the process is similar except that existing encryption related database elements are updated rather than created.
The table used to hold encrypted data has a fairly simple format, illustrated by this example for a Contact table (from a SQL Server implementation):
Field | Description |
---|---|
id | Salesforce Id for the corresponding record in the CopyStorm Contact table. |
keyId | Id of the CS:Govern keys used to encrypt the record. Note that this column has been deprecated but remains in the schema for backward compatibility purposes. |
Name | Encrypted value of the Contact.Name field. If null, then the Contact.Name field is null or the Contact.Name field has not yet been encrypted. |
keyId1 | Id of the CS:Govern keys used to encrypt the Contact.Name field. |
Email | See Name. |
keyId2 | Id of the CS:Govern keys used to encrypt the Contact.Email field. |
FirstName | See Name. |
keyId3 | Id of the CS:Govern keys used to encrypt the Contact.FirstName field. |
LastName | See Name. |
keyId4 | Id of the CS:Govern keys used to encrypt the Contact.LastName field. |
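The write path described in this section can be modeled with a short Python sketch. It is a conceptual stand-in for what the generated triggers do, not CS:Govern code: the dictionaries replace the Contact and guardContact tables, `protected_upsert`, `PROTECTED_FIELDS`, and the `keyId_Email` column name are hypothetical, and the XOR-based `toy_encrypt` is a placeholder that is not secure.

```python
import base64
from itertools import cycle
from typing import Callable, Dict, Optional

KEY_ID = 1          # hypothetical key identifier
KEY = b"demo-key"   # hypothetical key material; real keys live in the Key Registry


def toy_encrypt(plaintext: str, key: bytes = KEY) -> str:
    """Placeholder cipher (XOR + base64). NOT secure; stands in for real encryption."""
    data = plaintext.encode("utf-8")
    return base64.b64encode(bytes(b ^ k for b, k in zip(data, cycle(key)))).decode("ascii")


def toy_decrypt(ciphertext: str, key: bytes = KEY) -> str:
    """Inverse of toy_encrypt."""
    data = base64.b64decode(ciphertext)
    return bytes(b ^ k for b, k in zip(data, cycle(key))).decode("utf-8")


# Protected field -> masking rule (both hypothetical).
PROTECTED_FIELDS: Dict[str, Callable[[str], str]] = {"Email": lambda value: "***"}

contact: Dict[str, dict] = {}        # stands in for the CopyStorm Contact table
guard_contact: Dict[str, dict] = {}  # stands in for the guardContact storage table


def protected_upsert(record: dict) -> None:
    """Split a record into a masked row and an encrypted row before storing either."""
    main_row = dict(record)
    guard_row: Dict[str, Optional[object]] = {"id": record["Id"]}
    for field, mask in PROTECTED_FIELDS.items():
        value = record.get(field)
        if value is None:
            guard_row[field] = None        # a null source field stays null
            continue
        guard_row[field] = toy_encrypt(value)
        guard_row["keyId_" + field] = KEY_ID  # simplified per-field key column
        main_row[field] = mask(value)      # only the masked value reaches Contact
    contact[record["Id"]] = main_row
    guard_contact[record["Id"]] = guard_row


protected_upsert({"Id": "003A", "Title": "CTO", "Email": "jane@example.com"})
print(contact["003A"])  # Email holds the masked value, Title is unchanged
```

The point of the model is ordering: the masked and encrypted rows are both computed before anything is stored, so the plaintext never lands in the main table.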
Unencrypted Data Access and Masking
When a field is under CS:Govern protection, the original CopyStorm field stores a masked value while the original value is stored, encrypted, in an encryption table. This section explains how to retrieve unencrypted data.
When a CopyStorm table contains protected fields, CS:Govern creates a generally accessible function named guardTableName_GET.
- Example: If the Contact table contains protected fields, then a function named guardContact_GET() is created and given public access.
Each generated _GET function has the form:
- String guardContact_GET( Id String, fieldName String, defaultValue String )
where:
- Id = unique Id of a Salesforce record in the Contact table.
- fieldName = name of a Contact field protected by CS:Govern (case insensitive).
- defaultValue = value to return if the caller does not have permission to read the field decrypted (or if the field is not under CS:Govern protection).
Example: To read a decrypted Contact.Email address, SQL code like the following is required:
- SELECT Id, guardContact_GET( Id, 'Email', email) as email FROM Contact WHERE …
A decryption function works by first determining if the caller has permission to read decrypted values by computing the intersection of the compliance categories of the user with the compliance categories of the associated field. If the result is not empty, then the user has permission to see the field decrypted.
If the current user has permission to read decrypted data and the field is encrypted, then the function simply decrypts the stored data and returns it. If the user does not have permission, then the default value is returned.
Does this sound simple? High level descriptions are deceptive since this process is the hardest part of CS:Govern to do in an efficient and accurate manner.
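The permission check and fallback described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the generated database function: `guard_contact_get` takes the user as an explicit argument (the real function would use the current database user), the lookup tables are hypothetical in-memory dictionaries, the cipher is an insecure placeholder, and returning the default for a value that is null or not yet encrypted is an assumption.

```python
import base64
from itertools import cycle
from typing import Dict, Optional, Set

KEY = b"demo-key"  # hypothetical key material


def toy_encrypt(plaintext: str) -> str:
    """Placeholder cipher (XOR + base64). NOT secure."""
    data = plaintext.encode("utf-8")
    return base64.b64encode(bytes(b ^ k for b, k in zip(data, cycle(KEY)))).decode("ascii")


def toy_decrypt(ciphertext: str) -> str:
    """Inverse of toy_encrypt."""
    data = base64.b64decode(ciphertext)
    return bytes(b ^ k for b, k in zip(data, cycle(KEY))).decode("utf-8")


# Hypothetical rule data; field names are stored lowercase for case-insensitive lookup.
FIELD_CATEGORIES: Dict[str, Set[str]] = {"email": {"PII"}}
USER_CATEGORIES: Dict[str, Set[str]] = {"auditor": {"PII", "HIPAA"}, "analyst": {"Finance"}}
GUARD_CONTACT: Dict[str, Dict[str, Optional[str]]] = {
    "003A": {"email": toy_encrypt("jane@example.com")},
}


def guard_contact_get(user: str, record_id: str, field_name: str, default_value: str) -> str:
    """Return the decrypted value if the user may see it, otherwise default_value."""
    field = field_name.lower()
    field_categories = FIELD_CATEGORIES.get(field)
    if field_categories is None:
        return default_value              # field is not under protection
    if not (USER_CATEGORIES.get(user, set()) & field_categories):
        return default_value              # empty intersection: no permission
    stored = GUARD_CONTACT.get(record_id, {}).get(field)
    if stored is None:
        return default_value              # assumption: null or not yet encrypted
    return toy_decrypt(stored)


print(guard_contact_get("auditor", "003A", "Email", "***"))  # jane@example.com
print(guard_contact_get("analyst", "003A", "Email", "***"))  # ***
```

Passing the masked column as the default, as the SQL example does, means a caller without permission simply gets back what the table already shows.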
Custodians — Processes to Clean Up Data
When CS:Govern rules change, long-running changes to the CopyStorm database can be required. Custodians are lightweight processes designed for long-running tasks which will run safely in parallel with CS:Govern protection. There are three types of Custodians:
- EncryptFields: as the name suggests, this Custodian will apply masking rules to the selected fields for all records in the field’s database table. It will also enter encrypted values into the associated guardTableName table (known as the “storage” table). This action is not performed on field values that have already been encrypted.
- DecryptFields: this Custodian performs the inverse operation of the EncryptFields Custodian in the sense that it will decrypt the selected fields and restore the decrypted field values back into the CopyStorm database. Field values in the guardTableName table are nulled.
- RemaskFields: this Custodian, as the name suggests, will change the mask value in the CopyStorm database table for the selected fields and for all records in that table.
Example: Suppose the field Contact.Email has been added to CS:Govern AND there are already one million unencrypted Contact.Email records. In this case:
- CS:Govern will install database code which will protect all future updates to Contact.Email records.
- CS:Govern will create a Custodian whose job is to protect existing Contact.Email records.
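As a rough illustration of the EncryptFields behavior in this example, the following Python sketch processes existing records in batches and skips values that are already encrypted. The function name `encrypt_fields_custodian`, its parameters, and the dictionary-based tables are hypothetical; the transforms passed in the demo are placeholders, and a real Custodian works against the database and logs its progress there.

```python
from typing import Callable, Dict


def encrypt_fields_custodian(
    table: Dict[str, dict],
    guard_table: Dict[str, dict],
    field: str,
    encrypt: Callable[[str], str],
    mask: Callable[[str], str],
    batch_size: int = 1000,
) -> int:
    """Encrypt and mask `field` for every record not yet protected; return the count."""
    pending = [
        record_id
        for record_id, row in table.items()
        if row.get(field) is not None
        and guard_table.get(record_id, {}).get(field) is None
    ]
    processed = 0
    for start in range(0, len(pending), batch_size):
        for record_id in pending[start:start + batch_size]:
            value = table[record_id][field]
            guard_table.setdefault(record_id, {"id": record_id})[field] = encrypt(value)
            table[record_id][field] = mask(value)
            processed += 1
        # A real Custodian would commit the batch and log progress at this point.
    return processed


# Placeholder transforms for the demo only; reversing a string is NOT encryption.
contact = {"003A": {"Email": "jane@example.com"}, "003B": {"Email": None}}
guard_contact: Dict[str, dict] = {}
first = encrypt_fields_custodian(contact, guard_contact, "Email", lambda v: v[::-1], lambda v: "***")
second = encrypt_fields_custodian(contact, guard_contact, "Email", lambda v: v[::-1], lambda v: "***")
print(first, second)  # 1 0
```

Running the function a second time processes nothing, which reflects the rule that already-encrypted values are left alone and makes an interrupted run safe to restart.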
Though Custodians run outside of the database, their activity is logged in CS:Govern tables.
The primary purpose for the Custodian tables is to report on the current progress and history of custodian tasks.
Note that the recommended way to run Custodians is via the Custodian Runner command line tool.