How Does CS:Govern Work?

Conceptually, CS:Govern’s implementation of Transparent Data Encryption (TDE) is simple:

  • CS:Govern dynamically generates database code to enforce and implement TDE based on protection rules provided by a CS:Govern administrative user.
  • CS:Govern-generated database code is applied as database triggers to prevent bad actors from bypassing TDE rules.

Though the concept is simple, the details of CS:Govern’s internals are anything but. This article is intended to help a technical reader understand the high-level aspects of CS:Govern without delving into internal complexities.

An important point is that, throughout a field value’s lifecycle, AT NO POINT is that value stored in a CopyStorm table in an unencrypted state. If the CS:Govern field rules include a field, its value will NEVER be exposed in human-readable form in the database. This applies to inserts and updates of field values, as well as to the execution of CS:Govern Custodians.

CS:Govern on a Napkin

CS:Govern started out as a whiteboard discussion whose initial scope was proposed as a one-page diagram: CS:Govern on a Napkin. The following “napkin drawing” is a copy of the original and still accurately represents CS:Govern’s purpose and design.

CS:Govern Protection Rules

Since CS:Govern generates database protection code based on user-defined rules, a big part of CS:Govern is the protection rule configuration system.

CS:Govern contains the following rule configuration components:

Protection Component | Description
Access Category | An Access Category is a classification that can be applied to a field and is also known as a Compliance Category. Example Access Categories include PII and HIPAA. In short, an Access Category determines the type of access security to apply to a field. Though CS:Govern includes a variety of built-in access categories, a customer may add any number of additional categories.
Masking Rule | A Masking Rule is a rule for transforming original field data into a masked value, which is shown to database users without privileges to see the original data. Though CS:Govern provides a rich set of masking rules, it is possible to add additional rules.
Field Registry | The Field Registry associates Access Categories and a Masking Rule with Salesforce fields. When CS:Govern generates database code to protect a field, this is the primary registry it queries.

Example:

  • Add the Contact.Email field, set its Access Category to PII, and mask it with the built-in email masking rule.
Table Rule Registry | The Table Rule Registry is used to track table-level database components generated by CS:Govern for a table.
Key Registry | The Key Registry is CS:Govern’s built-in storage for encryption keys. It is the ONLY CS:Govern component where access should be highly restricted. In practice, no one but the CS:Govern administrator account needs access to this registry.
Custodian Registry | The Custodian Registry contains a history of CS:Govern Custodian tasks launched on the database.
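
As a rough illustration of how a Field Registry entry ties a field to its Access Categories and a Masking Rule, the Contact.Email example above might be modeled as follows. This is a minimal Python sketch, not CS:Govern's implementation: the FieldRule class, the mask_email function, and the mask format itself are hypothetical, since the built-in email masking rule is not described here.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


def mask_email(value: str) -> str:
    """Hypothetical email mask: keep the first character and the domain."""
    local, sep, domain = value.partition("@")
    if not sep or not local:
        # Not a well-formed address: hide every character.
        return "*" * len(value)
    return local[0] + "*" * (len(local) - 1) + "@" + domain


@dataclass(frozen=True)
class FieldRule:
    """One illustrative Field Registry entry (hypothetical structure)."""
    table: str
    field: str
    access_categories: FrozenSet[str]
    masking_rule: Callable[[str], str]


# Contact.Email is categorized as PII and masked with the email rule.
contact_email = FieldRule("Contact", "Email", frozenset({"PII"}), mask_email)

masked = contact_email.masking_rule("jane.doe@example.com")
print(masked)  # j*******@example.com
```

Keeping the masking rule as a plain function attached to the registry entry mirrors the idea that additional rules can be added without changing the registry itself.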

Protection Rules ERD

The following ERD shows the primary CS:Govern tables used to define data protection rules. It is important to note:

  • None of these tables need to be accessible by regular database users.
  • The actual implementation varies slightly by target database type.

The key tables and their purpose are listed in the following table:

Key Table | Description
GuardianField | This table contains a record for each field which is protected by CS:Govern. The companion linking table, GuardianFieldAccessCategory, assigns access categories to a field. It is the intersection of a field’s access categories with the user’s access categories that determines whether a user can see the unencrypted value in a field.
GuardianUser | This table contains a record for each user and database role which has rights to access field data protected by CS:Govern. The companion linking table, GuardianUserAccessTable, assigns the permitted access categories for a user or role.
GuardianTable | This table is used to keep track of CS:Govern-generated database code for a CopyStorm table.

  • storageTableName: the name of the database table used to store encrypted data for a corresponding CopyStorm table.
  • insertTriggerName: the name of the database trigger implementing TDE on database inserts on a CopyStorm table.
  • decryptFunctionName: the name of a function which will decrypt data based on the rights of the current database user. Note: This is the ONLY function which needs general execution access.
GuardianKey | The GuardianKey table is the default storage location for the keys used to encrypt and decrypt data. It is also the ONLY table to which normal database users should be denied access. All CS:Govern-generated database code gains access to this table via defined rights declared on the generated code.

Encrypted Data Storage

Though the technique for encrypted data storage varies slightly for each CS:Govern database type, the general approach is outlined in this section.

When CS:Govern is first configured to protect fields on a CopyStorm table, several operations happen:

  • A table is created with a name in the format guardTableName. The purpose of this table is to store encrypted data and it is not accessible to normal database users.
    • Example: If Contact fields are to be protected, then a table named guardContact will be created in the target database.
  • Triggers are installed on the original CopyStorm table. These triggers intercept table insert/update/delete operations to enforce TDE.
  • A CS:Govern Custodian is created to encrypt existing data in the CopyStorm table.

When CS:Govern rules are updated, the process is similar except that existing encryption-related database elements are updated rather than created.
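
The trigger-based approach described above can be sketched with Python's built-in sqlite3 module. Everything here is an assumption made for illustration: the trigger name, the toy_encrypt and toy_mask functions (a reversible XOR placeholder, not real encryption), and the two-column tables. SQLite only allows INSTEAD OF triggers on views, so this sketch uses an AFTER INSERT trigger that swaps in the mask as part of the same statement; the triggers CS:Govern generates will differ by database type.

```python
import sqlite3

KEY = b"demo-key"  # placeholder key for the sketch only


def toy_encrypt(value):
    """Stand-in for a real cipher: XOR with KEY, hex-encoded. NOT secure."""
    if value is None:
        return None
    data = value.encode("utf-8")
    return bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(data)).hex()


def toy_mask(value):
    """Stand-in masking rule: replace every character with '*'."""
    return None if value is None else "*" * len(value)


conn = sqlite3.connect(":memory:")
# Expose the Python helpers to SQL so the trigger can call them.
conn.create_function("toy_encrypt", 1, toy_encrypt)
conn.create_function("toy_mask", 1, toy_mask)
conn.executescript("""
CREATE TABLE Contact (Id TEXT PRIMARY KEY, Email TEXT);
CREATE TABLE guardContact (id TEXT PRIMARY KEY, Email TEXT);

-- Copy the protected value into the storage table, then leave a mask behind.
CREATE TRIGGER guardContact_insert AFTER INSERT ON Contact
BEGIN
    INSERT INTO guardContact (id, Email)
        VALUES (NEW.Id, toy_encrypt(NEW.Email));
    UPDATE Contact SET Email = toy_mask(NEW.Email) WHERE Id = NEW.Id;
END;
""")

conn.execute("INSERT INTO Contact VALUES ('003A', 'jane@example.com')")
visible = conn.execute("SELECT Email FROM Contact").fetchone()[0]
stored = conn.execute("SELECT Email FROM guardContact").fetchone()[0]
print(visible)  # only the masked value remains in Contact
print(stored)   # hex-encoded placeholder ciphertext in guardContact
```

The point of the sketch is the division of labor: the original table only ever keeps the mask, while the storage table holds the protected value.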

The table used to hold encrypted data has a fairly simple format, illustrated by this example for a Contact table (from a SQL Server implementation):

Field | Description
id | Salesforce Id for the corresponding record in the CopyStorm Contact table.
keyId | Id of the CS:Govern keys used to encrypt the record. Note that this column has been deprecated but remains in the schema for backward compatibility purposes.
Name | Encrypted value of the Contact.Name field. If null, then the Contact.Name field is null or the Contact.Name field has not yet been encrypted.
keyId1 | Id of the CS:Govern keys used to encrypt the Contact.Name field.
Email | See Name.
keyId2 | Id of the CS:Govern keys used to encrypt the Contact.Email field.
FirstName | See Name.
keyId3 | Id of the CS:Govern keys used to encrypt the Contact.FirstName field.
LastName | See Name.
keyId4 | Id of the CS:Govern keys used to encrypt the Contact.LastName field.
Note that each protected field is associated with its own encryption key Id. CS:Govern records which key was used to encrypt the field’s value at the time the encryption occurred. This means that key Ids may differ across the fields of a record, depending on the sequence of events in that record’s lifecycle. In the typical scenario, in which encryption keys are never rotated, the key Ids will be the same for every field of every record.
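
To make the per-field key Id behavior concrete, here is a small Python sketch of one storage row before and after a key rotation. The key Ids "K1" and "K2", the store_field helper, and the placeholder ciphertext strings are hypothetical; only the column names come from the table above.

```python
from typing import Dict, Optional

# Each protected field has an encrypted-value column and a key Id column.
KEY_COLUMN = {"Name": "keyId1", "Email": "keyId2",
              "FirstName": "keyId3", "LastName": "keyId4"}


def store_field(row: Dict[str, Optional[str]], field: str,
                ciphertext: str, active_key_id: str) -> None:
    """Write an encrypted value and record which key produced it."""
    row[field] = ciphertext
    row[KEY_COLUMN[field]] = active_key_id


row: Dict[str, Optional[str]] = {"id": "003A"}

# Initial encryption: every field is written under key K1.
for field in KEY_COLUMN:
    store_field(row, field, f"<{field} under K1>", "K1")

# Keys are rotated to K2, and afterwards only Email is updated.
store_field(row, "Email", "<Email under K2>", "K2")

key_ids = {field: row[col] for field, col in KEY_COLUMN.items()}
print(key_ids)
# {'Name': 'K1', 'Email': 'K2', 'FirstName': 'K1', 'LastName': 'K1'}
```

Because the key Id travels with each field, a reader of the row can always tell which key is needed for which value, even when a record was only partially rewritten after a rotation.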

Unencrypted Data Access and Masking

When a field is under CS:Govern protection, the original CopyStorm field stores a masked value while the original value is stored in an encryption table. This section explains how to retrieve unencrypted data.

When a table contains protected fields, CS:Govern creates a generally accessible function named guardTableName_GET.

  • Example: If the Contact table contains protected fields, then a function named guardContact_GET() is created and given public access.

Each generated _GET function has the form:

  • String guardContact_GET( Id String, fieldName String, defaultValue String )

where:

  • Id = unique Id of a Salesforce record in the Contact table.
  • fieldName = name of a Contact field protected by CS:Govern (case insensitive).
  • defaultValue = value to return if the caller does not have permission to read the field decrypted (or if the field is not under CS:Govern protection).

Example: To read a decrypted Contact.Email address, SQL code like the following is required:

  • SELECT Id, guardContact_GET( Id, 'Email', email) as email FROM Contact WHERE …

A decryption function works by first determining whether the caller has permission to read decrypted values. It does this by computing the intersection of the user’s compliance categories with the compliance categories of the associated field. If the result is not empty, then the user has permission to see the field decrypted.

If the current user has permission to read decrypted data and the field is encrypted, then the function simply decrypts the stored data and returns it. If the user does not have permission, then the default value is returned.
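
The decision logic just described can be sketched in Python. This is only a model under stated assumptions: the in-memory dictionaries stand in for the registries, the user is passed explicitly (the real function uses the current database user), decryption itself is omitted, and returning the default for a record that has not been encrypted yet is an assumption.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical in-memory stand-ins for the field and user registries.
FIELD_CATEGORIES: Dict[Tuple[str, str], Set[str]] = {
    ("contact", "email"): {"PII"},
}
USER_CATEGORIES: Dict[str, Set[str]] = {
    "analyst": {"HIPAA"},
    "support": {"PII", "HIPAA"},
}
# Protected values keyed by (record Id, field); shown already decrypted.
STORAGE: Dict[Tuple[str, str], str] = {
    ("003A", "email"): "jane@example.com",
}


def guard_contact_get(user: str, record_id: str, field_name: str,
                      default_value: Optional[str]) -> Optional[str]:
    """Return the protected value if the user may see it, else the default."""
    field = field_name.lower()  # field names are case insensitive
    field_cats = FIELD_CATEGORIES.get(("contact", field))
    if field_cats is None:
        return default_value  # field is not under protection
    if not (field_cats & USER_CATEGORIES.get(user, set())):
        return default_value  # empty intersection: no permission
    # Permission granted: return the stored value, falling back to the
    # default when the record has not been encrypted yet (assumption).
    return STORAGE.get((record_id, field), default_value)


print(guard_contact_get("support", "003A", "Email", "j***@example.com"))
# jane@example.com
print(guard_contact_get("analyst", "003A", "Email", "j***@example.com"))
# j***@example.com
```

Passing the masked column value as the default, as the SQL example does, means a caller without permission simply sees what is already stored in the CopyStorm table.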

Does this sound simple? High-level descriptions are deceptive: this process is the hardest part of CS:Govern to implement efficiently and accurately.

Custodians — Processes to Clean Up Data

When CS:Govern rules change, long-running changes to the CopyStorm database can be required. Custodians are lightweight processes designed for long-running tasks which will run safely in parallel with CS:Govern protection. There are three types of Custodians:

  • EncryptFields: as the name suggests, this Custodian applies masking rules to the selected fields for all records in the field’s database table. It also enters encrypted values into the associated guardTableName table (known as the “storage” table). This action is not performed on field values that have already been encrypted.
  • DecryptFields: this Custodian performs the inverse operation of the EncryptFields Custodian in the sense that it will decrypt the selected fields and restore the decrypted field values back into the CopyStorm database.  Field values in the guardTableName table are nulled.
  • RemaskFields: this Custodian, as the name suggests, will change the mask value in the CopyStorm database table for the selected fields and for all records in that table.

Example: Suppose the field Contact.Email has been added to CS:Govern AND there are already one million unencrypted Contact.Email records. In this case:

  • CS:Govern will install database code which will protect all future updates to Contact.Email records.
  • CS:Govern will create a Custodian whose job is to protect existing Contact.Email records.
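
The behavior of an EncryptFields Custodian on existing records can be sketched as a batch loop. This Python model is illustrative only: the encrypt_fields function, the batch size, and the list and dictionary stand-ins for the CopyStorm and storage tables are assumptions, and the encrypt and mask callables are placeholders.

```python
from typing import Callable, Dict, List


def encrypt_fields(table: List[Dict[str, str]],
                   storage: Dict[str, Dict[str, str]],
                   field: str,
                   encrypt: Callable[[str], str],
                   mask: Callable[[str], str],
                   batch_size: int = 2) -> int:
    """Encrypt and mask `field` for records not yet protected, in batches."""
    processed = 0
    for start in range(0, len(table), batch_size):
        for record in table[start:start + batch_size]:
            if field in storage.get(record["Id"], {}):
                continue  # already encrypted: leave it alone
            original = record[field]
            storage.setdefault(record["Id"], {})[field] = encrypt(original)
            record[field] = mask(original)
            processed += 1
        # A real Custodian would commit and log its progress here.
    return processed


contacts = [{"Id": "003A", "Email": "a@example.com"},
            {"Id": "003B", "Email": "****"},  # protected earlier
            {"Id": "003C", "Email": "c@example.com"}]
guard = {"003B": {"Email": "<ciphertext>"}}

count = encrypt_fields(contacts, guard, "Email",
                       encrypt=lambda v: f"<enc:{v}>",
                       mask=lambda v: "****")
print(count)  # 2
```

Working in small batches and skipping values that are already encrypted is what lets a task like this be interrupted and resumed, and run alongside the triggers that protect new writes.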

Though Custodians run outside of the database, their activity is logged in CS:Govern tables.

The primary purpose for the Custodian tables is to report on the current progress and history of custodian tasks.

Note that the recommended way to run Custodians is via the Custodian Runner command line tool.