Privacy request annotations
When Fides processes privacy requests, it needs to know how to traverse your datasets to process all data related to the user. This is done by creating references across datasets.
Dataset identity annotation
The dataset identity is the starting point for a privacy request. These identity keys are used to locate identifiers like email or phone numbers within a database.
Expanding on our sample project, the most suitable identity would be the email field because it is unique and identifiable.
collections:
  - name: customer
    fields:
      - name: email
        fides_meta: # Add Fides metadata section
          identity: email # Specify this is the identity key and provide a name
          data_type: string # Specify the expected data type
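To make the role of the identity annotation concrete, the following is a minimal Python sketch that scans a dataset and reports which fields are marked as identity keys. It assumes the YAML above has been loaded into plain dictionaries; the helper name find_identity_fields and the extra unannotated field are illustrative assumptions, not part of the Fides API.

```python
def find_identity_fields(collections):
    """Map each identity name to the "collection.field" paths annotated with it.

    Hypothetical helper: `collections` mirrors the `collections` list of the
    YAML above as plain Python dicts. This is not a Fides function.
    """
    found = {}
    for collection in collections:
        for field in collection.get("fields", []):
            meta = field.get("fides_meta") or {}
            identity = meta.get("identity")
            if identity is not None:
                path = f"{collection['name']}.{field['name']}"
                found.setdefault(identity, []).append(path)
    return found


collections = [
    {
        "name": "customer",
        "fields": [
            {
                "name": "email",
                "fides_meta": {"identity": "email", "data_type": "string"},
            },
            {"name": "name"},  # assumed field with no annotation
        ],
    }
]

identity_fields = find_identity_fields(collections)
print(identity_fields)  # {'email': ['customer.email']}
```

Only the annotated email field is reported, which is the field a privacy request for a given email address would start from.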
Connecting collections
The fundamental component for connecting collections is the references key. This key connects fields across datasets and controls the order of processing.
references:
  - dataset: {{dataset_key}}
    field: {{referenced_field_name}}
    direction: from | to
Key | Value |
---|---|
references | Declares a reference in a field. |
dataset | The key of the referenced dataset. |
field | The pointer to a specific field in the referenced dataset, written as collection.field (for example, customer.id). |
direction | Either from or to. from indicates the referenced field must be processed before the current record; to indicates the current record must be processed before the referenced field. |
The field keyword is only used in references, while the fields keyword is used for defining collections.
Examples
The example below demonstrates a reference between customer_id and id. The id field is in the customer collection of a dataset with the key postgres_example_test_dataset.
The direction from indicates that the id field must be processed before the customer_id field.
fields:
  - name: customer_id
    fides_meta:
      references:
        - dataset: postgres_example_test_dataset
          field: customer.id
          direction: from
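As a rough illustration of how direction controls ordering, the Python sketch below turns references into ordering constraints and sorts the collections accordingly. It is a simplified model, not how Fides itself builds its traversal: it ignores dataset keys, the helper processing_order is hypothetical, and "orders" is an assumed name for the collection that holds customer_id. It relies on graphlib from the standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def processing_order(references):
    """Return collection names in an order that respects each reference.

    Hypothetical helper, not Fides code. Each item is a tuple of
    (current_collection, referenced_field, direction), where
    referenced_field uses the "collection.field" pointer format.
    """
    sorter = TopologicalSorter()
    for current, referenced_field, direction in references:
        referenced = referenced_field.split(".", 1)[0]
        if direction == "from":
            # The referenced field is processed before the current record.
            sorter.add(current, referenced)
        elif direction == "to":
            # The current record is processed before the referenced field.
            sorter.add(referenced, current)
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return list(sorter.static_order())


# "orders" is an assumed name for the collection that holds customer_id.
order = processing_order([("orders", "customer.id", "from")])
print(order)  # ['customer', 'orders']
```

Declaring the same link from the other side, as a to reference on customer.id, would produce the same order in this model.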
The example below also demonstrates a reference between customer_id and id. The difference here is that the customer_id field sits in a semi-structured datasource, so it is nested within a parent object.
- fides_key: mongo_test
  name: Mongo Example Test Dataset
  collections:
    - name: customer_details
      fields:
        ...
        - name: comments
          fields:
            - name: customer_id
              fides_meta:
                references:
                  - dataset: postgres_example_test_dataset
                    field: customer.id
                    direction: from
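Because references can sit at any depth in a semi-structured collection, finding them means walking the nested fields lists. The Python sketch below shows one way to do that over the example above, loaded as plain dictionaries with the elided fields left out. The helper collect_references is a hypothetical name for illustration and is not a Fides function.

```python
def collect_references(fields, prefix=""):
    """Yield (field_path, reference) pairs, descending into nested fields.

    Hypothetical helper, not Fides code. `fields` mirrors a YAML `fields`
    list as plain Python dicts.
    """
    for field in fields:
        path = f"{prefix}{field['name']}"
        meta = field.get("fides_meta") or {}
        for reference in meta.get("references", []):
            yield path, reference
        # Nested objects carry their own `fields` list.
        yield from collect_references(field.get("fields", []), prefix=f"{path}.")


customer_details_fields = [
    {
        "name": "comments",
        "fields": [
            {
                "name": "customer_id",
                "fides_meta": {
                    "references": [
                        {
                            "dataset": "postgres_example_test_dataset",
                            "field": "customer.id",
                            "direction": "from",
                        }
                    ]
                },
            }
        ],
    }
]

found = list(collect_references(customer_details_fields))
for path, reference in found:
    print(path, "->", reference["dataset"], reference["field"], reference["direction"])
```

The nested field is reported with its full path, comments.customer_id, alongside the reference it declares.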
Skipping collections
Skipping privacy request processing for specific data collections or API endpoints can be useful, for example when a collection causes a data processing error or is known not to contain personal data.
To skip a collection, set the skip_processing flag, as shown in this example:
dataset:
  - fides_key: postgres_example_dataset
    name: Postgres Example Dataset
    description: Example of a Postgres dataset containing a variety of related tables like customers, products, addresses, etc.
    collections:
      - name: address
        fides_meta:
          skip_processing: True
To skip an endpoint, set the skip_processing flag, as shown in this example:
saas_config:
  fides_key: saas_connector_example
  name: SaaS Example Config
  type: custom
  description: A sample schema representing a SaaS for Fides
  version: 0.0.1
  endpoints:
    - name: skipped_collection
      skip_processing: True
      requests:
        read:
          method: GET
          path: /v1/misc_endpoint/<list_id>
          param_values:
            - name: list_id
              references:
                - dataset: saas_connector_example
                  field: users.list_ids
                  direction: from
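Note that the flag sits in different places in the two examples: under fides_meta for a dataset collection, and directly on the endpoint for a SaaS config. The Python sketch below checks both locations when deciding what to process. The helper is_skipped and the unflagged customer and users entries are assumptions for illustration, and this is not how Fides itself evaluates the flag.

```python
def is_skipped(item):
    """Return True if a collection or endpoint is flagged to be skipped.

    Hypothetical helper, not Fides code. Dataset collections carry the flag
    under `fides_meta`; SaaS endpoints carry it at the top level.
    """
    meta = item.get("fides_meta") or {}
    return bool(item.get("skip_processing") or meta.get("skip_processing"))


collections = [
    {"name": "address", "fides_meta": {"skip_processing": True}},
    {"name": "customer"},  # assumed collection without the flag
]
endpoints = [
    {"name": "skipped_collection", "skip_processing": True},
    {"name": "users"},  # assumed endpoint without the flag
]

to_process = [item["name"] for item in collections + endpoints if not is_skipped(item)]
print(to_process)  # ['customer', 'users']
```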
To learn more about advanced configuration options and how Fides traverses databases, please see our guide for Query execution.