Dataset annotation reference
This page is a reference for every dataset annotation available in Fides. Dataset annotations are written in YAML and describe your data architecture so that Fides can process privacy requests against it.
Dataset properties
| Property | Type | Required | Description |
|---|---|---|---|
| fides_key | string | Yes | A unique identifier for the dataset |
| organization_fides_key | string | No | The Fides key of the organization that owns the dataset |
| name | string | No | A human-readable name for the dataset |
| description | string | No | A description of what the dataset represents |
| data_categories | sequence | No | Array of FidesLang data categories that apply to this dataset |
| collections | sequence | Yes | Array of collections/tables within the dataset |
| tags | sequence | No | Array of tags for the dataset |
| fides_meta | YAML collection | No | Additional metadata about the dataset |
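To make the table above concrete, here is a minimal sketch of a filled-in dataset. Every value is hypothetical (`example_app_db`, `default_organization`, the `users` collection, the `production` tag). The top-level `dataset:` key is assumed from the usual Fides manifest layout, and `user.contact.email` is assumed to exist in the data category taxonomy you use.

```yaml
# Hypothetical dataset; every value below is illustrative.
dataset:
  - fides_key: example_app_db
    organization_fides_key: default_organization
    name: Example application database
    description: Primary database for the example application
    tags:
      - production
    data_categories:
      - user.contact.email
    collections:
      # `collections` is required; a single table with one field is shown.
      - name: users
        fields:
          - name: email
            data_categories:
              - user.contact.email
```

Only `fides_key` and `collections` are marked as required, so the remaining keys can be dropped from a first draft and added later.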
- fides_key: {{fides_key}}
  organization_fides_key: {{organization_fides_key}}
  tags: {{tags}}
  name: {{dataset_name}}
  description: {{dataset_description}}
  data_categories:
    - {{data_category}}
    - {{data_category}}
  fides_meta: ...

Dataset fides_meta properties
| Property | Type | Required | Description |
|---|---|---|---|
| resource_id | string | No | The resource ID of the dataset |
| after | sequence | No | A list of collections that should be processed before this dataset |
| namespace | YAML collection | No | Namespace configuration for the dataset |
| namespace.dataset_id | string | No | Dataset identifier for the namespace |
| namespace.project_id | string | No | Project identifier for the namespace |
| namespace.connection_type | string | No | Integration type for the namespace (e.g. "bigquery") |
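As an illustration of the namespace properties above, the following sketch shows dataset-level metadata for a BigQuery-backed dataset. The resource ID, project, and dataset identifiers are made up for the example; only the `bigquery` connection type comes from the table. The optional `after` list is left out.

```yaml
# Hypothetical dataset-level metadata; all identifiers are illustrative.
fides_meta:
  resource_id: example_bigquery_resource
  namespace:
    project_id: example-gcp-project
    dataset_id: analytics
    connection_type: bigquery
```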
fides_meta:
  resource_id: {{resource_id}}
  after:
    - {{collection_id}}
  namespace:
    dataset_id: {{dataset_id}}
    project_id: {{project_id}}
    connection_type: {{connection_type}}

Collection properties
| Property | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Name of the collection/table |
| description | string | No | Description of the collection's purpose |
| data_categories | sequence | No | Array of FidesLang data categories that apply to this collection |
| fields | sequence | Yes | Array of fields within the collection |
| fides_meta | YAML collection | No | Additional metadata about the collection |
collections:
  - name: {{collection_name}}
    description: {{collection_description}}
    data_categories:
      - {{data_category}}
      - {{data_category}}
    fields: ...
    fides_meta: ...

Collection fides_meta properties
| Property | Type | Required | Description |
|---|---|---|---|
| skip_processing | boolean | No | If true, this collection will be skipped during privacy request processing |
| after | sequence | No | A list of collections that should be processed before this collection |
| erase_after | sequence | No | A list of collections that should process erasures before this collection |
| masking_strategy_override | YAML collection | No | Overrides the masking strategy used for this collection |
| masking_strategy_override.strategy | string | No | The masking strategy to use for the collection. Valid values are "delete" or "mask" |
| partitioning | YAML collection | No | The partitioning strategy to use for the collection |
| partitioning.where_clauses | sequence | No | A list of where clauses to use for the collection |
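The properties above might be combined as in the sketch below, for a hypothetical `order_items` collection. The `dataset.collection` form used for collection identifiers is an assumption, as are all names. The where clauses are placeholders too: this page does not specify the syntax Fides accepts, and it is likely to depend on the datastore.

```yaml
# Hypothetical collection metadata; dataset and collection names are illustrative.
fides_meta:
  skip_processing: false
  # Process `users` before this collection.
  after:
    - example_app_db.users
  # Run erasures on `orders` before erasing this collection.
  erase_after:
    - example_app_db.orders
  masking_strategy_override:
    strategy: delete
  partitioning:
    where_clauses:
      # Placeholder clauses; check the accepted syntax for your datastore.
      - "created_at >= '2024-01-01' AND created_at < '2024-07-01'"
      - "created_at >= '2024-07-01' AND created_at < '2025-01-01'"
```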
fides_meta:
  skip_processing: true|false
  after:
    - {{collection_id}}
    - {{collection_id}}
  erase_after:
    - {{collection_id}}
    - {{collection_id}}
  masking_strategy_override:
    strategy: mask|delete
  partitioning:
    where_clauses:
      - {{where_clause}}

Field properties
| Property | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Name of the field |
| description | string | No | Description of the field's contents |
| data_categories | sequence | No | FidesLang data categories that apply to this field |
| fides_meta | YAML collection | No | Additional metadata used by Fides for privacy operations |
| fields | sequence | No | For JSON datasets, nested data are represented as fields of fields |
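The nested `fields` property is easiest to see in an example. The sketch below describes a hypothetical document-style record with a flat `email` field and a nested `profile` object. Field names are illustrative, and the data categories are assumed to exist in the taxonomy you use.

```yaml
# Hypothetical fields for a document-style collection; names are illustrative.
fields:
  - name: email
    description: Login email address of the user
    data_categories:
      - user.contact.email
  - name: profile
    description: Nested object holding profile details
    fields:
      # Sub-fields are declared the same way as top-level fields.
      - name: full_name
        data_categories:
          - user.name
      - name: city
        data_categories:
          - user.contact.address.city
```

Here the data categories sit on the leaf fields rather than on the parent `profile` object. That placement is a choice made for this sketch, not a documented requirement.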
fields:
  - name: {{field_name}}
    description: {{field_description}}
    data_categories:
      - {{data_category}}
      - {{data_category}}
    fields: ...

Field references properties
| Property | Type | Required | Description |
|---|---|---|---|
| dataset | string | Yes | The name of the dataset that contains the referenced collection and field |
| field | string | Yes | The referenced field, written as {{collection_id}}.{{field_id}} |
| direction | string | No | The direction of the relationship (e.g. "to", "from") |
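As a sketch of how a reference could look in context, the example below shows a hypothetical `customer_id` field that points at the `id` field of a `customers` collection in the same dataset. The dataset, collection, and field names are invented for illustration. The reference is nested under the field's `fides_meta`, which is where the `references` property is listed.

```yaml
# Hypothetical reference: this field points at customers.id in example_app_db.
fields:
  - name: customer_id
    data_categories:
      - user.unique_id
    fides_meta:
      references:
        - dataset: example_app_db
          field: customers.id
          direction: to
```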
references:
  - dataset: {{dataset_id}}
    field: {{collection_id}}.{{field_id}}
    direction: {{direction}}

Field fides_meta properties
| Property | Type | Required | Description |
|---|---|---|---|
| identity | string | At least 1 per privacy request pipeline | Specify the field that should be used as the identity key |
| references | sequence | No | References to other fields for joins |
| primary_key | boolean | No | If true, indicates this field is a primary key |
| data_type | string | No | The data type of the field (e.g. "string", "integer") |
| length | integer | No | Maximum length for string/text fields |
| read_only | boolean | No | If true, field cannot be modified |
| return_all_elements | boolean | No | If true, field will return all elements in a collection |
| custom_request_field | string | No | The custom field in a privacy request used to associate with this field |
| automated_processing | boolean | No | If true, field is used in automated decision making |
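A short sketch of these properties in use: the hypothetical collection below marks `id` as its primary key and `email` as the identity key for privacy requests. The field names, the length of 255, and the data categories are illustrative assumptions.

```yaml
# Hypothetical fields showing common fides_meta settings.
fields:
  - name: id
    data_categories:
      - user.unique_id
    fides_meta:
      primary_key: true
      data_type: integer
  - name: email
    data_categories:
      - user.contact.email
    fides_meta:
      # Identity key used to locate this user's records.
      identity: email
      data_type: string
      length: 255
      read_only: false
```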
fides_meta:
  identity: {{field_name}}
  references: ...
  primary_key: true|false
  data_type: {{data_type}}
  length: {{length}}
  read_only: true|false
  return_all_elements: true|false
  custom_request_field: {{custom_request_field}}
  automated_processing: true|false