Dataset annotation reference
This page is a reference for the dataset annotations available in Fides. Dataset annotations are written in YAML and describe your data architecture so that Fides can process privacy requests against it.
Dataset properties
Property | Type | Required | Description |
---|---|---|---|
fides_key | string | Yes | A unique identifier for the dataset |
organization_fides_key | string | No | The Fides key of the organization that owns the dataset |
name | string | No | A human-readable name for the dataset |
description | string | No | A description of what the dataset represents |
data_categories | sequence | No | Array of FidesLang data categories that apply to this dataset |
collections | sequence | Yes | Array of collections/tables within the dataset |
tags | sequence | No | Array of tags for the dataset |
fides_meta | YAML collection | No | Additional metadata about the dataset |
- fides_key: {{fides_key}}
  organization_fides_key: {{organization_fides_key}}
  tags: {{tags}}
  name: {{dataset_name}}
  description: {{dataset_description}}
  data_categories:
    - {{data_category}}
    - {{data_category}}
  fides_meta: ...
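As an illustration, the template above might look like this once filled in. Every value here (the `example_app_db` key, the organization key, the tag, the names) is hypothetical; only the property names come from the table. A minimal `collections` entry is included because the table marks it as required.

```yaml
# Hypothetical values throughout; only the property names come from the table above.
- fides_key: example_app_db
  organization_fides_key: default_organization
  tags:
    - production
  name: Example application database
  description: Primary database for the example application
  data_categories:
    - user.contact.email
  collections:
    - name: users
      fields:
        - name: email
          data_categories:
            - user.contact.email
```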
Dataset fides_meta properties
Property | Type | Required | Description |
---|---|---|---|
resource_id | string | No | The resource ID of the dataset |
after | sequence | No | A list of collections that should be processed before this dataset |
namespace | YAML collection | No | Namespace configuration for the dataset |
namespace.dataset_id | string | No | Dataset identifier for the namespace |
namespace.project_id | string | No | Project identifier for the namespace |
namespace.connection_type | string | No | Integration type for the namespace (e.g. "bigquery") |
fides_meta:
  resource_id: {{resource_id}}
  after:
    - {{collection_id}}
  namespace:
    dataset_id: {{dataset_id}}
    project_id: {{project_id}}
    connection_type: {{connection_type}}
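A sketch of a namespace block for the "bigquery" connection type mentioned in the table. The project and dataset identifiers are made-up placeholders, not values Fides expects.

```yaml
# Hypothetical project and dataset IDs; "bigquery" is the connection type named in the table above.
fides_meta:
  namespace:
    project_id: example-gcp-project
    dataset_id: example_bq_dataset
    connection_type: bigquery
```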
Collection properties
Property | Type | Required | Description |
---|---|---|---|
name | string | Yes | Name of the collection/table |
description | string | No | Description of the collection's purpose |
data_categories | sequence | No | Array of FidesLang data categories that apply to this collection |
fields | sequence | Yes | Array of fields within the collection |
fides_meta | YAML collection | No | Additional metadata about the collection |
collections:
  - name: {{collection_name}}
    description: {{collection_description}}
    data_categories:
      - {{data_category}}
      - {{data_category}}
    fields: ...
    fides_meta: ...
Collection fides_meta properties
Property | Type | Required | Description |
---|---|---|---|
skip_processing | boolean | No | If true, this collection will be skipped during privacy request processing |
after | sequence | No | A list of collections that should be processed before this collection |
erase_after | sequence | No | A list of collections that should process erasures before this collection |
masking_strategy_override | YAML collection | No | Overrides the masking strategy for this collection |
masking_strategy_override.strategy | string | No | The masking strategy to apply. Valid values are "delete" or "mask" |
partitioning | YAML collection | No | The partitioning strategy to use for the collection |
partitioning.where_clauses | sequence | No | A list of WHERE clauses used to partition queries against the collection |
fides_meta:
  skip_processing: true|false
  after:
    - {{collection_id}}
    - {{collection_id}}
  erase_after:
    - {{collection_id}}
    - {{collection_id}}
  masking_strategy_override:
    strategy: mask|delete
  partitioning:
    where_clauses:
      - {{where_clause}}
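The ordering and masking options above could be combined as in the sketch below. The collection identifiers are hypothetical, and the `dataset.collection` form used for them is an assumption; check how your Fides version expects collections to be addressed.

```yaml
# Hypothetical collection identifiers; the dataset.collection form is an assumption.
fides_meta:
  skip_processing: false
  # Collections listed here are processed before this one.
  after:
    - example_app_db.orders
  # Collections listed here process their erasures before this one.
  erase_after:
    - example_app_db.payments
  # Use the "delete" strategy rather than "mask" for this collection.
  masking_strategy_override:
    strategy: delete
```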
Field properties
Property | Type | Required | Description |
---|---|---|---|
name | string | Yes | Name of the field |
description | string | No | Description of the field's contents |
data_categories | sequence | No | FidesLang data categories that apply to this field |
fides_meta | YAML collection | No | Additional metadata used by Fides for privacy operations |
fields | sequence | No | For JSON datasets, nested data are represented as fields of fields |
fields:
  - name: {{field_name}}
    description: {{field_description}}
    data_categories:
      - {{data_category}}
      - {{data_category}}
    fields: ...
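To show what "fields of fields" can look like, here is a sketch of a nested JSON object. The field names are hypothetical, and the data categories are applied to the leaf fields only.

```yaml
# Hypothetical field names; nested data is expressed by giving a field its own fields list.
fields:
  - name: contact
    description: Contact details stored as a nested JSON object
    fields:
      - name: email
        data_categories:
          - user.contact.email
      - name: phone
        data_categories:
          - user.contact.phone_number
```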
Field references properties
Property | Type | Required | Description |
---|---|---|---|
dataset | string | Yes | The dataset that contains the referenced collection and field |
field | string | Yes | The referenced field, written as collection.field |
direction | string | No | The direction of the relationship: "to" or "from" |
references:
  - dataset: {{dataset_id}}
    field: {{collection_id}}.{{field_id}}
    direction: {{direction}}
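A sketch of a reference in context, placed under a field's `fides_meta` (where the `references` property is listed). The dataset key, collection, and field names are hypothetical.

```yaml
# Hypothetical names: customer_id in this collection refers to customers.id in example_app_db.
fields:
  - name: customer_id
    fides_meta:
      references:
        - dataset: example_app_db
          field: customers.id
          direction: from
```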
Field fides_meta properties
Property | Type | Required | Description |
---|---|---|---|
identity | string | At least 1 per privacy request pipeline | Specify the field that should be used as the identity key |
references | sequence | No | References to other fields for joins |
primary_key | boolean | No | If true, indicates this field is a primary key |
data_type | string | No | The data type of the field (e.g. "string", "integer") |
length | integer | No | Maximum length for string/text fields |
read_only | boolean | No | If true, field cannot be modified |
return_all_elements | boolean | No | If true, field will return all elements in a collection |
custom_request_field | string | No | The custom field in a privacy request used to associate with this field |
automated_processing | boolean | No | If true, field is used in automated decision making |
fides_meta:
  identity: {{field_name}}
  references: ...
  primary_key: true|false
  data_type: {{data_type}}
  length: {{length}}
  read_only: true|false
  return_all_elements: true|false
  custom_request_field: {{custom_request_field}}
  automated_processing: true|false
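For illustration, two fields that use several of these properties together. The field names and values are hypothetical, and the example assumes `email` is the identity supplied with the privacy request.

```yaml
# Hypothetical fields; assumes "email" is the identity provided with the privacy request.
fields:
  - name: email
    data_categories:
      - user.contact.email
    fides_meta:
      identity: email
      data_type: string
      length: 255
  - name: id
    data_categories:
      - user.unique_id
    fides_meta:
      primary_key: true
      data_type: integer
      read_only: true
```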