Generating a Dataset
In Fides, a dataset is a YAML configuration file that describes a collection of data, such as a database. Fides uses these dataset YAML files as a map of your database to automatically process privacy requests. A dataset describes where categories of personal data (e.g., user contact info) can be found and how fields in tables or collections are related to each other, so that Fides can safely traverse the data when processing privacy requests.
Dataset YAML files can be written locally and then uploaded to Fides, or created directly in the UI-based Dataset Editor.
Before creating a dataset, we recommend preparing the annotations and information required to support privacy requests. Please review the following to prepare:
- Annotating datasets with data categories and key relationships
- Adding the information necessary to process privacy requests
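To make the idea of a dataset as a "map" more concrete, here is a minimal Python sketch of the kind of structure a dataset describes: collections, their fields, the data categories attached to each field, and references between fields. The key names (`fides_key`, `collections`, `data_categories`, `references`), the category labels, and the helper functions are illustrative assumptions rather than the authoritative Fides schema, so check the annotation guides above for the exact format.

```python
# Illustrative only: a dataset represented as a plain Python dict.
# Key names and category labels are assumptions, not the official schema.
dataset = {
    "fides_key": "example_app_db",
    "name": "Example application database",
    "collections": [
        {
            "name": "users",
            "fields": [
                {"name": "id", "data_categories": ["user.unique_id"]},
                {"name": "email", "data_categories": ["user.contact.email"]},
            ],
        },
        {
            "name": "orders",
            "fields": [
                {
                    "name": "user_id",
                    "data_categories": ["user.unique_id"],
                    # Hypothetical relationship: orders.user_id points at users.id
                    "references": [{"collection": "users", "field": "id"}],
                },
                {"name": "total", "data_categories": []},
            ],
        },
    ],
}


def fields_with_category(dataset, prefix):
    """Return (collection, field) pairs whose data categories start with prefix."""
    matches = []
    for collection in dataset["collections"]:
        for field in collection["fields"]:
            if any(cat.startswith(prefix) for cat in field["data_categories"]):
                matches.append((collection["name"], field["name"]))
    return matches


def related_fields(dataset):
    """Return ((collection, field), (target collection, target field)) pairs."""
    links = []
    for collection in dataset["collections"]:
        for field in collection["fields"]:
            for ref in field.get("references", []):
                source = (collection["name"], field["name"])
                target = (ref["collection"], ref["field"])
                links.append((source, target))
    return links


contact_fields = fields_with_category(dataset, "user.contact")
links = related_fields(dataset)
```

Here `contact_fields` lists where contact data lives, and `links` lists the relationships a traversal would follow from one collection to another. Both kinds of annotation are what the preparation steps above ask you to gather.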
Uploading a dataset
To upload a dataset configuration file that has been manually created:
- Navigate to Data map → Manage datasets.
- Click Create new dataset.
- Click Upload a new dataset YAML.
- Paste the dataset configuration in YAML into the editor.
- Click Create dataset.

Generating a dataset in the UI
Creating a dataset
To create a new dataset using the Dataset Editor:
- Navigate to Data map → Manage datasets.
- Click Create new dataset.
- Enter the dataset configuration into the editor.
- Click Create dataset.

Generating a dataset from a database
To generate a new dataset by connecting to a database:
- Navigate to Data map → Manage datasets.
- Click Create new dataset.
- Click Connect to a database.
- Paste the database connection string into the Database URL field.
- Click Generate dataset.
For help building the database connection string, please see the SQLAlchemy documentation.
PostgreSQL example:
postgresql://<user>:<password>@<hostname>:<port>/<database>
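If the username or password contains characters such as `@` or `/`, they need to be URL-encoded before being placed in the connection string. The sketch below assembles a PostgreSQL URL in the format shown above using only the Python standard library. The function name `build_postgres_url` and all sample values are hypothetical.

```python
from urllib.parse import quote_plus


def build_postgres_url(user, password, hostname, port, database):
    """Assemble a PostgreSQL connection string, URL-encoding the credentials."""
    # quote_plus escapes characters like '@' and '/' that would otherwise
    # be misread as URL delimiters.
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{hostname}:{port}/{database}"
    )


# Sample values are placeholders, not real credentials.
url = build_postgres_url("fides_reader", "p@ss/word", "db.example.com", 5432, "app")
print(url)
# prints: postgresql://fides_reader:p%40ss%2Fword@db.example.com:5432/app
```

The resulting string can be pasted into the Database URL field.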

For more details and examples, please see our guide for Generating resources.