Command: generate

The fides generate command connects to a specified database or service and automatically generates Fides resources as a YAML resource file, in the Fides format and style, based on the data source's schema. By default, the documents generated by this command are written to the .fides working directory.

Usage

Usage: fides generate <commands> [options] path/to/destination.yml

This command accepts several subcommands depending on the resource being generated and the data source type. For example:

Generating a dataset accepts either db (database) or gcp (Google Cloud Platform) as the type, followed by connection information and the destination to write the generated dataset.

fides generate dataset [db || gcp] {connection_information} .fides/destination.yaml

Generating a system accepts either aws (Amazon Web Services) or okta (Okta Sign-on) as the type, followed by connection information and the destination to write the generated system.

fides generate system [aws || okta] {connection_information} .fides/destination.yaml
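
The resource and source pairings described above can be checked before a script calls the CLI. The following is a minimal shell sketch, not part of Fides: the function name is_valid_generate_target is hypothetical, and it encodes only the pairs listed on this page.

```shell
# Hypothetical helper (not part of the fides CLI).
# Succeeds when the resource/source pair is one documented on this page.
is_valid_generate_target() {
  case "$1/$2" in
    dataset/db | dataset/gcp | system/aws | system/okta) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: guard a scripted call before building the full command.
if is_valid_generate_target dataset db; then
  echo "ok: fides generate dataset db"
fi
```

A case statement keeps the allowed pairs in one place, so a new source type would only need one extra pattern.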

Here is the full list of accepted commands, subcommands, and arguments:

  • dataset - Create a dataset resource from the connected data source, such as a database (e.g. SQL).
    • db - Connect to a database directly via a SQLAlchemy-style connection. Accepted options for this command:
      • --credentials-id - Use connection information already defined in the fides.toml config by specifying the key for the associated credentials in the config file.
      • --connection-string - Use a connection string to connect to a database.
      • --include-null - Include attributes in the dataset YAML that would otherwise be null in the schema.
    • gcp - Connect to a Google Cloud Platform data source.
      • bigquery - Connect to a BigQuery dataset directly via SQLAlchemy.
        • --credentials-id - Use connection information already defined in the fides.toml config by specifying the key for the associated credentials in the config file.
        • --keyfile-path - Path to load BigQuery credentials from a file on your local machine.
        • --include-null - Include attributes in the dataset YAML that would otherwise be null in the schema.
  • system - Create system resources from the connected source, such as scanning infrastructure (e.g. AWS) or sign-on systems (e.g. Okta).
    • aws - Connect to an AWS account and generate a system YAML file. Accepted options for this command:
      • --credentials-id - Use AWS connection information already defined in the fides.toml config by specifying the key for the associated credentials in the config file.
      • --access_key_id - Specify an access key ID to connect to AWS. Connecting to AWS requires the options: --access_key_id, --secret_access_key and --region.
      • --secret_access_key - Specify the secret access key for connecting to AWS. Connecting to AWS requires the options: --access_key_id, --secret_access_key and --region.
      • --region - Specify the region to connect to for scanning AWS. Connecting to AWS requires the options: --access_key_id, --secret_access_key and --region.
      • --include-null - Include attributes in the system YAML that would otherwise be null for this system.
    • okta - Connect to an Okta instance and generate a system YAML file.
      • --credentials-id - Use Okta connection information already defined in the fides.toml config by specifying the key for the associated credentials in the config file.
      • --org-url - Specify the organization's Okta URL to connect to Okta. Connecting to Okta requires the options: --org-url and --token.
      • --token - Specify the token to connect to Okta. Connecting to Okta requires the options: --org-url and --token.
      • --include-null - Include attributes in the system YAML that would otherwise be null for this system.

Running this command should result in output that resembles the following examples.


Examples: Generating Datasets

Datasets are typically the most valuable resources for developers working regularly with Fides. The generate dataset command allows you to quickly create a Fides-formatted dataset from a given database, making it easy to identify sensitive data in your systems. For more in-depth details on datasets, check out the Fides Datasets tutorial.

Example: Generate a Dataset from a database using a connection config file

This example uses connection details stored in the fides.toml to connect to a database and generate a dataset at the location .fides/database.yml.

In this example, pg-credentials is the key that represents the credentials in the configuration file. For more in-depth details on using configuration files, check out the Fides Configuration tutorial.

$ fides generate dataset db \
--credentials-id "pg-credentials" .fides/database.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/database.yml

Example: Generate a Dataset from a database using a connection string

This example connects to the database using a connection string and generates a dataset at the location .fides/database.yml.

$ fides generate dataset db \
--connection-string postgresql://username:password@localhost:5432/database .fides/database.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/database.yml
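
The connection string in the example above follows the pattern scheme://user:password@host:port/database. As a rough sketch, a script could assemble it from separate parts rather than hard-coding it. The function name build_pg_connection_string is hypothetical and not part of Fides, and the sketch does no URL-encoding, so values containing characters such as @ or / would need to be encoded first.

```shell
# Hypothetical helper (not part of the fides CLI).
# Assemble a PostgreSQL connection string from its parts.
# Arguments: user, password, host, port, database name.
# Note: no URL-encoding is applied to any of the values.
build_pg_connection_string() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Usage with the placeholder values from the example above.
build_pg_connection_string username password localhost 5432 database
```

The result could then be passed as --connection-string "$(build_pg_connection_string ...)", which keeps the individual values in variables instead of in the command text.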

Example: Generate a Dataset from BigQuery using a connection config file

This example uses connection details stored in the fides.toml to connect to a BigQuery data warehouse and generate a dataset at the location .fides/datawarehouse.yml.

In this example, bq-credentials is the key that represents the credentials in the configuration file. For more in-depth details on using configuration files, check out the Fides Configuration tutorial.

$ fides generate dataset gcp bigquery \
--credentials-id "bq-credentials" .fides/datawarehouse.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/datawarehouse.yml

Example: Generate a Dataset from BigQuery using a service account key file

This example uses BigQuery service account credentials to connect to a BigQuery data warehouse and generate a dataset at the location .fides/datawarehouse.yml. For more information on BigQuery service account key files, visit Google Cloud's Service Account guide.

$ fides generate dataset gcp bigquery \
--keyfile-path "path/to/service_account.json" .fides/datawarehouse.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/datawarehouse.yml
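
Because --keyfile-path points at a file on the local machine, a script may want to confirm the file is present before calling the CLI. The check below is a small sketch under that assumption. The function name keyfile_ready is hypothetical, and how Fides itself reports a missing key file is not covered here.

```shell
# Hypothetical pre-flight check (not part of the fides CLI).
# Succeeds only if the path is a regular file readable by the current user.
keyfile_ready() {
  [ -f "$1" ] && [ -r "$1" ]
}

# Usage: report the result for a placeholder path.
if keyfile_ready "path/to/service_account.json"; then
  echo "key file found"
else
  echo "key file missing or unreadable"
fi
```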

Example: Include null attributes when generating a dataset

This example uses the --include-null flag so that the generated dataset includes all null attributes.

$ fides generate dataset db \
--credentials-id "pg-credentials" \
--include-null .fides/database.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/database.yml
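
When the --include-null flag needs to be toggled from a script, the command can be built up one argument at a time. The following is a sketch only: the function build_dataset_db_cmd and its true/false third argument are hypothetical conventions, and it prints the command rather than running it.

```shell
# Hypothetical helper (not part of the fides CLI).
# Arguments: credentials key, destination path, optional "true" to add --include-null.
# Prints the resulting command line instead of executing it.
build_dataset_db_cmd() {
  credentials_id="$1"
  dest="$2"
  include_null="${3:-false}"

  # Reuse the positional parameters as the argument list being built.
  set -- fides generate dataset db --credentials-id "$credentials_id"
  if [ "$include_null" = "true" ]; then
    set -- "$@" --include-null
  fi
  set -- "$@" "$dest"

  printf '%s\n' "$*"
}

# Usage: same inputs as the example above.
build_dataset_db_cmd pg-credentials .fides/database.yml true
```

Printing the command first makes it easy to review what would run. Replacing the final printf with "$@" would execute it instead.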

Examples: Generating System YAML

System YAMLs identify and describe the various systems that data flows through in an organization. The generate system command enables you to rapidly create a system resource YAML file by scanning infrastructure and sign-on providers. For more in-depth details on systems, check out the Fides Systems tutorial.

Example: Generate Systems by scanning AWS using a connection config file

This example uses connection details stored in the fides.toml to connect to and scan an AWS account, and generate a manifest at the location .fides/systems.yml of all detected systems.

In this example, aws-credentials is the key that represents the credentials in the configuration file. To learn more about configuration files, check out the Fides Configuration tutorial.

$ fides generate system aws \
--credentials-id "aws-credentials" .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml

Example: Generate Systems by scanning AWS using credentials arguments

This example passes connection details as arguments to connect to an AWS account, scan for systems in AWS, and generate a system manifest at the location .fides/systems.yml.

$ fides generate system aws \
--access_key_id "AWS-ACCESS-KEY" \
--secret_access_key "AWS-SECRET-KEY" \
--region "AWS-REGION" .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml
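
Since --access_key_id, --secret_access_key and --region must all be supplied together, a wrapper can verify that before invoking the command. The sketch below assumes the values live in the environment variables conventionally used by the AWS CLI (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION). Whether Fides reads those variables on its own is not covered here, so they are passed explicitly. Both function names are hypothetical.

```shell
# Hypothetical pre-flight check (not part of the fides CLI).
# Succeeds only when all three values are non-empty.
have_aws_options() {
  [ -n "$1" ] && [ -n "$2" ] && [ -n "$3" ]
}

# Hypothetical wrapper: runs the documented command only if all three
# environment variables are set. Argument: destination path.
run_aws_generate() {
  dest="$1"
  if ! have_aws_options "${AWS_ACCESS_KEY_ID:-}" "${AWS_SECRET_ACCESS_KEY:-}" "${AWS_DEFAULT_REGION:-}"; then
    echo "missing AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY or AWS_DEFAULT_REGION" >&2
    return 1
  fi
  fides generate system aws \
    --access_key_id "$AWS_ACCESS_KEY_ID" \
    --secret_access_key "$AWS_SECRET_ACCESS_KEY" \
    --region "$AWS_DEFAULT_REGION" "$dest"
}

# Usage (not executed here): run_aws_generate .fides/systems.yml
```

Reading the secret from the environment keeps it out of the script file itself.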

Example: Generate Systems by scanning Okta using a connection config file

This example uses connection details stored in the fides.toml to connect to an Okta server, scan for systems in Okta, and generate a system manifest at the location .fides/systems.yml.

In this example, okta-credentials is the key that represents the credentials in the configuration file. To learn more about configuration files, check out the Fides Configuration tutorial.

$ fides generate system okta \
--credentials-id "okta-credentials" .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml

Example: Generate Systems by scanning Okta using credentials arguments

This example passes connection details as arguments to connect to an Okta server, scan for systems in Okta, and generate a system manifest at the location .fides/systems.yml.

$ fides generate system okta \
--org-url "OKTA-URL" \
--token "OKTA-TOKEN" .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml

Example: Include null attributes when generating a system

This example uses the --include-null flag so that the generated system includes all null attributes.

$ fides generate system aws \
--credentials-id "aws-credentials" \
--include-null .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml