Command: generate
The fides generate command connects to a specified database or service and automatically generates Fides resources in the Fides YAML format and style, based on the data source's schema. By default, the files generated by this command are written to the .fides working directory.
Usage
Usage: fides generate <commands> [options] path/to/destination.yml
This command accepts several subcommands depending on the resource being generated and the data source type. For example:
Generating a dataset accepts either db (database) or gcp (Google Cloud Platform) as the type, followed by connection information and the destination to write the generated dataset.
fides generate dataset [db || gcp] {connection_information} .fides/destination.yaml
Generating a system accepts either aws (Amazon Web Services) or okta (Okta Sign-on) as the type, followed by connection information and the destination to write the generated system.
fides generate system [aws || okta] {connection_information} .fides/destination.yaml
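The resource/type pairs described above can be checked before the command is invoked. The following is a minimal sketch for a POSIX shell; is_supported_source and checked_generate are hypothetical helper names introduced here for illustration and are not part of the Fides CLI.

```shell
# Succeed only for the resource/type pairs documented above.
# (Hypothetical helper, not part of the Fides CLI.)
is_supported_source() {
    case "$1/$2" in
        dataset/db|dataset/gcp|system/aws|system/okta) return 0 ;;
        *) return 1 ;;
    esac
}

# Validate the first two arguments, then pass everything through to fides.
# Defined here but not called, so loading this snippet has no side effects.
checked_generate() {
    if ! is_supported_source "$1" "$2"; then
        echo "unsupported combination: fides generate $1 $2" >&2
        return 1
    fi
    fides generate "$@"
}
```

With these definitions loaded, checked_generate system db .fides/systems.yml would fail immediately with an error message instead of reaching the CLI.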
Here is the full list of accepted commands, subcommands, and arguments:
dataset - Create a dataset resource from the connected data source, such as a database (e.g. SQL).
  db - Connect to a database directly via a SQLAlchemy-style connection. Accepted options for this command:
    --credentials-id - Use connection information already defined in the fides.toml config by specifying the key for the associated credentials in the config file.
    --connection-string - Use a connection string to connect to a database.
    --include-null - Include attributes in the dataset YAML that would otherwise be null in the schema.
  gcp - Connect to a Google Cloud Platform data source.
    bigquery - Connect to a BigQuery dataset directly via SQLAlchemy.
      --credentials-id - Use connection information already defined in the fides.toml config by specifying the key for the associated credentials in the config file.
      --keyfile-path - Path to load BigQuery credentials from a file on your local machine.
      --include-null - Include attributes in the dataset YAML that would otherwise be null in the schema.
system - Create system resources from the connected source, such as scanning infrastructure (e.g. AWS) or sign-on systems (e.g. Okta).
  aws - Connect to an AWS account and generate a system YAML file. Accepted options for this command:
    --credentials-id - Use AWS connection information already defined in the fides.toml config by specifying the key for the associated credentials in the config file.
    --access_key_id - Specify an access key ID to connect to AWS. Connecting to AWS requires the options --access_key_id, --secret_access_key and --region.
    --secret_access_key - Specify the secret access key for connecting to AWS. Connecting to AWS requires the options --access_key_id, --secret_access_key and --region.
    --region - Specify the region to connect to for scanning AWS. Connecting to AWS requires the options --access_key_id, --secret_access_key and --region.
    --include-null - Include attributes in the system YAML that would otherwise be null for this system.
  okta - Connect to an Okta instance and generate a system YAML file.
    --credentials-id - Use Okta connection information already defined in the fides.toml config by specifying the key for the associated credentials in the config file.
    --org-url - Specify the organization's Okta URL to connect to Okta. Connecting to Okta requires the options --org-url and --token.
    --token - Specify the token to connect to Okta. Connecting to Okta requires the options --org-url and --token.
    --include-null - Include attributes in the system YAML that would otherwise be null for this system.
Running this command should result in output that resembles the following examples.
Examples: Generating Datasets
Datasets are typically the most valuable resources for developers working regularly with Fides. The generate dataset command allows you to quickly create a Fides-formatted dataset from a given database, making it easy to identify sensitive data in your systems. For more in-depth details on datasets, check out the Fides Datasets tutorial.
Example: Generate a Dataset from a database using a connection config file
This example uses connection details stored in fides.toml to connect to a database and generate a dataset at the location .fides/database.yml.
In this example, pg-credentials is the key that represents the credentials in the configuration file. For more in-depth details on using configuration files, check out the Fides Configuration tutorial.
$ fides generate dataset db \
--credentials-id "pg-credentials" .fides/database.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/database.yml
Example: Generate a Dataset from a database using a connection string
This example passes a connection string on the command line to connect to the database and generates a dataset at the location .fides/database.yml.
$ fides generate dataset db \
--connection-string postgresql://username:password@localhost:5432/database .fides/database.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/database.yml
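Typing a password directly into the command leaves it in shell history, so it can help to assemble the connection string from environment variables instead. The sketch below assumes a POSIX shell and reuses the PGUSER, PGPASSWORD, PGHOST, PGPORT and PGDATABASE variable names familiar from PostgreSQL client tools; that choice, and the helper names pg_connection_string and generate_pg_dataset, are assumptions made for illustration. Fides itself is not claimed to read these variables.

```shell
# Build a SQLAlchemy-style PostgreSQL URL from environment variables.
# Host and port fall back to localhost:5432; the other values are required.
# (Hypothetical helper; the variable names are an assumption.)
pg_connection_string() {
    printf 'postgresql://%s:%s@%s:%s/%s\n' \
        "${PGUSER:?PGUSER is not set}" \
        "${PGPASSWORD:?PGPASSWORD is not set}" \
        "${PGHOST:-localhost}" \
        "${PGPORT:-5432}" \
        "${PGDATABASE:?PGDATABASE is not set}"
}

# Run the documented command with the assembled URL.
# $1 is the destination file, e.g. .fides/database.yml.
# Defined here but not called, so loading this snippet has no side effects.
generate_pg_dataset() {
    url=$(pg_connection_string) || return 1
    fides generate dataset db --connection-string "$url" "$1"
}
```

Calling generate_pg_dataset .fides/database.yml then runs the same command as the example above. Note that this sketch does not percent-encode special characters in the username or password, which URL-style connection strings generally require.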
Example: Generate a Dataset from BigQuery using a connection config file
This example uses connection details stored in fides.toml to connect to a BigQuery data warehouse and generate a dataset at the location .fides/datawarehouse.yml.
In this example, bq-credentials is the key that represents the credentials in the configuration file. For more in-depth details on using configuration files, check out the Fides Configuration tutorial.
$ fides generate dataset gcp bigquery \
--credentials-id "bq-credentials" .fides/datawarehouse.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/datawarehouse.yml
Example: Generate a Dataset from BigQuery using a service account key file
This example uses BigQuery service account credentials to connect to a BigQuery data warehouse and generate a dataset at the location .fides/datawarehouse.yml. For more information on BigQuery service account key files, visit Google Cloud's Service Account guide.
$ fides generate dataset gcp bigquery \
--keyfile-path "path/to/service_account.json" .fides/datawarehouse.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/datawarehouse.yml
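Because the key file is read from your local machine, a quick readability check before invoking the command can give a clearer error than a failed connection. This is a minimal sketch for a POSIX shell; check_keyfile and generate_bq_dataset are hypothetical helper names, not part of the Fides CLI.

```shell
# Fail with a message if the key file is missing or unreadable.
# (Hypothetical helper, not part of the Fides CLI.)
check_keyfile() {
    if [ ! -r "$1" ]; then
        echo "key file not readable: $1" >&2
        return 1
    fi
}

# $1 is the service account key file, $2 is the destination file.
# Defined here but not called, so loading this snippet has no side effects.
generate_bq_dataset() {
    check_keyfile "$1" || return 1
    fides generate dataset gcp bigquery --keyfile-path "$1" "$2"
}
```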
Example: Include null attributes when generating a dataset
This example uses the --include-null flag so that the generated dataset includes all null attributes.
$ fides generate dataset db \
--credentials-id "pg-credentials" \
--include-null .fides/database.yml
Loaded config from: .fides/fides.toml
Generated dataset manifest written to .fides/database.yml
Examples: Generating System YAML
System YAMLs identify and describe the various systems that data flows through in an organization. The generate system command enables you to rapidly create a system resource YAML file by scanning infrastructure and sign-on providers. For more in-depth details on systems, check out the Fides Systems tutorial.
Example: Generate Systems by scanning AWS using a connection config file
This example uses connection details stored in fides.toml to connect to and scan an AWS account, and generate a manifest of all detected systems at the location .fides/systems.yml.
In this example, aws-credentials is the key that represents the credentials in the configuration file. To learn more about configuration files, check out the Fides Configuration tutorial.
$ fides generate system aws \
--credentials-id "aws-credentials" .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml
Example: Generate Systems by scanning AWS using credentials arguments
This example passes connection details as arguments to connect to an AWS account, scan for systems in AWS, and generate a system manifest at the location .fides/systems.yml.
$ fides generate system aws \
--access_key_id "AWS-ACCESS-KEY" \
--secret_access_key "AWS-SECRET-KEY" \
--region "AWS-REGION" .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml
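Since --access_key_id, --secret_access_key and --region must all be supplied together, a wrapper can verify that each value is present before calling the CLI. The sketch below assumes a POSIX shell and that the values are kept in the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION environment variables; those variable names and the helpers require_vars and generate_aws_systems are assumptions made for illustration.

```shell
# Return non-zero, naming the first variable that is unset or empty.
# (Hypothetical helper, not part of the Fides CLI.)
require_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable called $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required variable: $name" >&2
            return 1
        fi
    done
}

# $1 is the destination file, e.g. .fides/systems.yml.
# Defined here but not called, so loading this snippet has no side effects.
generate_aws_systems() {
    require_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION || return 1
    fides generate system aws \
        --access_key_id "$AWS_ACCESS_KEY_ID" \
        --secret_access_key "$AWS_SECRET_ACCESS_KEY" \
        --region "$AWS_DEFAULT_REGION" "$1"
}
```

The same require_vars helper could guard the Okta options (--org-url and --token) in the same way.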
Example: Generate Systems by scanning Okta using a connection config file
This example uses connection details stored in fides.toml to connect to an Okta server, scan for systems in Okta, and generate a system manifest at the location .fides/systems.yml.
In this example, okta-credentials is the key that represents the credentials in the configuration file. To learn more about configuration files, check out the Fides Configuration tutorial.
$ fides generate system okta \
--credentials-id "okta-credentials" .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml
Example: Generate Systems by scanning Okta using credentials arguments
This example passes connection details as arguments to connect to an Okta server, scan for systems in Okta, and generate a system manifest at the location .fides/systems.yml.
$ fides generate system okta \
--org-url "OKTA-URL" \
--token "OKTA-TOKEN" .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml
Example: Include null attributes when generating a system
This example uses the --include-null flag so that the generated system includes all null attributes.
$ fides generate system aws \
--credentials-id "aws-credentials" \
--include-null .fides/systems.yml
Loaded config from: .fides/fides.toml
Generated system manifest written to .fides/systems.yml