
Getting Started With Fides — Step 2: Creating Privacy Policies As Code

Privacy-as-Code is a means of codifying privacy policies in the codebase. Using an example policy on data collection, here's how to start creating policies in Fides.



Following our recent blog post on annotating Datasets and Systems in Fides, we take the next step in building Privacy-as-Code. Here, we walk through codifying privacy policies so that automated compliance checks can evaluate them. With those checks in place, your team identifies and roots out noncompliant code before it’s ever shipped.

Anatomy of a Policy

In Fides, a policy is a collection of rules. Each rule can be thought of intuitively as: “For Specific Condition X, Perform Specific Action Y.” In this blog post, we’ll use the following policy example. Suppose that we want to build a proactive check into the CI pipeline to confirm that all shipped code complies with this policy: Users’ contact information cannot be collected for the purpose of marketing.

As we build this example policy on marketing-related collection of contact information, we’ll get familiar with the necessary components of a Fides policy. As discussed in the previous post, embedding these policy checks in CI offers substantial savings in time, money, labor, and risk compared with a reactive approach.

Naming and Describing a Policy

In Fides, rules are codified within a YAML file using a handful of straightforward components. First, a fides_key uniquely identifies the rule. In this case, we use reject_direct_marketing as the value for fides_key.

Next, we add a human-friendly name and description for the rule. We choose “Reject Direct Marketing” as the rule’s name. As for the description, we give a human-readable summary of the policy: “Disallow collecting any user contact info for marketing.”
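As a minimal sketch, the identifying fields described above might look like this in YAML. The values come straight from the text; how the rule nests inside a larger policy file is an assumption about the manifest layout.

```yaml
# Identifying fields for one rule (values taken from the text above).
- fides_key: reject_direct_marketing   # unique identifier for the rule
  name: Reject Direct Marketing        # human-friendly name
  description: Disallow collecting any user contact info for marketing.
```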

Descriptions of personal data processing in Fides use four basic attributes: data categories, data uses, data subjects, and data qualifiers.

Privacy Primitives

From here, we describe the four privacy primitives, which you might recall from the annotation process:

  • data_categories
  • data_uses
  • data_subjects
  • data_qualifier

We use terms from the Fides privacy taxonomy to add values for each primitive.

For data_categories, we describe the specific types of sensitive data. Looking back at the policy we aim to enforce in CI, its scope encompasses any contact information gathered from the user, so we add the taxonomy’s category for user-provided contact information.

For data_uses, we give a formal label to the categories of data processing in the organization. The use case under consideration for this policy is advertising, so we add it accordingly: advertising.

For data_subjects, we define the individual persons whose data the rule pertains to. In this policy, it is customers’ data that we are concerned with: customer is the appropriate value.

And for data_qualifier, we indicate the acceptable or non-acceptable level of de-identification for this data. The data in question, user-provided contact information, directly identifies an individual, so we add the following value: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified.
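Put together, the four primitives for this rule might be sketched as below. The category key user.provided.identifiable.contact is an assumption about the taxonomy label for user-provided contact information, and the qualifier shown is assumed to be the taxonomy's fully identified level, chosen because the data directly identifies an individual. Check both against the taxonomy version you run. Inclusion criteria, covered in the next section, are left out for now.

```yaml
# Sketch of the four privacy primitives for the rule.
data_categories:
  values:
    - user.provided.identifiable.contact  # assumed taxonomy key
data_uses:
  values:
    - advertising
data_subjects:
  values:
    - customer
# Assumed: the most identifiable level in the qualifier hierarchy.
data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
```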

Inclusion Criteria

Using Fides, we have the power to further refine the semantics for policy enforcement. Inclusion criteria are basic logic gates on what kinds of data categories, use cases, subjects, and qualifiers should be considered when running automated privacy checks in CI. In particular, the inclusion criteria are:

  • ANY
  • ALL
  • NONE

When specifying values for each of the four privacy primitives, we attach an inclusion criterion to indicate whether the rule applies to code with ANY, ALL, or NONE of the values entered.

Our example policy only provides one value for each privacy primitive, so the distinction between ANY and ALL might look trivial. However, let’s suppose for a moment that we wanted to create another rule that prevented the processing of any contact information or gender for marketing purposes. Then the choice between ANY and ALL has real consequences for permissible code in the automated CI check. While choosing ANY would catch instances of processing contact information and/or gender, ALL would only catch instances in which both contact information and gender are processed.
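To make the difference concrete, here is a hedged sketch of the data_categories block for that hypothetical contact-or-gender rule. Both category keys are assumptions about the taxonomy labels, and the inclusion field name follows the terminology used in this post.

```yaml
# Hypothetical rule fragment: two categories, matched with ANY.
data_categories:
  inclusion: ANY   # flags code that processes contact info, gender, or both
  values:
    - user.provided.identifiable.contact  # assumed taxonomy key
    - user.provided.identifiable.gender   # assumed taxonomy key
```

Changing inclusion to ALL would narrow the rule so that only code processing both categories together is flagged.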


We have now formalized, in detail, the kind of data that falls under the scope of this processing: user-provided contact information for the purposes of marketing. Next comes the action we want from the automated privacy review.

To begin, note that we have framed our policy negatively. That is, we have defined what we don’t want in our shipped code: collection of customers’ contact information for advertising purposes. If our codebase demonstrates that undesired behavior, the check should reject it, so we set the rule’s action to REJECT.

A Full-Fledged YAML Policy

Using our basic example, we have all of the pieces needed for our policy manifest.
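A sketch of how those pieces might assemble into one manifest follows. The policy-level fides_key, name, and description are hypothetical placeholders, and the category key and qualifier are assumptions about the taxonomy. Writing data_qualifier as a single value without an inclusion criterion is likewise an assumption about the manifest schema.

```yaml
policy:
  - fides_key: example_privacy_policy        # hypothetical policy identifier
    name: Example Privacy Policy             # hypothetical
    description: Example policy for this walkthrough.   # hypothetical
    rules:
      - fides_key: reject_direct_marketing
        name: Reject Direct Marketing
        description: Disallow collecting any user contact info for marketing.
        data_categories:
          inclusion: ANY
          values:
            - user.provided.identifiable.contact  # assumed taxonomy key
        data_uses:
          inclusion: ANY
          values:
            - advertising
        data_subjects:
          inclusion: ANY
          values:
            - customer
        # Assumed: the fully identified level of the qualifier hierarchy.
        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
        action: REJECT   # fail the check when code matches this rule
```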

Let’s look at one more policy. This one demonstrates multiple values for privacy primitives, so the choice of inclusion criteria (ANY versus ALL) is not a trivial one. Codifying this policy might look daunting, but it can be summarized in just two plain-language statements. First, the policy prohibits the use of identifiable data for any purpose other than providing the app’s basic functions.

Second, the policy prohibits any collection of sensitive data, for any purpose.
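As a rough sketch, those two statements could be codified as two rules in one policy. Everything below is illustrative: the policy and rule names are hypothetical, the taxonomy keys are assumptions, and which categories count as "sensitive" is a choice each organization makes for itself. The first rule lists the data uses other than provide, which is assumed to be the label for basic app functionality.

```yaml
policy:
  - fides_key: example_app_policy             # hypothetical
    name: Example App Privacy Policy          # hypothetical
    description: Illustrative two-rule policy.
    rules:
      # Rule 1: identifiable data may only be used to provide the app.
      - fides_key: minimize_identifiable_data  # hypothetical
        name: Minimize Identifiable Data
        description: Reject identifiable data for uses beyond basic app functions.
        data_categories:
          inclusion: ANY
          values:                              # assumed taxonomy keys
            - user.provided.identifiable
            - user.derived.identifiable
        data_uses:
          inclusion: ANY
          values:                              # assumed: uses other than provide
            - improve
            - personalize
            - advertising
            - third_party_sharing
        data_subjects:
          inclusion: ANY
          values:
            - customer
        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
        action: REJECT
      # Rule 2: sensitive data may not be collected for any purpose.
      - fides_key: reject_sensitive_data       # hypothetical
        name: Reject Sensitive Data
        description: Reject collecting sensitive user data for any use.
        data_categories:
          inclusion: ANY
          values:                              # assumed, partial list
            - user.provided.identifiable.biometric
            - user.provided.identifiable.health_and_medical
            - user.provided.identifiable.political_opinion
        data_uses:
          inclusion: ANY
          values:                              # assumed: covers provide as well
            - provide
            - improve
            - personalize
            - advertising
            - third_party_sharing
        data_subjects:
          inclusion: ANY
          values:
            - customer
        data_qualifier: aggregated.anonymized.unlinked_pseudonymized.pseudonymized.identified
        action: REJECT
```

Both rules use ANY, so a single matching category paired with a single matching use is enough to trigger a rejection.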

We’ll revisit this policy in the next blog post, where we will execute a policy evaluation in CI.


As with resource annotations in Fides, policies must be kept up-to-date with in-house privacy policies as well as relevant regulations that affect your company. By embedding Fides policy reviews into your team’s development processes, you maintain an accurate and powerful method of enforcing privacy compliance in the CI pipeline, before code ever handles PII out in the wild.

For the next and final installment in this three-part series, we dive into policy evaluation.
