SaaS Post-Processors

Post-processors are, in essence, data transformers. Given data from an endpoint, we can add specific processors to transform the data into a format we need for privacy requests.

Configuration

Post-processors are configured within the endpoints section of a saas_config:

endpoints:
  - name: messages
    requests:
      read:
        method: GET
        path: /conversations/<id>/messages
        param_values:
          ...
        postprocessors:
          - strategy: unwrap
            configuration:
              data_path: conversation_messages
          - strategy: filter
            configuration:
              field: from_email
              value:
                identity: email

Note: Post-processors run in the order they are defined in the config. In the example above, unwrap runs first, and its output is then passed to the filter strategy.
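
To illustrate the ordering rule, here is a minimal Python sketch of a post-processor pipeline. The functions unwrap, filter_rows, and run_postprocessors are hypothetical stand-ins written for this example, not the library's actual implementation, and the sample response is made up.

```python
from typing import Any, Dict, List


def unwrap(data: Any, configuration: Dict[str, Any]) -> Any:
    """Hypothetical unwrap: follow a dotted data_path into nested dicts."""
    for key in configuration["data_path"].split("."):
        data = data[key]
    return data


def filter_rows(data: List[Dict[str, Any]], configuration: Dict[str, Any],
                identity: Dict[str, str]) -> List[Dict[str, Any]]:
    """Hypothetical filter: keep rows whose field equals the configured value."""
    value = configuration["value"]
    if isinstance(value, dict) and "identity" in value:
        # Resolve an identity reference such as {"identity": "email"}.
        value = identity[value["identity"]]
    return [row for row in data if row.get(configuration["field"]) == value]


def run_postprocessors(data: Any, postprocessors: List[Dict[str, Any]],
                       identity: Dict[str, str]) -> Any:
    """Apply each post-processor in list order, feeding each output to the next."""
    for step in postprocessors:
        if step["strategy"] == "unwrap":
            data = unwrap(data, step["configuration"])
        elif step["strategy"] == "filter":
            data = filter_rows(data, step["configuration"], identity)
        else:
            raise ValueError(f"unsupported strategy: {step['strategy']}")
    return data


# Mirrors the YAML above: unwrap first, then filter.
postprocessors = [
    {"strategy": "unwrap", "configuration": {"data_path": "conversation_messages"}},
    {"strategy": "filter",
     "configuration": {"field": "from_email", "value": {"identity": "email"}}},
]

# Made-up endpoint response for illustration.
response = {
    "conversation_messages": [
        {"from_email": "somebody@email.com", "body": "hello"},
        {"from_email": "somebody-else@email.com", "body": "hi"},
    ]
}

result = run_postprocessors(response, postprocessors, {"email": "somebody@email.com"})
print(result)
```

In this sketch, reversing the two steps would raise an error, because the filter would receive the wrapping object rather than the list of messages.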

Format subsequent requests

Post-processors can format the results of your access requests for use in subsequent update or delete requests.

For example, suppose an access request returns the following:

{
  "recipient": "test@email.com",
  "subscriptions": [
    {
      "id": "123",
      "subscribed": "TRUE"
    }
  ]
}

If we need to perform an update request for each item within subscriptions where subscribed = TRUE, we'd need the following config for our update request:

update:
    ...
    data_path: subscriptionStatuses
    postprocessors:
      - strategy: filter
        configuration:
          field: subscribed
          value: TRUE

Supported strategies

unwrap: Gets the object at the given data path.
filter: Removes data that does not match a given field and value.

Filter

Filters an object or array by a given field name and value. The value can reference a dynamic identity passed in through the request, reference a field in a dataset, or be a hard-coded value.

Configuration details

strategy: filter

configuration:

field (str): The field on which to filter. For example, to filter where email_contact == "bob@mail.com", set field to email_contact.
value (str or dict): The value to match when filtering. Either a hard-coded string (e.g. bob@mail.com) or a dict containing one of:
- identity (str): Identity object from the privacy request (e.g. email or phone_number)
- dataset_reference (str): A dataset reference in the format dataset.collection.field (e.g. fides_instance.customer.id)
exact (optional bool, defaults to True): When True, value and the field value must be the same length (no extra characters). When False, value may be a substring of a longer field value.
case_sensitive (optional bool, defaults to True): When True, cases must match between value and the field value.
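
The matching rules described by exact and case_sensitive can be sketched as follows. This is an illustrative Python approximation, not the library's code: the function names matches and apply_filter are hypothetical, and it assumes string field values and that a non-exact match means "value is a substring of the field value".

```python
from typing import Any, Dict, List


def matches(field_value: str, value: str, exact: bool = True,
            case_sensitive: bool = True) -> bool:
    """Hypothetical comparison mirroring the exact and case_sensitive options."""
    if not case_sensitive:
        # Compare lowercased copies when case should be ignored.
        field_value, value = field_value.lower(), value.lower()
    if exact:
        return field_value == value
    # Non-exact: the value may appear anywhere inside the field value.
    return value in field_value


def apply_filter(rows: List[Dict[str, Any]], field: str, value: str,
                 exact: bool = True, case_sensitive: bool = True) -> List[Dict[str, Any]]:
    """Keep only rows whose string field matches the value."""
    return [
        row for row in rows
        if isinstance(row.get(field), str)
        and matches(row[field], value, exact, case_sensitive)
    ]


# Made-up rows for illustration.
rows = [
    {"id": 1, "email_contact": "[Bob] BOB@mail.com"},
    {"id": 2, "email_contact": "bob@mail.com"},
]

strict = apply_filter(rows, "email_contact", "bob@mail.com")
loose = apply_filter(rows, "email_contact", "bob@mail.com",
                     exact=False, case_sensitive=False)
print([row["id"] for row in strict])
print([row["id"] for row in loose])
```

With the defaults, only the row whose field equals the value survives; relaxing both options also keeps the row where the value is embedded in a longer, differently cased string.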

Examples

Post-Processor Config for identity value:

- strategy: filter
  configuration:
    field: email_contact
    value:
      identity: email

Identity data passed in through request:

{
  "email": "somebody@email.com"
}

Data to be processed:

[
    {
        "id": 1397429347,
        "email_contact": "somebody@email.com",
        "name": "Somebody Awesome"
    },
    {
        "id": 238475234,
        "email_contact": "somebody-else@email.com",
        "name": "Somebody Cool"
    }
]

Result:

[
    {
        "id": 1397429347,
        "email_contact": "somebody@email.com",
        "name": "Somebody Awesome"
    }
]

By default, this filter is exact and case-sensitive.

Post-Processor Config:

- strategy: filter
  configuration:
    field: email_contact
    value:
      identity: email
    exact: False
    case_sensitive: False

Identity data passed in through request:

{
  "email": "somebody@email.com"
}

Data to be processed:

[
    {
        "id": 1397429347,
        "email_contact": "[Somebody Awesome] SOMEBODY@email.com",
        "name": "Somebody Awesome"
    },
    {
        "id": 1397429348,
        "email_contact": "somebody@email.com",
        "name": "Somebody Awesome"
    }
]

Result:

[
    {
        "id": 1397429347,
        "email_contact": "[Somebody Awesome] SOMEBODY@email.com",
        "name": "Somebody Awesome"
    },
    {
        "id": 1397429348,
        "email_contact": "somebody@email.com",
        "name": "Somebody Awesome"
    }
]

We can configure how strict the filter is by setting exact and case_sensitive both to False. This allows our value to be a substring of a longer string, and to ignore case (upper vs lower case).

Post-Processor Config for dataset_reference value:

In this example, we have a dataset called fides and a collection called customer, populated from a previous request.

endpoints:
  - name: customer
    ...
  - name: orders
    ...
    requests:
      read:
        ...
        postprocessors:
          - strategy: filter
            configuration:
              field: id
              value:
                dataset_reference: fides.customer.id

Identity data passed in through request:

{
  "email": "somebody@email.com"
}

Data to be processed:

[
    {
        "id": 1397429347,
        "email_contact": "somebody@email.com",
        "name": "Somebody Awesome"
    },
    {
        "id": 238475234,
        "email_contact": "somebody-else@email.com",
        "name": "Somebody Cool"
    }
]

Result:

[
    {
        "id": 1397429347,
        "email_contact": "somebody@email.com",
        "name": "Somebody Awesome"
    }
]

dataset_reference allows references to string and integer fields. However, only references to string fields work with the exact and case_sensitive settings.

Note: Type casting is not supported at this time. Hard-coded values can currently only be strings, e.g. bob@mail.com and not 12344245.

Unwrap

Given a path to a dict/list, returns the dict/list at that location.

Configuration details

strategy: unwrap

configuration:

data_path (str): The path to the desired object. E.g. exact_matches.members will attempt to get the members object on the exact_matches object.

Example

Post-Processor Config:

- strategy: unwrap
  configuration:
    data_path: exact_matches.members

Data to be processed:

{
  "exact_matches": {
    "members": [
      { "howdy": 123 },
      { "meow": 841 }
    ]
  }
}

Result:

[
  { "howdy": 123 },
  { "meow": 841 }
]
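
A dotted data_path lookup like the one above can be sketched in a few lines of Python. The helper unwrap_path is hypothetical, and returning None for a missing key is an assumption made for this sketch; the actual strategy may behave differently.

```python
from typing import Any, Optional


def unwrap_path(data: Any, data_path: str) -> Optional[Any]:
    """Hypothetical helper: walk a dotted path through nested dicts."""
    current = data
    for key in data_path.split("."):
        if not isinstance(current, dict) or key not in current:
            # Assumption for this sketch: a missing path yields None.
            return None
        current = current[key]
    return current


# Same payload as the example above.
payload = {
    "exact_matches": {
        "members": [
            {"howdy": 123},
            {"meow": 841},
        ]
    }
}

members = unwrap_path(payload, "exact_matches.members")
print(members)
```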
