Datasets
A dataset is a collection of data store objects, such as tables or schemas, from one or more data stores, to which you want to govern access as a single unit.
For example, a set of tables in a Snowflake account that contain private customer information, such as name, address, and purchase history, can be represented in Satori as a "Customer Data" dataset.
Data engineers create datasets as part of the data development lifecycle. Once a dataset is defined, you can assign data stewards to manage the day-to-day operations of access to data.
When data consumers query data, Satori associates the query with the relevant datasets and applies the access rule permissions and policies that are defined on them.
Creating and Managing Datasets
To create a dataset, you need the Admin or Editor role, which is defined in the management console.
Dataset Stewards
To help you manage and maintain your dataset, you can also assign dataset stewards to the dataset to perform the day-to-day operations of access to data.
The dataset steward can create, approve, or deny user access rules, and can create security policies and masking profiles and assign them to the dataset. In addition, the dataset steward can edit the categories in the data inventory tab of the dataset.
NOTE: The dataset steward cannot give Satori control over access to the dataset or change the default security policy.
Dataset Access Approvers
In addition to the dataset steward, you can also assign dataset access approvers, who are tasked with approving or denying access requests to the dataset. Access approvers cannot view or edit the dataset in the management console.
Adding a Dataset
To create a dataset, perform the following steps:
- Go to the Datasets view and click the Add button.
- Provide a name and description for the dataset, and optionally assign dataset stewards and dataset access approvers.
- Select the data store locations to include in the dataset and, optionally, define the locations to exclude.
Checking Data Store Locations
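Satori uses a longest-match approach to determine whether a data store location is included in the dataset: when a location matches both an included and an excluded entry, the most specific (longest) matching entry takes precedence. Consider the following dataset example: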
Included Locations
Finance Snowflake Account / Forecast database / Q2 schema
Excluded Locations
Finance Snowflake Account / Forecast database / Q2 schema / Orders
When querying any table other than the "Orders" table in the Q2 schema, Satori associates the query with this dataset and applies any permissions or policies that are defined on it.
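The longest-match check described above can be sketched in a few lines of Python. This is an illustration only, not Satori's implementation: the function names, the representation of a location as a sequence of path segments, the "Invoices" table, and the choice to let exclusion win on a tie are all assumptions made for the example.

```python
from typing import Iterable, Sequence


def is_prefix(prefix: Sequence[str], path: Sequence[str]) -> bool:
    """Return True if `prefix` is a leading part of `path`."""
    return len(prefix) <= len(path) and list(path[:len(prefix)]) == list(prefix)


def longest_match(path: Sequence[str], entries: Iterable[Sequence[str]]) -> int:
    """Length of the longest entry that is a prefix of `path`, or -1 if none match."""
    return max((len(e) for e in entries if is_prefix(e, path)), default=-1)


def in_dataset(path, included, excluded) -> bool:
    """Hypothetical longest-match rule: the most specific matching entry wins.

    Assumption: if an included and an excluded entry match with the same
    length, the location is treated as excluded.
    """
    return longest_match(path, included) > longest_match(path, excluded)


# The example from the text, with each location split into its segments.
q2 = ("Finance Snowflake Account", "Forecast database", "Q2 schema")
included = [q2]
excluded = [q2 + ("Orders",)]

# "Invoices" is a made-up table name used only for illustration.
invoices_in = in_dataset(q2 + ("Invoices",), included, excluded)  # True
orders_in = in_dataset(q2 + ("Orders",), included, excluded)      # False
```

Here the "Orders" table matches both entries, but the excluded entry is longer, so the table falls outside the dataset, while any other table in the Q2 schema matches only the included entry.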
Dataset User Access Rules
Permissions to access datasets are defined for individual users or groups and are limited to a predefined time range. In addition, Satori can automatically revoke permissions if they are unused. This helps organizations avoid excess and unused permissions.
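As a rough mental model of time-limited permissions with automatic revocation, the following Python sketch checks whether a permission is still in effect. It is not Satori's data model or API: the `AccessRule` class, its field names, and the rule that the unused period is measured from the last use (or from the grant time if the permission was never used) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AccessRule:
    """Hypothetical record of a permission granted to a user or group."""
    identity: str                              # user or group name
    granted_at: datetime                       # when access was granted
    expires_at: datetime                       # end of the predefined time range
    revoke_if_unused_for: Optional[timedelta]  # None disables unused-access revocation
    last_used_at: Optional[datetime] = None    # last query that relied on this rule


def is_active(rule: AccessRule, now: datetime) -> bool:
    """Return True while the rule is inside its time range and not idle too long."""
    if now >= rule.expires_at:
        return False
    if rule.revoke_if_unused_for is not None:
        # Assumption: idle time counts from the last use, or from the grant
        # time if the permission has never been used.
        reference = rule.last_used_at or rule.granted_at
        if now - reference > rule.revoke_if_unused_for:
            return False
    return True


rule = AccessRule(
    identity="analysts",
    granted_at=datetime(2024, 1, 1),
    expires_at=datetime(2024, 4, 1),
    revoke_if_unused_for=timedelta(days=30),
)
active_early = is_active(rule, datetime(2024, 1, 15))   # within range, recently granted
active_idle = is_active(rule, datetime(2024, 2, 15))    # never used for more than 30 days
```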
Satori provides three main capabilities for controlling dataset access. These access controls can be used in parallel to streamline the process of managing access to data.
Dataset Permissions
Dataset access rule permissions enable data engineers and dataset stewards to grant access to datasets without requiring users to ask for access. Satori recommends that you use this method for providing access if you know which users or groups require access to a dataset and your organization's policy does not require an approval process.
NOTE: When users query data, Satori checks for the required permissions. If they are in place, Satori sends the query to the data store.
Data Inventory
Satori provides you with a rich out-of-the-box taxonomy. The dataset data inventory provides a holistic view of the sensitive data and access patterns. You can also extend the provided taxonomy by creating custom classifier categories and custom classifiers.
User Access History
Every permission change and access request is audited by Satori.
User Access Requests
Enable access requests to allow users that do not have the required permissions to request access. When users query data they receive an access request notification via their Data Portal.
User Access Requests via Slack
User access requests can also be made via Slack. Users with access to the Satori Slack App can make data access requests by using the Slack command /satori access.
User Access Requests
User access requests are sent via email to the dataset's stewards and appear in the management console.
Self-Service Access
Enable self-service access to allow users who lack the required permissions to grant themselves predefined permissions. When users query data, they receive a link to a form in which they specify why they need to access the dataset, so that their access is audited. Once they submit the form, they are granted the relevant permissions that were defined on the dataset.
This method is the recommended alternative to standard dataset permissions because it audits users' access to datasets.
Managing Technical Metadata
Using the Data Inventory view of a dataset, data engineers or dataset stewards can review the results of the automatic data classification and override, remove, or add any necessary tags. See the Data Inventory section for more details.
Implementing Custom Policies
The Custom Policy view of a dataset enables data engineers or dataset stewards to implement custom data access policies using the Policy Engine.
See the Policy Engine Overview section for more details.