What is SQL Server Data Masking?
Data masking is a data security technique that allows you to obfuscate sensitive data, so it is not useful if compromised by an attacker. The masked copy is used to replace real data for testing, demos, or other use cases that don’t require the data itself.
Read our complete guide to dynamic data masking (across all data stores).
Microsoft SQL Server built-in security features enable two types of data masking:
- Static data masking—creates high quality data for application development and testing, without revealing sensitive information. Realism is important for development and testing teams to more effectively identify defects early in the development cycle. SQL Server can create a sanitized copy of a database with all sensitive information altered.
- Dynamic data masking (DDM)—used to apply role-based (object level) security to databases/applications, masking data as it is retrieved by a user. This is mainly applicable to read only scenarios. SQL Server can configure DDM on specific database fields, hiding sensitive data in query results. This is easy to use with existing applications, as it does not require changes to queries.
In this article, you will learn:
- What is SQL Server Static Data Masking?
- What is SQL Server Dynamic Data Masking?
- SQL Server Dynamic Data Masking Types
- Static Data Masking vs. Dynamic Data Masking: Pros and Cons
- SQL Server Security with Satori
What is SQL Server Static Data Masking?
In SQL Server, the Static Data Masking feature lets you create clean copies of your databases, modified to obfuscate all sensitive information, allowing them to be shared with non-production users.
Static Data Masking can be used for development, testing, analysis and reports, technical support, and sharing databases with third parties or contractors. For organizations subject to GDPR, it can be used to eliminate personal information from databases, while preserving the database structure.
Static data masking lets you configure masking behavior for each column in the database. For each selected column, it replaces the original data with new, masked data, created according to your configuration.
Note: Static Data Masking is irreversible—you cannot retrieve the original data after it has been masked.
Another SQL Server feature that lets you selectively enable access to data tables is SQL Server Row Level Security (RLS) – learn more in our detailed guide
What is SQL Server Dynamic Data Masking?
Dynamic data masking (DDM) enables you to control which users view masked data and which roles can view sensitive data. You can use DDM to prevent unauthorized access to certain pieces of data by limiting the amount of sensitive data revealed.
The main goal of DDM is to limit the amount of exposure to sensitive data. Use it as an additional tool to support your overall security efforts. DDM is available in Azure SQL Database and SQL Server 2016 (13.x) and later.
DDM lets you hide the sensitive data pooled together when users run queries. You can hide data by configuring DDM on specific database fields. You can implement masking rules within query results. This process does not modify the masked data and can be applied on existing apps.
Typically, a DDM runs a main data masking policy directly on top of database fields containing sensitive information. You can define certain users or roles with access to the masked sensitive data. The majority of DDM processes support partial, full, and random masking, as well as simple Transact-SQL commands that let you define and manage masking jobs.
Related content: read our guide to SQL server roles
SQL Server Dynamic Data Masking Types
Here are four data masking types you can use on SQL Server:
- Default—this function lets you create a rule that masks the entire value when users with read-only privileges query the data. To do this, you need to specify a function name, as well as an empty set of parentheses. The function does not take arguments, and columns are masked according to the data type of the specified column.
- Email—this function lets you create a rule that masks the first letter and the domain. Domains that do not end in .com are masked as a .com domain. For example, yXXX@XXXX.com.
- Random—this function lets you create a rule that replaces all numerical values with a random value according to a specified range. This can confuse users into thinking they are seeing the right information.
- Partial—this function lets you partially mask data. However, it only works for string type columns. To do this, you need to go to the MASKED WITH clause of the function parameter and define “partial(start characters, mask, end characters” as the value.
Static Data Masking vs. Dynamic Data Masking: Pros and Cons
Let’s look at some of the comparative advantages and disadvantages of the two types of data masking in SQL Server, to help you choose the method that is most suitable for the task at hand.
Advantages of Static Data Masking:
- Permanent deletion—sensitive data is permanently deleted when the data transformation is applied to the database. If a statically protected database is compromised, there is no sensitive data for the attacker to steal.
- No effect on performance—all data transformations are applied in advance, so there is no performance impact per transaction.
- Protects production copies—provides the best protection for copies of production databases, enabling access via other applications and native queries.
- Simplifies security of data copies—there is no need to define granular object-level security policies, because all the sensitive data in the database has been masked.
Disadvantages of Static Data Masking:
- Batch processing—masking is applied to the data store via batch processing, not in real time, a process that can take between minutes and several hours.
- Permanent deletion—this is also a disadvantage, because original data is permanently lost after the masking operation, so this method cannot be used to protect the production database, only for production copies.
Advantages of Dynamic Data Masking:
- Prevents unauthorized access—makes it easy to prevent unauthorized disclosure of sensitive data. Administrators can allow privileged users or roles to access real data, and restrict access to other users, ensuring they only see masked data.
- Easy to configure—can be set up easily, and used with Transact-SQL commands, with no changes to application code.
- Full and partial masking—enables additional masking features, including full masking, partial masking, and randomized masking for numerical data.
- Central policy—makes it possible to set a central policy for data masking and apply it to all database users.
Disadvantages of Dynamic Data Masking:
- Not a complete solution—to achieve database security, you must use additional measures. There is no guarantee that all sensitive data will be correctly masked, and unauthorized users may eventually gain access to sensitive data.
- Does not encrypt the data—does not protect the underlying data, only masks it when responding to a query. To protect the underlying data, combine DDM with other SQL Server security features such as encryption, auditing, and row-level security.
- Limitations—you cannot define masking for the COLUMN_SET, FILESTREAM, or the Always Encrypted column. Masked columns cannot be used as keys for full-text indexes.
SQL Server Data Masking with Satori
Satori Universal Masking is using data classification in conjunction with dynamic masking. Users can define which transformations they wish to apply on any data type instead of columns. It works across SQL server and other data stores such as Azure Synapse, Snowflake and AWS Redshift.
For example, instead of defining a masking policy on each column in a SQL server table where an email address is stored, Satori users can define that once and for all by specifying how they want email addresses to be masked. This significantly reduces the burden of maintaining a masking solution.
Satori also provides a set of out-of-the-box profile templates that users can use to create their own masking profiles