Data flattening is associated with transforming data into plain text files stored in file systems rather than in databases or data warehouses. Flat files are records that generally consist of single portions of data, which can be, for example, tabular spreadsheet files. Other kinds of flat files can be header records containing names of variables and metadata in different files, delimited flat files such as CSV or tab-separated files, fixed-format files with consistent length, or flattened or denormalized data of various types. These denormalized files are then collected into a single set of records or a limited number of tables.
Data Flattening will be frequently applied to semi-structured data such as JSON files to convert them to clear and transform the data into tabular formats that can later be used natively by these databases to improve query performance. Most of these processes will cause some information to be lost. Therefore it is common to store these raw files in secure locations in case that the information needs to be processed and loaded again. The process of Data Flattening will also limit the analytic capability, as the original structure is lost.
The benefit of applying data flattening operations will depend on the amount of data stored in relational databases compared to the number of semi-structured files present in data repositories. Having a significant number of semi-structured files can require using databases that have native support for those semi-structured file formats. Depending on the context in which the organization or company operates, data flattening can be a valuable tool to improve query performance and to be able to use business definitions on the data structure, rather depend on the raw format of the files. These business definitions will help enable data analytics looking to create insightful analytics to have a more precise understanding of the contents of the data and understand it at a much higher level. Data engineers can also reverse data flattening operations, assuming that there is sufficient info about the structure.
Data Flattening can also refer to different ways to structure data in OLTP and OLAP data warehouses. In OLAP data warehousing and database design, flat data means that the data was initially scattered in several tables, and now it is structured in a single table. In OLTP datasets is might refer to databases that have just a single master data table with 1NF characteristics.
Cloud Data Security with Satori
Satori, The DataSecOps platform, gives companies the ability to enforce security policies from a single location, across all databases, data warehouses and data lakes. Such security policies can be data masking, data localization, row-level security and more.