A data warehouse is a central repository of information that users can use to run analyses that can be used in decision-making. Data is ingested into a data warehouse from several systems, including transactional systems, relational databases, devices, APIs, or any other sources, typically arriving on a regular cadence. Data consumers can access data stored in data warehouses through BI (business intelligence) tools, SQL clients, and other analytics applications. Generally, the users of data warehouses are business analysts, data engineers, data scientists, and decision-makers that use the data to power analytics reports, dashboards, and other analytics tools. Data warehouses are optimized systems that can serve data while executing queries on it to several users concurrently.
A data warehouse can be composed of one or multiple databases. Within each database, data is stored in a tabular format into individual tables. Tables are composed of rows and columns, out of which each column defines a variable. Each variable is defined as a data type, such as integer, data field, or string. A previously defined data model organizes tables into schemas, which can be regarded as folders. Data that is ingested into the data warehouse is stored in various tables described by the schema.
Data Warehouses: Main Use-Cases
- Improved decision-making through data insights
- Aggregate data from several sources
- Historical data analysis
- Implementation of data quality, consistency, and accuracy checks
- Separation of analytics processing from transactional databases
Data from several sources can be consolidated in data warehouses, acting as a central repository of historical data. This data can be used to create analytical views and reports that users can use for decision-making.
Four Main Characteristics of a Data Warehouse
- Domain oriented. They can store data on a particular topic or functional unit.
- Integrated. Data warehouses offer consistency across data types from several sources.
- Non-volatile: Data stored in a data warehouse is stable and does not change frequently.
- Time-dependent: They offer historical analytical views over time.
These characteristics tend to be more flexible with modern data warehouses, where data may change more commonly, and data warehouses cater to many teams in the organization.
Data warehouses offer high data throughput, enable users to slice and dice, and limit data selection for further exploration, independently of the granular level. It acts as the functional foundation for business intelligence environments, allowing users to create reports, dashboards, and other data interfaces. The data in a data warehouse is generally collected from several sources, such as transactional systems, application log files, and other applications. They offer an overarching and unique advantage to analyze large amounts of variant data and extract significant value from it, as well as maintain a historical record.