In the past, organizations and companies stored their data in on-premises databases and data warehouses. On-premises data warehousing is complicated because it requires in-house teams of IT professionals who understand the software components of these systems in depth and are also proficient with the hardware of both servers and databases.
Another driver of cloud data warehouse migration is elasticity: most organizations have clear usage peaks, and cloud computing lets capacity grow and shrink with demand. This elasticity eliminates many hidden costs and is a significant factor in cloud migration. Usage peaks affect e-commerce companies in particular, because users rush to their websites on specific dates and cause massive traffic for short periods. Companies that provisioned for those peaks were left with spare capacity the rest of the time. Eventually, they started to sell this extra capacity to other companies as on-demand services, giving rise to the modern cloud infrastructure providers.
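The cost effect of elasticity can be sketched with some simple arithmetic. The demand profile and the price below are invented for illustration, and the sketch assumes the same unit price for fixed and elastic capacity, which real providers do not guarantee.

```python
# Hypothetical hourly demand for one day, in compute units (illustrative only):
# 20 quiet hours followed by 4 peak hours.
hourly_demand = [10] * 20 + [100] * 4

# Assumed price per compute unit per hour, identical for both models.
PRICE_PER_UNIT_HOUR = 0.05


def fixed_capacity_cost(demand, price):
    """Cost when capacity is sized for the peak and paid for every hour."""
    return max(demand) * len(demand) * price


def elastic_cost(demand, price):
    """Cost when capacity follows demand hour by hour."""
    return sum(demand) * price


fixed = fixed_capacity_cost(hourly_demand, PRICE_PER_UNIT_HOUR)
elastic = elastic_cost(hourly_demand, PRICE_PER_UNIT_HOUR)

# Share of the peak-sized capacity that sits unused over the day.
idle_share = 1 - sum(hourly_demand) / (max(hourly_demand) * len(hourly_demand))

print(f"fixed: {fixed:.2f}, elastic: {elastic:.2f}, idle share: {idle_share:.0%}")
```

With this made-up profile, three quarters of the peak-sized capacity is idle over the day, which is the kind of hidden cost that paying only for usage removes.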
Today, cloud infrastructure providers offer Cloud Data Warehouses, which allow companies, organizations, and individuals to use databases over secure internet connections without buying physical on-premises servers or databases. These cloud data warehouses are generally databases optimized for analytics. They are scalable because there is no practical limit to storage, and the user typically pays for usage and storage (the latter is usually priced low to encourage data storage).
Cloud data warehouses have generally been relational databases that support standard flavors of SQL. Increasingly, however, they also support analytics on semi-structured and unstructured data.
In recent years, the development of Cloud Data Lakes has set the path for a paradigm shift from “database based” Cloud Data Warehouses to more blob-centric databases. This shift relies on queryable file and table formats such as Avro, Parquet, and Delta tables, together with massive amounts of unstructured data, to provide more flexibility as the volume and type of data constantly change.
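The blob-centric idea, where the files themselves are the table and structure is applied at read time, can be illustrated with a minimal sketch. It uses newline-delimited JSON in a temporary local directory as a stand-in for Avro or Parquet files in blob storage; the helper names write_blobs and scan are hypothetical, and real formats are binary (Parquet is also columnar) rather than plain text.

```python
import json
import tempfile
from pathlib import Path


def write_blobs(root, batches):
    """Write each batch of records as one newline-delimited JSON file."""
    for i, batch in enumerate(batches):
        path = Path(root) / f"events-{i:04d}.jsonl"
        path.write_text("\n".join(json.dumps(r) for r in batch), encoding="utf-8")


def scan(root, predicate):
    """Read every file under root and yield the records matching predicate."""
    for path in sorted(Path(root).glob("*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                if not line.strip():
                    continue
                record = json.loads(line)
                if predicate(record):
                    yield record


# A temporary directory plays the role of a blob container (assumption).
with tempfile.TemporaryDirectory() as lake:
    write_blobs(lake, [
        [{"user": "a", "amount": 12.5}, {"user": "b", "amount": 3.0}],
        # A later file carries a new field; earlier files are left untouched.
        [{"user": "a", "amount": 7.0, "coupon": "SPRING"}],
    ])
    total_for_a = sum(r["amount"] for r in scan(lake, lambda r: r["user"] == "a"))
    with_coupon = [r for r in scan(lake, lambda r: "coupon" in r)]

print(total_for_a, with_coupon)
```

New files can introduce new fields without any change to the files already stored, which is the flexibility the shift toward blob-centric storage aims for.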
This concept enabled new applications that make extensive use of these unstructured data types. Because blob storage is generally cheaper than relational databases, robust applications can now be developed at substantially lower cost than before.