A hybrid data repository that combines elements of a data lake and a data warehouse and supports multiple workloads, including business intelligence, data science, and self-service analytics. At a minimum, a data lakehouse supports SQL constructs and cloud-native object stores.
Added Perspectives
The emerging “data lakehouse” concept describes a hybrid data management environment that combines the characteristics of a data warehouse and a data lake.
- Kevin Petrie in Data Lakehouses Hold Water (thanks to the Cloud Data Lake)
June 8, 2020 (Blog)
A lakehouse is described by its advocates as a combination of a data lake and a data warehouse that implements warehouse-like data structures and data management functions on low-cost storage that is typically used for data lakes.
- Dave Wells in An Architect’s View of the Data Lakehouse: Perplexity and Perspective
June 8, 2020 (Blog)
The term “data lakehouse” is a metaphor, just like “data warehouse” and “data lake”. These metaphors communicate conceptually what the technology tries to do physically. As such, a data lakehouse is a data structure that tries to combine the characteristics of a data warehouse and a data lake. In that sense, almost every modern data management offering can be considered a data lakehouse.
- Wayne Eckerson in All Hail, the Data Lakehouse! (If Built on a Modern Data Warehouse)
June 8, 2020 (Blog)
Relevant Content
June 8, 2020 - Economic, elastic, and open cloud-based data lakes efficiently run business intelligence workloads in ways that smell a lot like data warehouses.