What is Open Data Lakehouse?
Cloudera supports a Data Lakehouse architecture by pre-integrating and unifying the capabilities of Cloudera Data Warehouse and Data Lakes to support data engineering, business intelligence, and machine learning on a single platform. Cloudera support for an open data lakehouse brings high-performance, self-service reporting and analytics to your business, simplifying data management for both data practitioners and administrators.
Open Data Lakehouse components
- Support for Apache Iceberg 1.3 access and processing in CDP Private Cloud Base 7.1.9 and higher versions
- Compute engine integration (Hive, Impala, Spark, Flink) for accessing and processing Iceberg datasets concurrently
- SDX integration with Iceberg catalog
- Iceberg table maintenance from Spark, and Iceberg table replication
- Iceberg catalog set to HiveCatalog, so that the Hive Metastore manages Iceberg tables
- Certified HDFS and Ozone storage
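
As an illustration of the HiveCatalog component above, the following sketch assembles the Spark properties that open source Apache Iceberg documents for backing Spark's session catalog with a Hive Metastore. The helper names (`iceberg_hive_catalog_conf`, `as_spark_submit_args`) and the Metastore URI are hypothetical, and a Cloudera cluster may already apply equivalent settings for you, so treat this as a sketch to check against your own configuration rather than a required step.

```python
# Minimal sketch: Spark properties for an Iceberg catalog of type "hive".
# The property keys and class names come from the Apache Iceberg Spark
# documentation; the helper functions themselves are hypothetical.

def iceberg_hive_catalog_conf(metastore_uri=None):
    """Return Spark settings that route spark_catalog through Iceberg's Hive catalog."""
    prefix = "spark.sql.catalog.spark_catalog"
    conf = {
        # Enables Iceberg SQL extensions such as CALL procedures.
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Wraps the built-in session catalog so Iceberg and non-Iceberg tables coexist.
        prefix: "org.apache.iceberg.spark.SparkSessionCatalog",
        # "hive" selects Iceberg's HiveCatalog implementation.
        prefix + ".type": "hive",
    }
    if metastore_uri:
        # Optional: an explicit Metastore endpoint (assumed value supplied by the caller).
        conf[prefix + ".uri"] = metastore_uri
    return conf


def as_spark_submit_args(conf):
    """Flatten a settings dict into --conf arguments for spark-submit or spark-shell."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Hypothetical Metastore address, for illustration only.
    settings = iceberg_hive_catalog_conf("thrift://metastore.example.com:9083")
    print(" ".join(as_spark_submit_args(settings)))
```

Keeping the settings in a plain dictionary makes them easy to reuse, either as command-line arguments or by passing each pair to a Spark session builder.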