Lakehouse Architecture Deep Dive: Iceberg, Delta Lake, and Hudi Compared
The data lakehouse is built on a deceptively simple idea: add a metadata and transaction layer on top of cheap object storage, and you get warehouse-grade reliability at lake-scale costs. The table format is that layer — and three formats dominate the market: Apache Iceberg, Delta Lake, and Apache Hudi.
All three solve the same core problem. The differences in how they solve it have real consequences for query performance, streaming support, tooling compatibility, and operational complexity. This post cuts through the marketing to give you a technical basis for choosing.
What a Table Format Actually Does
Without a table format, a collection of Parquet files in S3 is just files. There's no concept of a table, no transaction guarantees, no schema history, and no efficient way to find which files contain which data.
A table format adds:
- A metadata layer — a manifest of which files make up the current table state
- ACID transactions — atomic writes so readers never see partial updates
- Schema evolution — add, rename, or reorder columns without rewriting data
- Partition evolution — change how data is partitioned without full rewrites
- Time travel — query the table as it existed at any past snapshot
- File-level statistics — min/max values per file enable query engines to skip irrelevant files
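The file-skipping idea behind that last point can be shown in a few lines. This is a hypothetical sketch, not any format's real API: the `FileStats` record and `prune` function are invented for illustration, standing in for the min/max statistics that the metadata layer stores per data file.

```python
# Hypothetical sketch: how file-level min/max statistics let an engine
# skip files without reading them. FileStats is invented for illustration;
# real table formats store these stats in manifest/log metadata.
from dataclasses import dataclass

@dataclass
class FileStats:
    path: str
    min_ts: int  # min value of the "ts" column in this file
    max_ts: int  # max value of the "ts" column in this file

def prune(files, lo, hi):
    """Keep only files whose [min_ts, max_ts] range overlaps [lo, hi]."""
    return [f.path for f in files if f.max_ts >= lo and f.min_ts <= hi]

files = [
    FileStats("part-000.parquet", 100, 199),
    FileStats("part-001.parquet", 200, 299),
    FileStats("part-002.parquet", 300, 399),
]

# A query for ts BETWEEN 250 AND 320 touches only two of the three files.
print(prune(files, 250, 320))  # ['part-001.parquet', 'part-002.parquet']
```

The engine never opens `part-000.parquet` at all, which is where most of the scan savings come from on large tables.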
Apache Iceberg
Iceberg was developed at Netflix and donated to the Apache Software Foundation. It's the youngest of the three but has gained the most momentum in the last two years, particularly in the AI/ML space.
Architecture:
Catalog (Glue / Nessie / REST)
│
▼
Metadata file (JSON) ← current table state pointer
│
▼
Manifest list ← list of manifest files for this snapshot
│
▼
Manifest files ← list of data files + statistics per file
│
▼
Data files (Parquet/ORC/Avro)
The multi-level metadata hierarchy is Iceberg's key design choice. Manifest files contain file-level statistics (min/max per column), enabling aggressive pruning without a full table scan.
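To make the hierarchy concrete, here is a toy model of that resolution path using plain dicts. The file names and structure are simplified assumptions; real Iceberg metadata lives as JSON and Avro files in object storage, with the catalog holding only the pointer to the current metadata file.

```python
# Illustrative model of Iceberg's metadata hierarchy. Names are invented;
# real metadata is JSON/Avro in object storage, resolved via the catalog.
catalog = {"db.events": "metadata-00002.json"}  # catalog -> current metadata pointer

metadata_files = {
    "metadata-00002.json": {"current_snapshot": "snap-2", "manifest_list": "ml-2.avro"}
}
manifest_lists = {"ml-2.avro": ["manifest-a.avro", "manifest-b.avro"]}
manifests = {
    "manifest-a.avro": [{"path": "data-1.parquet"}, {"path": "data-2.parquet"}],
    "manifest-b.avro": [{"path": "data-3.parquet"}],
}

def resolve_data_files(table_name):
    """Walk catalog -> metadata file -> manifest list -> manifests -> data files."""
    meta = metadata_files[catalog[table_name]]
    paths = []
    for manifest in manifest_lists[meta["manifest_list"]]:
        paths += [entry["path"] for entry in manifests[manifest]]
    return paths

print(resolve_data_files("db.events"))  # ['data-1.parquet', 'data-2.parquet', 'data-3.parquet']
```

Note that a commit only has to swap the catalog's pointer to a new metadata file, which is what makes snapshots atomic and time travel cheap.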
Strengths:
- Partition evolution — change partition strategy (e.g., daily → hourly) without rewriting data. No other format matches this.
- Hidden partitioning — queries don't need to know about partitions; the engine handles it automatically
- Multi-engine support — Spark, Flink, Trino, Presto, DuckDB, StarRocks, Snowflake (external tables), BigQuery — widest engine compatibility
- Row-level deletes — efficient delete files rather than full partition rewrites (critical for GDPR compliance)
- Branching and tagging — WAP (Write-Audit-Publish) pattern for safe data quality workflows
Weaknesses:
- More complex catalog setup than Delta Lake
- Smaller default ecosystem around streaming upserts vs. Hudi
Best for: Multi-engine environments, AI/ML platforms, organizations prioritizing open standards and avoiding vendor lock-in.
Delta Lake
Delta Lake was developed by Databricks and open-sourced in 2019. It's the most widely adopted format in Spark-heavy environments and has the richest feature set within the Databricks platform.
Architecture:
_delta_log/ directory
│
├── 00000000000000000000.json ← transaction log entries (JSON)
├── 00000000000000000001.json
├── ...
└── 00000000000000000010.checkpoint.parquet ← periodic checkpoint
Delta stores all transaction history as JSON log entries in a _delta_log directory alongside the data files. Checkpoints compact the log periodically for performance.
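The log-replay mechanic can be sketched in miniature. This is a simplified model: real Delta log entries carry several action types plus statistics, but the core idea of replaying add/remove actions to recover the live file set is the same.

```python
# Minimal sketch of transaction-log replay: each JSON entry records an
# "add" or "remove" file action; replaying them in order yields the
# current set of live data files. Entry shape is simplified for
# illustration; real Delta actions carry more fields.
import json

log_entries = [
    '{"add": "f1.parquet"}',
    '{"add": "f2.parquet"}',
    '{"remove": "f1.parquet"}',
    '{"add": "f3.parquet"}',
]

def replay(entries):
    live = set()
    for raw in entries:
        action = json.loads(raw)
        if "add" in action:
            live.add(action["add"])
        if "remove" in action:
            live.discard(action["remove"])
    return sorted(live)

print(replay(log_entries))  # ['f2.parquet', 'f3.parquet']
```

A checkpoint is essentially the result of this replay materialized as Parquet, so readers can start from the checkpoint instead of replaying every JSON entry from the beginning.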
Strengths:
- Simplest setup — no separate catalog required; the _delta_log directory is self-contained
- DML operations — MERGE, UPDATE, DELETE are first-class, highly optimized in Spark
- Change Data Feed — built-in CDC stream of row-level changes, ideal for downstream consumers
- OPTIMIZE + ZORDER — compaction and multi-dimensional clustering in one command
- Liquid Clustering (new) — automatic, adaptive clustering replacing static partitioning
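The intuition behind ZORDER is worth a quick sketch: interleave the bits of two column values so rows that are close in both dimensions end up with nearby sort keys, which lets file-level min/max pruning work on multiple columns at once. This is a from-scratch illustration of the idea, not Delta's actual implementation.

```python
# Sketch of the bit-interleaving idea behind ZORDER. Interleaving the
# bits of two values produces a single sort key that preserves locality
# in both dimensions. Simplified to 16-bit unsigned ints for illustration.
def z_order_key(x: int, y: int, bits: int = 16) -> int:
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits land at even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits land at odd positions
    return key

# Rows near each other in (x, y) interleave to nearby keys, so sorting
# by z_order_key clusters them into the same files.
rows = [(3, 5), (3, 6), (200, 9000)]
print(sorted(rows, key=lambda r: z_order_key(*r)))
```

Sorting data files by this key before writing is what makes min/max statistics selective on both columns rather than just the leading sort column.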
Weaknesses:
- Tightest coupling to Spark/Databricks — other engines have lagged in compatibility
- Partition evolution is less flexible than Iceberg
- _delta_log can become a bottleneck on very high-frequency write tables
Best for: Databricks-centric stacks, teams with heavy Spark usage, workloads requiring frequent DML operations.
Apache Hudi
Hudi (Hadoop Upserts Deletes and Incrementals) was developed at Uber and has the longest history of the three. It was designed specifically for streaming upserts — a use case the other formats have since added support for, but where Hudi's implementation remains the most mature.
Architecture:
Hudi uses two table types with different optimization profiles:
- Copy-on-Write (CoW) — rewrites any data file containing an updated record. Faster reads, slower writes.
- Merge-on-Read (MoR) — writes updates to delta log files, merges on read. Faster writes, slightly slower reads.
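The tradeoff between the two table types can be modeled in a few lines. The classes below are a toy sketch under obvious simplifications (a dict standing in for a base file, a list for the delta log), not Hudi's storage layout.

```python
# Toy model of Hudi's two table types. CoW pays the merge cost at write
# time by rewriting the base file; MoR appends to a delta log and pays
# the merge cost at read time. Structures are illustrative only.
class CopyOnWriteTable:
    def __init__(self, rows):
        self.base = dict(rows)          # key -> value, the "base file"

    def upsert(self, key, value):
        rewritten = dict(self.base)     # rewrite: copy every existing row
        rewritten[key] = value
        self.base = rewritten           # expensive write, cheap read

    def read(self):
        return dict(self.base)

class MergeOnReadTable:
    def __init__(self, rows):
        self.base = dict(rows)
        self.delta_log = []             # cheap append-only writes

    def upsert(self, key, value):
        self.delta_log.append((key, value))

    def read(self):                     # merge cost is paid by the reader
        merged = dict(self.base)
        for key, value in self.delta_log:
            merged[key] = value
        return merged

cow = CopyOnWriteTable({"a": 1}); cow.upsert("a", 2)
mor = MergeOnReadTable({"a": 1}); mor.upsert("a", 2)
print(cow.read(), mor.read())  # {'a': 2} {'a': 2}
```

Both tables return identical results; the difference is purely where the merge work happens, which is why MoR suits write-heavy CDC ingestion and CoW suits read-heavy analytics.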
Strengths:
- Streaming upserts — the most mature implementation for high-frequency, key-based upserts
- Incremental queries — query only records that changed since a given commit, natively
- MoR table type — write performance advantage for CDC and streaming use cases
- Record-level index — Bloom filter and HBase-backed indexes for fast key lookups
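The incremental-query idea from the list above can be sketched as follows. The commit structure here is invented for illustration; Hudi exposes this natively through its incremental query mode, keyed on commit timestamps.

```python
# Sketch of an incremental query: given commits tagged with monotonically
# increasing commit times, return only records written after a given
# commit. Commit layout is a simplifying assumption, not Hudi's format.
commits = [
    {"commit_time": "20250101000000", "records": [{"id": 1, "v": "a"}]},
    {"commit_time": "20250102000000", "records": [{"id": 2, "v": "b"}]},
    {"commit_time": "20250103000000", "records": [{"id": 1, "v": "a2"}]},
]

def incremental_read(commits, since):
    """Return records from commits strictly after `since`."""
    return [r for c in commits if c["commit_time"] > since for r in c["records"]]

changed = incremental_read(commits, "20250101000000")
print(changed)  # [{'id': 2, 'v': 'b'}, {'id': 1, 'v': 'a2'}]
```

Downstream jobs can checkpoint the last commit time they processed and pull only the delta on each run, instead of rescanning the full table.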
Weaknesses:
- Steeper learning curve — two table types, multiple index options, more configuration surface
- Multi-engine support has improved but still behind Iceberg
- Community and tooling ecosystem smaller than the other two
Best for: Real-time data ingestion from CDC streams, use cases with high-frequency keyed upserts, Uber/LinkedIn-style event pipelines.
Side-by-Side Comparison
| Feature | Iceberg | Delta Lake | Hudi |
|---|---|---|---|
| ACID transactions | ✓ | ✓ | ✓ |
| Time travel | ✓ | ✓ | ✓ |
| Schema evolution | ✓ | ✓ | ✓ |
| Partition evolution | ✓ (best) | Limited | Limited |
| Streaming upserts | Good | Good | Best |
| Multi-engine support | Best | Good | Improving |
| DML (MERGE/UPDATE) | ✓ | ✓ (best) | ✓ |
| Row-level deletes | ✓ | ✓ | ✓ |
| Catalog required | Yes | No | No |
| Databricks native | Via OSS | ✓ | Via OSS |
| AI/ML ecosystem | Best | Good | Good |
Decision Guide
Choose Iceberg if:
- You need to query the same data from multiple engines (Spark + Trino + Flink + DuckDB)
- AI/ML workloads are a priority — broadest Python and notebook ecosystem support
- You want to avoid any single vendor dependency
- Partition evolution matters for your use case
Choose Delta Lake if:
- Your stack is primarily Databricks or Spark
- You need the most mature MERGE/UPDATE/DELETE operations
- Change Data Feed for downstream streaming consumers is a key requirement
- You want the simplest possible setup (no catalog required)
Choose Hudi if:
- Your primary use case is high-frequency streaming upserts from a CDC source
- You need MoR tables for write-heavy workloads with moderate read SLAs
- You're already running a Hudi deployment and it's working well
Production Recommendation
For greenfield deployments targeting AI readiness in 2025, Iceberg is the default choice. The multi-engine support and rapidly growing ecosystem — including native Snowflake and BigQuery external table support — mean you're not betting on a single platform. The Nessie catalog gives you Git-like branching semantics that pair naturally with dbt's testing-before-promotion workflow.
Book a strategy session to design your lakehouse architecture.