When you create a Hudi dataset, you must specify one of two supported storage types: copy on write or merge on read.
- Copy on Write (CoW) – Data is stored in a columnar format (Parquet), and each update creates a new version of files during a write. CoW is the default storage type.
- Merge on Read (MoR) – Data is stored using a combination of columnar (Parquet) and row-based (Avro) formats. Updates are logged to row-based delta files and are compacted as needed to create new versions of the columnar files.
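
As a rough illustration of how the storage type might be chosen at write time, the sketch below builds the options for a Hudi write through the Spark datasource. The helper `hudi_write_options` and the table and field names (`trips`, `trip_id`, `updated_at`) are hypothetical. The option keys are the ones commonly documented for the Hudi Spark datasource, but they should be checked against the Hudi release in use, since older releases used a different key for the storage type.

```python
# Values accepted by the Hudi table type option.
COPY_ON_WRITE = "COPY_ON_WRITE"
MERGE_ON_READ = "MERGE_ON_READ"


def hudi_write_options(table_name, record_key, precombine_field,
                       table_type=COPY_ON_WRITE):
    """Build Hudi write options (hypothetical helper, not part of Hudi).

    Defaults to copy on write, mirroring the default storage type.
    """
    if table_type not in (COPY_ON_WRITE, MERGE_ON_READ):
        raise ValueError(f"unsupported table type: {table_type}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.table.type": table_type,
    }


# Illustrative names only; substitute your own table and columns.
cow_options = hudi_write_options("trips", "trip_id", "updated_at")
mor_options = hudi_write_options("trips", "trip_id", "updated_at", MERGE_ON_READ)

# Assuming a Spark session with the Hudi bundle on its classpath, a
# DataFrame `df`, and a target path `base_path`, the write would look like:
# df.write.format("hudi").options(**mor_options).mode("append").save(base_path)
```

The storage type is normally fixed when the dataset is first written, so it is worth deciding between CoW and MoR before the initial load.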