将数据写入到存储桶后 , 如下是 Hudi 目标表的数据格式 。
+-------------------+---------------------+-------------------------------------------------------------------------+----------------------+--------------------------------------------------------------------------+---------+--------------+---------------+---------------+-------------------+-------------------+-------------------+--------+|_hoodie_commit_time|_hoodie_commit_seqno |_hoodie_record_key |_hoodie_partition_path|_hoodie_file_name |seller_id|prod_category |product_name |product_package|discount_percentage|eff_start_ts |eff_end_ts |actv_ind|+-------------------+---------------------+-------------------------------------------------------------------------+----------------------+--------------------------------------------------------------------------+---------+--------------+---------------+---------------+-------------------+-------------------+-------------------+--------+|20220722113258101 |20220722113258101_0_0|seller_id:3412,prod_category:Healthcare,eff_end_ts:253402300799000000 |actv_ind=1 |a94c9c58-ac6b-4841-a734-8ef1580e2547-0_0-29-1219_20220722113258101.parquet|3412 |Healthcare |Dolo 650 |10 |10 |2022-04-01 16:30:45|9999-12-31 23:59:59|1 ||20220722113258101 |20220722113258101_0_1|seller_id:1234,prod_category:Home Essential,eff_end_ts:253402300799000000|actv_ind=1 |a94c9c58-ac6b-4841-a734-8ef1580e2547-0_0-29-1219_20220722113258101.parquet|1234 |Home Essential|Hand Towel |12 |20 |2021-10-20 06:55:22|9999-12-31 23:59:59|1 ||20220722113258101 |20220722113258101_0_2|seller_id:4565,prod_category:Gourmet,eff_end_ts:253402300799000000 |actv_ind=1 |a94c9c58-ac6b-4841-a734-8ef1580e2547-0_0-29-1219_20220722113258101.parquet|4565 |Gourmet |Dairy Milk Silk|6 |30 |2021-06-12 20:30:40|9999-12-31 23:59:59|1 ||20220722113258101 |20220722113258101_0_3|seller_id:1234,prod_category:Detergent,eff_end_ts:253402300799000000 |actv_ind=1 |a94c9c58-ac6b-4841-a734-8ef1580e2547-0_0-29-1219_20220722113258101.parquet|1234 |Detergent |Tide 2L |6 |15 |2021-12-15 15:20:30|9999-12-31 23:59:59|1 |+-------------------+---------------------+-------------------------------------------------------------------------+----------------------+--------------------------------------------------------------------------+---------+--------------+---------------+---------------+-------------------+-------------------+-------------------+--------+2.假设我们的增量数据存储在下表中(非Hudi格式,可以是Hive) 。
+---------+-------------+-----------------+---------------+-------------------+-------------------+|seller_id|prod_category|product_name |product_package|discount_percentage|eff_start_ts |+---------+-------------+-----------------+---------------+-------------------+-------------------+|1234 |Detergent |Tide 5L |6 |25 |2022-01-31 10:00:30||4565 |Gourmet |Dairy Milk Almond|12 |45 |2022-06-12 20:30:40||3345 |Stationary |Sticky Notes |4 |12 |2022-07-09 21:30:45|+---------+-------------+-----------------+---------------+-------------------+-------------------+
推荐阅读
- iptables使用详解
- 华为车载智慧屏值得买吗_华为车载智慧屏使用评测
- Pytest进阶使用
- 如何使用 pyqt 读取串口传输的图像
- 数据科学学习手札144 使用管道操作符高效书写Python代码
- 你们觉得华为手机卡不卡,使用体验如何(华为加装nm卡缺点)
- 古墓丽影10怎么打飞机(古墓丽影10怎么使用榴弹)
- JavaFx 使用字体图标记录
- uni-app 如何优雅的使用权限认证并对本地文件上下起手
- 手把手教你玩转 Gitea|使用 Helm 在 K3s 上安装 Gitea