Parallelizing Skewed HBase Regions using Splittable DoFn

Jun-14, 2023, 14:00-14:25 (America/Los_Angeles) in Palisades

During HBase to Cloud Bigtable migrations, HBase snapshots are imported into Cloud Bigtable. Each snapshot contains several HBase regions, and some regions can be very large because of skewed data.

In this presentation, using code snippets and benchmark test results, we show how to parallelize the processing of skewed HBase regions with a Splittable DoFn and reduce pipeline runtime.
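To make the idea concrete, here is a minimal sketch of the range splitting that a Splittable DoFn relies on. It is not the speakers' code and does not use the Beam API. `KeyRange`, `split_range`, and `plan_work` are hypothetical names, and describing a region as a range of row indices is a simplifying assumption (real HBase regions are bounded by byte-array row keys). In a Beam pipeline, logic of this kind would typically sit in the restriction's split step.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class KeyRange:
    """Half-open range [start, stop) over a region's rows (hypothetical model)."""
    start: int
    stop: int

    def size(self) -> int:
        return self.stop - self.start


def split_range(region: KeyRange, max_rows: int) -> Iterator[KeyRange]:
    """Yield consecutive sub-ranges holding at most max_rows rows each."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    start = region.start
    while start < region.stop:
        stop = min(start + max_rows, region.stop)
        yield KeyRange(start, stop)
        start = stop


def plan_work(regions: List[KeyRange], max_rows: int) -> List[KeyRange]:
    """Flatten all regions into work units of bounded size."""
    return [part for region in regions for part in split_range(region, max_rows)]
```

For example, `plan_work([KeyRange(0, 100), KeyRange(0, 1_000_000)], 250_000)` leaves the small region as a single unit and cuts the skewed region into four units, so that no single worker has to read the whole large region.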
