Speaker(s):

Data Quality in ML Pipelines

Jul-8 09:10-10:05 in Palisades

This session demonstrates two approaches to integrating data quality into ML pipelines: a schema-based approach and a UDF-based approach, in which Apache Beam performs the data-quality-based filtering. Time permitting, it also shows how to add data-quality-related features to the dataset using a PreTransform component that takes a UDF.
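
As a rough illustration of the two filtering approaches, the sketch below applies a schema check and a user-defined check as Apache Beam `Filter` steps, then attaches a quality-related feature with a `Map` step. It is not the session's own code: the schema, field names, thresholds, and every function name here are hypothetical, and the final step only stands in for the idea of a UDF-driven feature transform rather than reproducing the PreTransform component mentioned above.

```python
from typing import Any, Callable, Dict, Iterable

Record = Dict[str, Any]

# Hypothetical schema (an assumption for this sketch): field name -> expected type.
SCHEMA = {"user_id": int, "age": int, "country": str}


def conforms_to_schema(record: Record) -> bool:
    """Schema-based check: every schema field is present with the expected type."""
    return all(
        name in record and isinstance(record[name], expected)
        for name, expected in SCHEMA.items()
    )


def age_is_plausible(record: Record) -> bool:
    """UDF-based check: an arbitrary user-supplied rule (the bounds are made up)."""
    return 0 <= record["age"] <= 120


def add_quality_feature(record: Record) -> Record:
    """Attach a data-quality-related feature instead of dropping the record."""
    return {**record, "country_missing": record["country"] == ""}


def run(records: Iterable[Record],
        udf: Callable[[Record], bool] = age_is_plausible) -> None:
    """Filter records with Beam, then print the surviving, enriched records."""
    # Imported here so the plain-Python checks above can be used without Beam.
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "Read" >> beam.Create(list(records))
            | "SchemaFilter" >> beam.Filter(conforms_to_schema)
            | "UdfFilter" >> beam.Filter(udf)
            | "AddQualityFeature" >> beam.Map(add_quality_feature)
            | "Print" >> beam.Map(print)
        )
```

Calling `run(...)` with a list of dictionaries executes the pipeline on Beam's default local runner. Running the schema filter first means the UDF can assume its fields exist and have the right type, which keeps user-defined rules short.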
