Optimize parallelism for reading from Apache Kafka to Dataflow

Jul 9, 2025, 14:30-14:55 (America/New_York) in The Bandshell

Reading from Apache Kafka into Google Cloud Dataflow can perform poorly if the pipeline is not configured correctly. This session is a practical guide to troubleshooting common parallelism issues and applying best practices for read performance. We’ll cover how Dataflow’s Kafka source works, how to use maxNumRecords and maxReadTime effectively, and how to address potential bottlenecks. Learn how to diagnose and resolve uneven parallelism and latency so that your real-time data pipelines run smoothly and efficiently, with reference to the official Google Cloud Dataflow documentation.
