Speaker(s):

Optimize parallelism for reading from Apache Kafka to Dataflow

Jul-8, 2025 09:10-10:05 in Horizon Hall

Reading from Apache Kafka into Google Cloud Dataflow can perform poorly if the pipeline is not configured correctly. This session is a practical guide to troubleshooting common parallelism issues and applying best practices for Kafka reads. We’ll cover how Dataflow’s Kafka source works, how to use maxNumRecords and maxReadTime effectively, and how to address potential bottlenecks. Learn how to diagnose and resolve uneven parallelism and latency so that your real-time data pipelines run smoothly and efficiently, with reference to the official Google Cloud Dataflow documentation.
