Thank you for attending Beam Summit 2024! We will soon share the session recordings and slides. Sign up here for upcoming news.

Title Speaker(s) Recording Slides
Project Shield: How we use Beam to defend democracy and free expression, and how we got started!
Marc Howard
Data Lineage in Beam
Rohit Sinha
Ordered processing in Apache Beam
Sergei Lilichenko
Scaling Autonomous Driving with Apache Beam
Sayat Satybaldiyev & Arwin Tio
Introducing Ordered List States
Shunping Huang
Transitioning Uber Michelangelo's Batch Prediction from Apache Spark to Ray
Baojun Liu
A New Local Runner Appears: Deep dive on Prism
Robert Burke
Avoid HTTP Request Duplicates with SCIO, a custom AsyncHttpParDoFn and State & Timers
Alberto López Serna
How we Migrated our JSON DB to a Relational DB using Apache Beam / Dataflow
Lakshmanan Arumugam
Accelerating CDC Data Ingestion with Apache Beam: A Qlik-to-BigQuery Journey
Bipin Upadhyaya
Implementing a Beam SDK: A Deep Dive into the Swift SDK
Byron Ellis
Processing Data from a Web API: A step by step guide
Damon Douglas
How Beam ML Optimizes Serving Large Models
Danny McCormick
Introduction to Beam YAML
Jeff Kinard
Throttling Detection and Reactive Worker Downscaling
Yi Hu
Beam YAML and Protobuf
Ferran Fernandez & Austin Bennett
Multi-Modal LLM Data Processing with Apache Beam
Konstantin Buschmeier, Jasper Van den Bossche & Iris Luden
Troubleshooting Python pipelines with process monitoring tools.
Valentyn Tymofieiev
Breaking the Language Barrier: Easy Cross-Language with Generated Python Wrappers
Ahmed Abualsaud
Real-Time Fraud Prevention with Apache Beam
Hai Sadon
The SolaceIO connector: how was it made and why
Matt Mays
Using pub/subIO writeMessageDynamic() function in a Python pipeline to use dynamic topic destination
Olu Akinlaja
Improving Stability for Running Python SDK with Flink Runner
Lydian Lee
Using Dead Letter Queues with Beam
John Casey
Using LLMs with Beam and RunInference
Reza Rokni
Innovating the Data & AI Platform
Yasmeen Ahmad
Lessons Learned from MLOps for GenAI at Google Scale
Prakash Chockalingam
Beam SDKs Don't Have to Look the Same
Robert Burke
Drools ParDo and SCIO: a goodbye microservices tale
Alberto López Serna
Troubleshooting Beam/Dataflow ML Pipelines Related Common Issues
Rajkumar Gupta
At Least Once Streaming vs Exactly Once: Cost saving vs data accuracy
Ihaffa Murtopo
BeamStack: An open source Framework for running Machine Learning Pipelines with Apache Beam
Olufunbi Babalola
Reuniting the Two Distant Cousins: Calling a Beam Pipeline from an Airflow Job
Sadeeq Akintola
A Low Code Structured Approach to Deploying Apache Beam ML Workloads on Kubernetes using BeamStack
Charles Adetiloye & Nate Salawe
Dataflow Streaming: The evolution of real-time data processing
Tom Stepp
Streaming modes & Vertical Scaling for cost effectiveness to customers
Sharan Teja Malyala
Beam YAML: Advanced topics
Jeff Kinard
RAG Data Ingestion Using Apache Beam
Jasper Van den Bossche & Konstantin Buschmeier
Usage Billing with BEAM @ LinkedIn
Narayanan Venkiteswaran & Jinjing Bi
Realtime Forecasting using Beam
Ravi Magham
Streaming Processing for RAG Architectures
Pablo Rodriguez Defino & Namita Sharma
Cost Optimization of Dataflow Pipelines
Sergei Lilichenko
Multiple Input, Multiple output, Multi-Modal Inference: Streaming ML with Dataflow
Wei Hsia
Dataflow CI/CD
Surjit Singh
RAG Data Ingestion and Enrichment Pipeline using Redis and OpenSearch Vector Database in Apache Beam
Ayush Pandey
Beam for Large-Scale, Accelerated ML Inference at Google
Uday Kalra