Thank you for attending Beam Summit 2024! We will soon share the session recordings and slides. Sign up here for upcoming news.

Title Speaker(s) Recording Slides

A Low Code Structured Approach to Deploying Apache Beam ML Workloads on Kubernetes using BeamStack

Charles Adetiloye & Nate Salawe

A New Local Runner Appears: Deep dive on Prism

Robert Burke

Accelerating CDC Data Ingestion with Apache Beam: A Qlik-to-BigQuery Journey

Bipin Upadhyaya

At Least Once Streaming vs Exactly Once: Cost saving vs data accuracy

Ihaffa Murtopo

Avoid HTTP Request Duplicates with SCIO, a custom AsyncHttpParDoFn and State & Timers

Alberto López Serna

Beam for Large-Scale, Accelerated ML Inference at Google

Uday Kalra

Beam SDKs Don't Have to Look the Same

Robert Burke

Beam YAML and Protobuf

Ferran Fernandez & Austin Bennett

Beam YAML: Advanced topics

Jeff Kinard

BeamStack: An open source Framework for running Machine Learning Pipelines with Apache Beam

Olufunbi Babalola

Breaking the Language Barrier: Easy Cross-Language with Generated Python Wrappers

Ahmed Abualsaud

Cost Effective Solutions for Beam pipelines in Dataflow

Sharan Teja Malyala

Cost Optimization of Dataflow Pipelines

Sergei Lilichenko

Data Lineage in Beam

Rohit Sinha

Dataflow CI/CD

Surjit Singh

Dataflow Streaming: The evolution of real-time data processing

Tom Stepp

Drools ParDo and SCIO: a goodbye microservices tale

Alberto López Serna

How Beam ML Optimizes Serving Large Models

Danny McCormick

How we Migrated our JSON DB to a Relational DB using Apache Beam / Dataflow

Lakshmanan Arumugam

Implementing a Beam SDK: A Deep Dive into the Swift SDK

Byron Ellis

Improving Stability for Running Python SDK with Flink Runner

Lydian Lee

Innovating the Data & AI Platform

Yasmeen Ahmad

Introducing Ordered List States

Shunping Huang

Introduction to Beam YAML

Jeff Kinard

Lessons Learned from MLOps for GenAI at Google Scale

Prakash Chockalingam

Multi-Modal LLM Data Processing with Apache Beam

Konstantin Buschmeier, Jasper Van den Bossche & Iris Luden

Ordered processing in Apache Beam

Sergei Lilichenko

Processing Data from a Web API: A step by step guide

Damon Douglas

Project Shield: How we use Beam to defend democracy and free expression, and how we got started!

Marc Howard

RAG Data Ingestion and Enrichment Pipeline using Redis and OpenSearch Vector Database in Apache Beam

Ayush Pandey

RAG Data Ingestion Using Apache Beam

Jasper Van den Bossche & Konstantin Buschmeier

Real-Time Fraud Prevention with Apache Beam

Hai Sadon

Realtime Forecasting using Beam

Ravi Magham

Reuniting the Two Distant Cousins: Calling a Beam Pipeline from an Airflow Job

Sadeeq Akintola

Scaling Autonomous Driving with Apache Beam

Sayat Satybaldiyev & Arwin Tio

Streaming Processing for RAG Architectures

Pablo Rodriguez Defino & Namita Sharma

The SolaceIO connector: how was it made and why

Matt Mays

Throttling Detection and Reactive Worker Downscaling

Yi Hu

Transitioning Uber Michelangelo's Batch Prediction from Apache Spark to Ray

Baojun Liu

Troubleshooting Beam/Dataflow ML Pipelines Related Common Issues

Rajkumar Gupta

Troubleshooting Python pipelines with process monitoring tools

Valentyn Tymofieiev

Usage Billing with BEAM @ LinkedIn

Narayanan Venkiteswaran & Jinjing Bi

Using Dead Letter Queues with Beam

John Casey

Using LLMs with Beam and RunInference

Reza Rokni

Using pub/subIO writeMessageDynamic() function in a Python pipeline to use dynamic topic destination

Olu Akinlaja

Workshop: Multiple Input, Multiple output, Multi-Modal Inference: Streaming ML with Dataflow

Wei Hsia