Multi-Modal LLM Data Processing with Apache Beam

By Konstantin Buschmeier, Jasper Van den Bossche, and Iris Luden

Sep 4, 2024, 15:00-15:25 in Mariposa Grove

Large language models (LLMs) are well known for their performance on generation tasks like summarization, but they also excel at many classical tasks such as classification, named-entity recognition, and information extraction. Multi-modal LLMs similarly achieve state-of-the-art performance on document understanding. This makes them vital for modern data processing pipelines.

Apache Beam is a powerful framework for defining and executing batch and streaming data processing pipelines. Recent releases introduced many tools to facilitate machine learning workflows, such as MLTransform, RunInference, and the Enrichment transform.

In this talk, we will introduce an application that combines Beam’s ML capabilities with LLMs to extract product requests from the various document types found in customer emails, facilitating the automatic fulfillment of orders.
