Machine Learning Platform Tooling with Apache Beam on Kubernetes

Jun-14 12:00-12:25 UTC
Room: Upper Bay

At MavenCode, we consult, develop, and train teams on implementing large-scale, production-grade machine learning workflows on Kubernetes, and Apache Beam has gradually become our de facto standard for handling large-scale ML workloads.

In recent years, the adoption of Kubernetes as a container orchestration platform has exploded, and it has become an ideal environment for deploying and managing distributed systems. At the same time, Apache Beam has emerged as a powerful framework for building and executing large-scale data processing pipelines that can be deployed across various execution engines, with increasing native support for machine learning workloads and inference at scale.

Machine learning operations (MLOps) is a crucial component of building successful machine learning applications, but managing the complex workflows involved can be challenging. In this talk, we will explore how to build platform tools for MLOps using Apache Beam, an open-source framework for building batch and streaming data processing pipelines, and Kubernetes.

We will discuss the benefits of this approach, including automating and streamlining how machine learning applications are built and deployed, and supporting the range of tasks a machine learning platform must handle: data ingestion, preprocessing, feature engineering, model training, model serving, and monitoring. We will highlight real-world use cases and best practices for building and deploying machine learning platform tools, such as containerization, version control, and continuous integration/continuous deployment (CI/CD). Attendees will leave with a deeper understanding of how to use Apache Beam and Kubernetes to build machine learning platform tooling that helps machine learning projects succeed.