What is Beam Summit?

The goal of Beam Summit is to connect a community of professionals around the world who use, contribute to, and are learning Apache Beam.

This annual conference provides space to share use cases and performance and resource optimizations, discuss pain points, and talk about the benefits of implementing Apache Beam in organizations.

The event aims to bring together the Apache Beam community to discuss the project’s status, its technical advances, and its future.

The content focuses on:

  • New use cases from companies using Apache Beam.
  • Community-driven talks.
  • Technical deep dives.
  • In-depth workshops.

About Apache Beam

Apache Beam is an open-source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open-source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Flink, Apache Spark, and Google Cloud Dataflow.

Apache Beam is particularly useful for embarrassingly parallel data processing tasks, in which the problem can be decomposed into many smaller bundles of data that can be processed independently and in parallel. You can also use Beam for Extract, Transform, and Load (ETL) tasks and pure data integration. These tasks are useful for moving data between different storage media and data sources, transforming data into a more desirable format, or loading data onto a new system.
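
To make the idea of independent bundles concrete, here is a minimal sketch in plain Python. It does not use the Beam SDK or any Beam API; the function names (split_into_bundles, process_bundle, run), the bundle size, and the sample records are all hypothetical and chosen only for illustration. It shows a dataset being split into small bundles, each of which is transformed without reference to any other bundle.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_bundles(records, bundle_size):
    """Split a list of records into consecutive bundles of at most bundle_size items."""
    return [records[i:i + bundle_size] for i in range(0, len(records), bundle_size)]


def process_bundle(bundle):
    """Transform one bundle on its own, without looking at any other bundle."""
    return [word.upper() for word in bundle]


def run(records, bundle_size=2, workers=4):
    """Process all bundles on a thread pool and flatten the results."""
    bundles = split_into_bundles(records, bundle_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, even if bundles
        # finish in a different order.
        results = pool.map(process_bundle, bundles)
        return [item for bundle in results for item in bundle]


if __name__ == "__main__":
    print(run(["batch", "stream", "flink", "spark", "dataflow"]))
```

Because no bundle depends on another, the work can be spread across as many workers as are available. In a real Beam pipeline, the chosen runner decides how data is bundled and distributed; the thread pool here is only a stand-in for that role.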

More info about Apache Beam.