Beam data pipelines on microservice architectures

Jul-19, 2022, 16:45-17:10 (America/Los_Angeles) in 203

Wayfair is the world’s largest online destination for all things home, including furniture, household items, fixtures, and appliances. Unparalleled selection and high-quality imagery are key to providing a rich and unique user experience. Photo studios are expensive to operate and take significant time to produce an image, so 3D modeling and imagery is one of the main areas of investment. The in-house applications were redesigned and developed using domain-driven design patterns.

Microservices focus on decoupling domains, which is the antithesis of data integration, where data across domains needs to be aggregated and analyzed together. In this session we will talk about using Apache Beam to build real-time data pipelines in a decoupled and scalable way to derive key operational metrics and build notification systems. We will also cover lessons learned, wins, and pitfalls in using domain events as the source.

Over the last couple of years, with the migration to Google Cloud, adoption of Cloud Dataflow (Beam) has increased significantly at Wayfair. It has become an integral component of the stream processing technology stack with the retirement of legacy Storm & KStream pipelines.
