How to build production-ready data science pipelines with Kedro PyData Eindhoven 2023

11-30, 15:15–15:45 (Europe/Amsterdam), Bohr

Great news! You have just joined a cool new data science project. The client wants you to deliver a maintainable, reproducible and production-ready data pipeline. Where do you best start?

In this talk we demonstrate how Kedro, an open-source Python framework, was created to do just that: build production-ready data science code from the get-go. We will demonstrate with a concrete example how we leveraged Kedro to build an analytics solution for energy cost optimization at a large Dutch agricultural player.

You have just received great news. You are about to join a new exciting project. The client wants you to deliver a maintainable, reproducible and production-ready data science pipeline.

How do you best start? The question matters, because the usual complaints are familiar: "It takes really long to put code in production, and we have to rewrite and restructure large parts of it." "I have to think about Sphinx, Black, Cookiecutter Data Science, Docker, Python logging, virtual environments, pytest, and more." "People on my team all have different levels of exposure to software engineering best practices."

In this talk we demonstrate how we benefit from Kedro to tackle these hurdles of creating machine learning products. Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code – code that easily transitions into production. We will focus on the Kedro project structure, the decoupling between the data and code layers, nodes, pipelines, and the built-in visualizations.
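To make the idea of nodes, pipelines and the data/code decoupling more concrete, here is a minimal sketch in plain Python. It deliberately does not use Kedro's API: the `Node` tuple, the `run` helper, the two functions and the toy readings are all hypothetical names and data invented for illustration. In a real Kedro project the corresponding pieces would be the `node` and `pipeline` helpers from `kedro.pipeline` and a Data Catalog typically configured in `conf/base/catalog.yml`.

```python
from typing import Any, Callable, Dict, List, NamedTuple


class Node(NamedTuple):
    """A pure function plus the names of the datasets it reads and writes."""
    func: Callable[..., Any]
    inputs: List[str]
    output: str


def clean_readings(raw: List[float]) -> List[float]:
    """Drop negative meter readings (hypothetical cleaning rule)."""
    return [r for r in raw if r >= 0]


def total_cost(readings: List[float], price_per_kwh: float) -> float:
    """Multiply total consumption by a flat price (hypothetical cost model)."""
    return sum(readings) * price_per_kwh


def run(nodes: List[Node], catalog: Dict[str, Any]) -> Dict[str, Any]:
    """Run nodes in list order, reading and writing datasets by name."""
    data = dict(catalog)  # copy so the caller's catalog is left untouched
    for n in nodes:
        data[n.output] = n.func(*(data[name] for name in n.inputs))
    return data


# The "pipeline" refers only to dataset names, never to files or databases.
pipeline = [
    Node(clean_readings, ["raw_readings"], "clean_readings"),
    Node(total_cost, ["clean_readings", "price_per_kwh"], "energy_cost"),
]

# The "catalog" decides where data comes from; here it is in-memory toy data.
catalog = {"raw_readings": [10.0, -1.0, 5.0], "price_per_kwh": 0.5}

result = run(pipeline, catalog)
print(result["energy_cost"])
```

Because the functions only see plain arguments and the pipeline only sees dataset names, swapping the toy catalog for production data sources would not require touching the code layer. That separation is the property the sketch is meant to show, under the simplifying assumption that nodes are already listed in a valid execution order.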

We end the talk by sharing a recent project where we leveraged Kedro. We helped a Dutch horticulture company optimize the energy cost mix in their greenhouses. We built and easily put into production an end-to-end Kedro optimization pipeline that demonstrates promising results of 10-15% energy cost reduction.

This presentation is aimed at data scientists, data engineers and machine-learning engineers alike who need to develop and deploy production-level data pipelines. No prior knowledge is required.

Prior Knowledge Expected: No previous knowledge expected

Tim Brakenhoff

Ionut Barbu

Ionut is lead of data science at Bright Cape, a data solution consultancy company in Eindhoven. He is passionate about leveraging data and AI technologies to help companies optimize their operations.

Before joining Bright Cape, Ionut was with McKinsey & Company where he leveraged his data science expertise to serve multinational organizations in various industries and geographies. Back in the Brainport region, he is excited to collaborate with companies to help them advance on their data and analytics journey.
