How small can a minimum kedro pipeline ready to package be? I made one within 4 files that you can pip install. It's only a total of 35 lines of python, 8 in setup.py and 27 in mini_kedro_pipeline.py. Minimal Kedro Pipeline I have everything for th...
2021-1-20
In python data science/engineering most of our data is in the form of some sort of table, typically a DataFrame from a library like pandas, spark, or dask. DataFrames are the heart of most pipelines These containers for data contain many convenient m...
2021-1-14
There are many reasons that you should be using kedro. If you are on a team of Data Scientists/Data Engineers processing DataFrames from many data sources should be considering a pipeline framework. Kedro is a great option that provides many benef...
2020-11-1
Kedro 0.16.6 is out! Let's take a look through the release notes Deployment Docs This is really exciting to see more deployment options coming from the kedro team. It really shows the power of the framework. The power of some of these orchestrations...
2020-10-25
If we take a look at the release notes I see one major feature improvement on the list, auto-discovery of hooks. ``` markdown Major features and improvements Enabled auto-discovery of hooks implementations coming from installed plugins. ``` This on...
2020-8-1
Why use kedro catalog? While using the catalog alone will not reap all of the benefits of the framework, it does get you and your project ready for the full framework eventually. For me the full benefit of the catalog comes when you combine it with...
2020-6-29
kedro 0.16.2 just dropped last week with a long-awaited feature... catalog search! I went as far as monkey patching this into each of my projects. I work jump between a few really big projects that have tons of datasets. Being able to quickly sear...
2020-6-22
Passing inputs into kedro is a key concept. Understanding how it accepts a single catalog key as input is quite trivial that easily makes sense, but passing a list or dictionary of catalog entries can be a bit confusing. args/*args review Check out...
2020-6-19
kedro is an open-source data pipeline framework. It provides guardrails to set your project up right from the start without needing to know deeply how to set up your own python library for data pipelining. It includes great ways to manipulate catal...
2020-2-24