With the release of kedro==0.17.2 came a new module in the project template: pipeline_registry.py. Here are some notes on what I learned while playing with this new module.
Migrating to pipeline_registry.py
- create a src/<package-name>/pipeline_registry.py file
- create a register_pipelines function in pipeline_registry.py that mirrors the register_pipelines method from your hooks.py module
- do not bring the hook_impl decorator
- remove the register_pipelines method on your ProjectHooks class
You should now have something that looks like this in your src/<package-name>/pipeline_registry.py:
"""Project pipelines."""
from typing import Dict

from kedro.pipeline import Pipeline


def register_pipelines() -> Dict[str, Pipeline]:
    """Register the project's pipelines.

    Returns:
        A mapping from a pipeline name to a ``Pipeline`` object.
    """
    return {"__default__": Pipeline([])}
Note: pipeline_registry.py only works in kedro>=0.17.2.
Conflict Resolution
What happens if I register pipelines in both places?
I was not able to find any official documentation on how conflict resolution works, so I stepped into a project and registered pipelines in both my hooks.py and pipeline_registry.py files. I noticed that kedro picks up pipelines from both modules, but pipelines from hooks.py always take precedence. When the same pipeline name appears in both, the entire pipeline is overwritten by the one from hooks.py.
kedro automatically merges pipelines from both modules, but hooks.py takes precedence.
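The behavior I observed can be modeled with a plain dictionary merge. This is only a sketch of the outcome, not kedro's actual implementation. The pipeline names and the lists of node names standing in for Pipeline objects are made up for illustration.

```python
# Stand-ins for what each module's register_pipelines() returns.
# Plain lists of node names are used here instead of real Pipeline objects.
from_pipeline_registry = {
    "__default__": ["clean", "train"],
    "reporting": ["make_report"],
}
from_hooks = {
    "__default__": ["clean"],
    "features": ["build_features"],
}

# Later keys win in a dict merge, so putting the hooks.py mapping last
# models the precedence I observed (an assumption about the outcome only).
merged = {**from_pipeline_registry, **from_hooks}

for name, nodes in sorted(merged.items()):
    print(name, nodes)
```

In this model, "reporting" and "features" both survive, while "__default__" is replaced wholesale by the hooks.py version rather than combined node by node.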
Ready to update
In my experience there were no issues upgrading from 0.17.1 to 0.17.2. I would recommend having only one register_pipelines, so decide whether to migrate to the new pipeline_registry.py or keep it in your hooks.py; doing both is only going to lead to confusion.
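One way to catch the both-places situation is a small check that looks for a register_pipelines definition in each file. The sketch below uses only the standard library. The helper name defines_register_pipelines and the example file contents are hypothetical and not part of kedro.

```python
import ast


def defines_register_pipelines(source: str) -> bool:
    """Return True if the source defines a function or method named register_pipelines."""
    tree = ast.parse(source)
    return any(
        isinstance(node, ast.FunctionDef) and node.name == "register_pipelines"
        for node in ast.walk(tree)
    )


# Hypothetical file contents, parsed but never executed. In a real project,
# read them from src/<package-name>/hooks.py and
# src/<package-name>/pipeline_registry.py instead.
hooks_source = '''
from kedro.framework.hooks import hook_impl


class ProjectHooks:
    @hook_impl
    def register_pipelines(self):
        return {}
'''
registry_source = '''
def register_pipelines():
    return {}
'''

in_both = defines_register_pipelines(hooks_source) and defines_register_pipelines(
    registry_source
)
if in_both:
    print("register_pipelines is defined in both modules; pick one.")
```

Because the sources are only parsed with ast, the check does not need kedro installed and does not import your project.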