With the release of kedro==0.17.2 came a new module in the project template,
pipeline_registry.py. Here are some notes on what I learned while playing with
this new module.
Migrating to pipeline_registry.py
- create a src/<package-name>/pipeline_registry.py file
- create a register_pipelines function in pipeline_registry.py that mirrors the register_pipelines method from your hooks.py module
- do not bring the hook_impl decorator
- remove the register_pipelines method on your ProjectHooks class
You should now have something that looks like this in your
src/<package-name>/pipeline_registry.py.
```python
"""Project pipelines."""
from typing import Dict

from kedro.pipeline import Pipeline


def register_pipelines() -> Dict[str, Pipeline]:
    """Register the project's pipelines.

    Returns:
        A mapping from a pipeline name to a ``Pipeline`` object.
    """
    return {"__default__": Pipeline([])}
```
pipeline_registry.py only works in kedro>=0.17.2
Conflict Resolution
What happens if I register pipelines in both places?
I was not able to find any official documentation on how conflict resolution
works, so I stepped into a project and registered pipelines in both my hooks.py
and pipeline_registry.py files. I noticed that kedro picks up pipelines from
both modules, but pipelines from hooks.py always take precedence. When the same
pipeline name is registered in both, the entire pipeline is overwritten by the
one from hooks.py.
kedro automatically merges pipelines from both modules; hooks.py takes precedence
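To make the behavior I observed concrete, here is a minimal sketch of it as a plain dictionary merge. This is not kedro's actual implementation, only a model of what I saw. The pipeline names and the `merge_pipelines` helper are made up for illustration, and plain strings stand in for `Pipeline` objects so the snippet runs without kedro installed.

```python
from typing import Dict

# Stand-ins for Pipeline objects; plain strings keep the sketch free of kedro imports.
registry_pipelines: Dict[str, str] = {
    "__default__": "default from pipeline_registry.py",
    "reporting": "reporting from pipeline_registry.py",
}
hooks_pipelines: Dict[str, str] = {
    "__default__": "default from hooks.py",
    "training": "training from hooks.py",
}


def merge_pipelines(registry: Dict[str, str], hooks: Dict[str, str]) -> Dict[str, str]:
    """Combine both mappings; on a name clash the hooks.py entry replaces the other.

    Hypothetical helper that only models the observed precedence.
    """
    merged = dict(registry)  # start with everything from pipeline_registry.py
    merged.update(hooks)     # hooks.py entries overwrite duplicates wholesale
    return merged


merged = merge_pipelines(registry_pipelines, hooks_pipelines)
print(merged["__default__"])
print(sorted(merged))
```

Note that the duplicate `__default__` entry is replaced as a whole. Nothing from the pipeline_registry.py version of that pipeline survives, while the pipelines that exist in only one module are both kept.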
Ready to update
In my experience there were no issues upgrading from 0.17.1 to 0.17.2. I
would recommend having only one register_pipelines, so decide whether to
migrate to the new pipeline_registry.py or keep it in your hooks.py. Keeping
both is only going to lead to confusion.
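If you want to check a project for a leftover duplicate, something like the sketch below could work. It uses the standard library `ast` module to look for a `register_pipelines` definition in source text without importing it. The `defines_register_pipelines` helper and the two sample sources are hypothetical and only for illustration; they are not part of kedro.

```python
import ast


def defines_register_pipelines(source: str) -> bool:
    """Return True if the source defines a function or method named register_pipelines."""
    tree = ast.parse(source)
    return any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name == "register_pipelines"
        for node in ast.walk(tree)  # walk also descends into class bodies
    )


# Hypothetical file contents; the code is only parsed, never executed,
# so the undefined names (hook_impl, Pipeline) are not a problem.
HOOKS_SOURCE = '''
class ProjectHooks:
    @hook_impl
    def register_pipelines(self):
        return {"__default__": Pipeline([])}
'''

REGISTRY_SOURCE = '''
def register_pipelines():
    return {"__default__": Pipeline([])}
'''

in_hooks = defines_register_pipelines(HOOKS_SOURCE)
in_registry = defines_register_pipelines(REGISTRY_SOURCE)
registered_twice = in_hooks and in_registry

if registered_twice:
    print("register_pipelines is defined in both hooks.py and pipeline_registry.py")
```

In a real project you would read the two files from disk, for example with `pathlib.Path(...).read_text()`, instead of using inline strings.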