Drafts

Draft and unpublished posts

0 posts

Ipython Ninjitsu

- ?docstring
- ??sourcecode
- %run
- %debug
- %autoreload
- %history
- autoformat
- %reset
- !shell commands

?docstring # [1]

Stop going to Google every time you're stuck and stay in your workflow. The IPython ? is a superhero for productivity and staying on task.

    from kedro.pipeline import Pipeline
    Pipeline?

    Init signature:
    Pipeline(
        nodes: Iterable[Union[kedro.pipeline.node.Node, ForwardRef('Pipeline')]],
        *,
        tags: Union[str, Iterable[str]] = None,
    )
    Docstring:
    A ``Pipeline`` defined as a collection of ``Node`` objects. This class
    treats nodes as part of a graph representation and provides inputs,
    outputs and execution order.
    Init docstring:
    Initialise ``Pipeline`` with a list of ``Node`` instances.

    Args:
        nodes: The iterable of nodes the ``Pipeline`` will be made of. If you
            provide pipelines among the list of nodes, those pipelines will
            be expanded and all their nodes will become part of this new
            pipeline.
        tags: Optional set of tags to be applied to all the pipeli...
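Outside of IPython, the standard library's `inspect` module gives a rough idea of what `?` and `??` surface. This is only a sketch of the idea, not how IPython implements it, and it uses `json.dumps` as a stand-in object so it runs without kedro installed.

```python
import inspect
import json


def describe(obj):
    """Collect roughly what `obj?` displays: the signature and the docstring."""
    return {
        "signature": str(inspect.signature(obj)),
        "docstring": inspect.getdoc(obj),
    }


info = describe(json.dumps)
print(info["signature"])
print(info["docstring"].splitlines()[0])

# Roughly what `obj??` adds: the source code, when it is available on disk.
source = inspect.getsource(json.dumps)
print(source.splitlines()[0])
```

In an IPython session the `?` form is still quicker; the sketch is mainly useful in scripts or a plain Python REPL.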

Compare Directories In Bash

Today I needed to check for articles that used the same slug in two directories. Bash made it super simple.

1 min
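The bash command from the post is not shown in this excerpt, so here is a Python stand-in for the same check rather than the original approach. It assumes a slug is the file name without its extension and that posts are `*.md` files; `shared_slugs` and the directory names in the usage comment are hypothetical.

```python
from pathlib import Path


def shared_slugs(dir_a, dir_b, pattern="*.md"):
    """Return the sorted slugs (file names without extension) found in both directories."""
    slugs_a = {path.stem for path in Path(dir_a).glob(pattern)}
    slugs_b = {path.stem for path in Path(dir_b).glob(pattern)}
    # Set intersection keeps only the slugs that appear in both places.
    return sorted(slugs_a & slugs_b)


# Usage (hypothetical directory names):
#     for slug in shared_slugs("blog", "notes"):
#         print(slug)
```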

Testing Data Pipelines

Lint/Format/Doc
- black
- flake8
- interrogate
- mypy

Pipeline Assertions
- pipeline constructs
- pipeline has expected nodes
- pipeline has minimum nodes
- test minimum tags
- test alternate tags

Catalog Assertions
- test catalog follows naming structure

Node Tests
- test function does the correct operations on test data

Great Expectations
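As one illustration of the catalog assertion in the outline above, a naming-structure test might look like the sketch below. The `<layer>_<name>` convention, the layer names, and the helper `invalid_dataset_names` are all assumptions made for this example; a real project would read the dataset names from its own catalog.

```python
import re

# Hypothetical convention: <layer>_<name>, e.g. "raw_sales" or "primary_customers".
LAYERS = ("raw", "intermediate", "primary", "model_input")
NAME_PATTERN = re.compile(rf"^({'|'.join(LAYERS)})_[a-z0-9_]+$")


def invalid_dataset_names(names):
    """Return the dataset names that break the naming convention."""
    return [name for name in names if not NAME_PATTERN.match(name)]


def test_catalog_follows_naming_structure():
    # Assumed sample names; in practice these come from the project's catalog.
    names = ["raw_sales", "primary_sales", "model_input_features"]
    assert invalid_dataset_names(names) == []
```

Returning the offending names, rather than a bare boolean, makes the pytest failure message show exactly which datasets need renaming.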

Kedro Factory

Dynamically generate Kedro pipelines with YAML or a script.

Inspiration
- dag-factory [1]

References:
[1]: https://github.com/ajbosco/dag-factory
1 min read

rebrand

- simple landing page - https://swyx.io - joel on software [1] - recent - reading lists - More from waylon just above footer - 4x2 grid - link strategy - latest post - next/prev - similar tags - search in nav - tag stickers - simple cards? - bookmarks? - nav style stinks - single post template - flat routes no need to /blog /notes - post types - 🌳 full - 🌱 budding - 🖊 Note - 💻 hot tip - usage of tags - MDX - stories - slides - ⚠ - ❌ - ✔ - kedro viz - charts - inlink component - https://joshwcomeau.com/ - auto-card oneline links - meta posts - about - uses - how site is built - how to search - stories TODO # [2] - review package.json - update package.json Done # [3] - ahrefs - fix canonical urls - fix broken inlinks - convert to one post template - References: [1]: https://www.joelonsoftware.com/ [2]: #todo [3]: #done
1 min read

Avoid Nesting Loops in Python

Nesting loops inside each other in Python makes code much harder to understand. It takes more brain power to follow and is thus more error prone, so avoid it when you can. One issue with this complexity is that toy examples may make sense, but most real examples will grow and become more deeply nested over time. Avoiding this complexity from the start can help simplify the project in the future.

setup # [1]

Let's take a pretty simple example where we are using a fictitious library to get some sales data for our transportation company. The API allows us to fetch the sales data for one class of vehicle and one region at a time.

    import pandas as pd
    from datastore import get_sales  # fictitious library

    cars = ['sedan', 'coupe', 'hatchback']
    regions = ['US', 'CA', 'MX']

❌ Nesting Loops # [2]

We are set up to fetch our data with two lists that represent the vehicles and regions that we want to analyze. We know that we need to make a call to get_sales for every vehicle and regio...
2 min read
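The rest of the post is cut off in this excerpt, so the following is only one common way to flatten such a loop, not necessarily the author's solution. It uses `itertools.product` to visit every (car, region) pair in a single loop. Because `datastore.get_sales` is fictitious, the `get_sales` below is a stand-in with an assumed `(car, region)` signature that returns a fake record.

```python
from itertools import product

cars = ["sedan", "coupe", "hatchback"]
regions = ["US", "CA", "MX"]


def get_sales(car, region):
    """Stand-in for the fictitious datastore.get_sales; returns a fake record."""
    return {"car": car, "region": region, "units": 0}


# One flat loop over every (car, region) pair instead of a loop inside a loop.
sales = [get_sales(car, region) for car, region in product(cars, regions)]

print(len(sales))  # 9 records: 3 cars x 3 regions
```

Adding a third dimension, such as a list of years, only adds one more argument to `product` instead of another level of indentation.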

List the latest files to change in a git repo

while IFS= read -r file; do echo "$(git log -n 1 --pretty=format:%ad --date=raw -- "$file") $file"; done < <(git ls-tree -r --name-only HEAD | grep static/stories) | sort -rn | head -n 3 | cut -d " " -f 3-

Kedro Basics

Learn Kedro in 5 days

Day 0 Setup # [1]
- vm
- install
- python
- editor

Day 1 # [2]
- kedro new
- kedro viz

Day 2 # [3]
- catalog
- filter catalog
- load data
- fsspec

Day 3 # [4]
- pipeline
- nodes

Day 4 # [5]
- filter pipeline
- run partial pipeline

Day 5 # [6]
- kedro docker
- GitHub Actions

Advanced Kedro # [7]
- hooks
- custom datasets
- modular pipelines

References:
[1]: #day-0-setup
[2]: #day-1
[3]: #day-2
[4]: #day-3
[5]: #day-4
[6]: #day-5
[7]: #advanced-kedro

025.md

setup

1 min

026.md

setup

1 min

Upcoming Posts

Upcoming posts to the blog. Have an idea? Edit this post [1], submit a PR, and we will talk. 🧠 # [2] - Kedro run changed - How I manage Environments - My Data Workflow. - Daily Schedule - desk - keeb - Material Shell - Why blog - search with fuse.js - Testing a blog with ahrefs - matrix testing in github actions - Think like a Senior Dev - Editor # [3] - tmux - vim - shortcuts - gitui - fzf - devinsideyou [4] Core # [5] - gracefully adopt kedro - catalog - in progress - pipeline - kedro - Silent Logger - Templated config loader - params/env - 10 reasons you shouldn't use kedro - 10 reasons to use - filter viz - Steel-toes env - Why framework - How I write pipelines - when I write pipelines - pipeline node templates - Convert notebooks to pipelines - Testing Pipelines - professional python - cookiecutter - flake8 - black - mypy - pre-commit - click - pytest - git [6] - parametrize - environment variables - My top pandas methods - Ac...
2 min read

023

Find and replace groups in VSCode: $1 refers to the first capture group ($0 is the whole match).

1 min
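Capture-group numbering works the same way in Python's `re` module, which writes `\1` where VSCode's replace box writes `$1`. The date string and pattern below are made up for illustration.

```python
import re

text = "2021-03-07"
pattern = r"(\d{4})-(\d{2})-(\d{2})"

# \1, \2 and \3 are the first, second and third capture groups;
# \g<0> stands for the whole match.
reordered = re.sub(pattern, r"\3/\2/\1", text)
wrapped = re.sub(pattern, r"[\g<0>]", text)

print(reordered)  # 07/03/2021
print(wrapped)    # [2021-03-07]
```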

022

_

1 min

021

_

1 min

020

_

1 min

019

1 min

017

**

1 min

018

1 min