Posts tagged: python

All posts with the tag "python"

275 posts latest post 2026-03-31

Creating pypi-list with kedro

I had an idea come to me via Twitter. Short, one-word package names are becoming hard to find on PyPI. Short, readable, one-word names that are not a play on words are easy to remember, easy to spell correctly, and quick to type out.

I started with the simple index. PyPI provides a single page listing every package it hosts, via the simple index.
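As a rough sketch of working with that listing: the simple index (served at https://pypi.org/simple/) is an HTML page with one link per project, so the names can be pulled out with the standard library alone. The `SimpleIndexParser`, `package_names`, and `short_names` helpers and the `SAMPLE` markup below are hypothetical, written for illustration, and are not taken from the post; the sketch parses a small inline sample rather than fetching the real page.

```python
from html.parser import HTMLParser


class SimpleIndexParser(HTMLParser):
    """Collect the text of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        # Only keep non-empty text that sits inside a link
        if self._in_anchor and data.strip():
            self.names.append(data.strip())


def package_names(html):
    """Return the link texts found in a simple-index style page."""
    parser = SimpleIndexParser()
    parser.feed(html)
    return parser.names


def short_names(names, max_len=5):
    """Keep short, purely alphabetic names (max_len is an arbitrary cutoff)."""
    return sorted(n for n in names if len(n) <= max_len and n.isalpha())


# Assumed sample markup, shaped like the simple index
SAMPLE = """<html><body>
<a href="/simple/kedro/">kedro</a>
<a href="/simple/rich/">rich</a>
<a href="/simple/pytest-cov/">pytest-cov</a>
</body></html>"""

if __name__ == "__main__":
    print(short_names(package_names(SAMPLE)))
```

Parsing a string instead of fetching the page keeps the sketch runnable offline; the same function could be fed the downloaded page body.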

Using Kedro In Scripts

With the latest releases of kedro 0.17.x, it is now possible to run kedro pipelines from within scripts. While I would not start a project with this technique, it will be a good tool to keep in my back pocket when I want to sprinkle in a bit of kedro goodness in existing projects.

What is Kedro

If you're just learning about kedro, check out this post walking through it

...

Silence Kedro Logs

Kedro can have a chatty logger. While it is super nice in production to see everything that happened during a pipeline run, it can be troublesome while trying to implement a cli extension with clean output.

First, how does one silence a python log? Python loggers can be retrieved with the logging module’s getLogger function, and their log level can then be changed. Much of kedro’s chattiness comes from INFO level logs. I don’t want to hear about anything for my current use case unless it’s essential, i.e., a failure. In this case, I set the log levels to ERROR, as most errors should stop execution anyway.
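A minimal sketch of that idea, using only the standard library. The logger name "kedro" is an assumption made for illustration; the loggers that actually need silencing depend on the kedro version and project configuration.

```python
import logging

# "kedro" is an assumed logger name, not confirmed by the post
logger = logging.getLogger("kedro")

# Drop anything below ERROR for this logger
logger.setLevel(logging.ERROR)

# Child loggers such as "kedro.io" inherit the level unless they set their own
child = logging.getLogger("kedro.io")
```

Because levels are inherited down the dotted logger hierarchy, setting the level on a parent logger is often enough to quiet its children as well.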

Getting a python logger is straightforward if we know the name of the logger. The following block will grab the logger object for the logger currently registered under the name passed in.

...

Python Diskcache is locked

Running multiple processes using the same diskcache object can cause issues with locks. As I was trying to set up a rich Live display for markata, I ran into issues where each part could not run simultaneously. Even though I had followed the instructions from diskcache, the cause was not immediately apparent to me, so I had to make a simple example to experiment and play with at a small scale.

Building a minimum reproducible error is one of my superpowers in development. I do this very often to suss out what is really happening. My day-to-day work is processing data with python, so I keep a number of very small data sets handy to break and fix. This helps separate the complexities of the project from the problem.

Markata has a lot going on. It’s a plugins-all-the-way-down static site generator built in python. Trying to find the root cause through the layers of plugin and cli modules can be a pain, but in this case building a very simple minimum reproducible error was much easier.

...

3 min read

Vim Fugitive

:G
:G status
:G commit
:G add %
:Gdiff
:G push
:Glog

Add current file and commit with diff in a split

function! s:GitAdd()
    exe "G add %"
    exe "G diff --staged"
    exe "only"
    exe "G commit"
endfunction
:command! GitAdd :call s:GitAdd()
nnoremap gic :GitAdd<CR>

:on[ly]

C-W o

:on[ly] will make the current buffer the only one on the screen. This is super helpful, as many of fugitive's commands open in a split by default.

cycle through the jumplist

...

What is if __name__ == "__main__", and how do I use it?

When a python module is run directly, its __name__ is set to __main__; when it is imported, its __name__ is set to the name of the module.

Let’s create a module to play with __name__ a bit. We will call this module nodes.py. It is a module that we may want to run by itself or import and use in other modules.

# nodes.py
if __name__ == "nodes":
    import sys
    import __main__
    print(f"you have imported me {__name__} from {sys.modules['__main__'].__file__}")

if __name__ == "__main__":
    print("you are running me as main")

I have set this module up to execute one of two if statements based on whether the module itself is being run or imported.
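To see both cases side by side, here is a small, self-contained sketch that writes a throwaway module to a temporary directory, runs it as a script, and then imports it from a second interpreter. The module name `demo_nodes` and the helper `name_when_run_and_imported` are hypothetical and not part of the post.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A one-line module that reports its own __name__
SOURCE = 'print(f"__name__ is {__name__}")\n'


def name_when_run_and_imported():
    """Return the output of running the module and of importing it."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_nodes.py").write_text(SOURCE)

        # Case 1: run the file directly as a script
        ran = subprocess.run(
            [sys.executable, "demo_nodes.py"],
            cwd=tmp, capture_output=True, text=True, check=True,
        ).stdout.strip()

        # Case 2: import the module from another interpreter
        imported = subprocess.run(
            [sys.executable, "-c", "import demo_nodes"],
            cwd=tmp, capture_output=True, text=True, check=True,
        ).stdout.strip()

    return ran, imported


if __name__ == "__main__":
    for line in name_when_run_and_imported():
        print(line)
```

The sketch uses the same guard it demonstrates, so importing it elsewhere does not trigger the subprocess calls.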

...

3 min read

Zev Averbach Interview

Zev Averbach, Frustrated spreadsheet jockey to software developer at 36

Q: Tell me about your journey as a spreadsheet jockey into Data Engineering?

A: First of all, it’s hilarious that I accidentally found your questions for this interview by Googling myself. 😊

...

Pytest capsys

Testing print/log statements in pytest can be a bit tricky. capsys makes it super easy, but I often struggle to find it.

capsys is a builtin pytest fixture that can be passed into any test to capture stdout/stderr. For a more comprehensive description check out the docs on capsys

Simply create a test function that accepts capsys as an argument and pytest will give you a capsys object.
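A minimal sketch of such a test. The `greet` function is a hypothetical stand-in for whatever code prints; `capsys.readouterr()` returns what was written to stdout and stderr since the last call, as `.out` and `.err`.

```python
def greet(name):
    """Hypothetical function under test that prints to stdout."""
    print(f"hello {name}")


def test_greet(capsys):
    # pytest injects the capsys fixture by argument name
    greet("world")
    captured = capsys.readouterr()
    assert captured.out == "hello world\n"
    assert captured.err == ""
```

Saved in a file such as test_greet.py (an assumed filename), this would be collected and run by the pytest command.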

1 min read

Building a Rich Dev Server

Draft Post

I’ve really been digging @willmcgugan’s rich library for creating TUI-like interfaces in python. I’ve only recently started to take full advantage of it.

I am working on a project in which I want to have a dev server running continuously in the background. I really like dev servers that...

...

fix crlf for entire git repo

Final Result

git checkout main
git reset --hard
git rm -rf --cached .
echo "* text=auto" > .gitattributes
git add .

1 min read

Automatic Conda Environments

I have automated my process to create virtual environments in my python projects. Here is how I did it.

I’ve really been digging my new tmux session management setup. Now I have leveled it up by adding direnv to my workflow. It will execute a shell script whenever I cd into a directory. One thing I wanted to add to this was automatic activation of python environments whenever I cd into a directory, or creation of a new environment if one does not exist.

https://waylonwalker.com/tmux-nav-2021/

...

3 min read

How I Review Pipeline Code

I have started doing more regular PR reviews on my team’s Kedro pipelines. I generally take a two-phase approach to the review in order to give the reviewee both quick and detailed feedback.

What is Kedro

Phase 1 is typically a quick scan over the PR right within the PR window in my browser.

...

2 min read