Posts tagged: python

All posts with the tag "python"

275 posts latest post 2026-03-31

Python 3.8 came out two and a half years ago and I have yet to really lean in on the walrus operator. Partly because it always seemed kind of silly (for my use cases) to require a Python version bump for it, and partly because I didn’t really understand it. Primarily I have wanted to use it in comprehensions, but I did not understand how.

Now that Python 3.6 is end of life and most folks are using at least 3.8, it seems time to learn and use it.

:=

The assignment expression operator in Python is more commonly referred to as the walrus operator because := looks like a walrus. It allows you to assign a variable and use its value in a single expression.
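Here is a minimal sketch of the comprehension use case mentioned above. The `words` list and the length threshold are made up for illustration and are not from the original post.

```python
# Hypothetical sample data, only for illustration.
words = ["walrus", "py", "operator", "is", "handy"]

# Without the walrus operator, len(word) is computed twice per item.
long_lengths_old = [len(word) for word in words if len(word) > 4]

# With := the length is assigned to n inside the condition and reused
# as the output value of the comprehension.
long_lengths = [n for word in words if (n := len(word)) > 4]

print(long_lengths)
```

Both comprehensions build the same list. The walrus version just avoids calling `len` a second time, which matters more when the call is expensive.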

...

Kedro rich is a very new and unstable (it’s good, just not ready) plugin for kedro to make the command line prettier.

There is no pypi package yet, but it’s on github. You can pip install it with the git url.

pip install git+https://github.com/datajoely/kedro-rich

Kedro run #

You can run your pipeline just as you normally would, except you get progress bars and pretty prints.

kedro run

kedro rich pretty run

Kedro catalog #

Listing out catalog entries from the command line now prints out a nice pretty table.

...

I recently found a really great plugin by mhinz to open files in neovim from a different tmux split, without touching neovim at all.

neovim-remote is not a neovim plugin at all, it’s a python cli that you can install with pip. Unlike what the repo suggests, I use pipx to install nvr.

pipx install neovim-remote

How I use it #

I have this added to my .envrc that is in every one of my projects....

...

As I am toying around with textual, I am wanting some popup user input to take over. Textual is still pretty new and likely to change quite significantly, so I don’t want to overdo the work I put into it. For now, on my personal tuis, I am going to shell out to tmux.

The main issue is that when you are in a textual app, it kinda owns the input. So if you try to run another python function that calls for input, it just can’t get there. There is a textual-inputs library that covers this, and it might work really well for some use cases, but many of my use cases have been for things that are pre-built like copier, and I am trying to throw something together quickly.

textual is still very beta

Part of this comes down to the fact that textual is still very beta and likely to change a lot, so all of the work I have done with it is for quick and dirty, or fun side projects.

...

Mermaid diagrams provide a way to display graphs defined as plain text. Some markdown renderers support this as a plugin. GitHub now supports it.

You can define nodes like this in mermaid, and GitHub will now render them as a pretty graph diagram. It’s rendered in svg, so it’s searchable with control f and everything.

graph TD; A-->B; A-->C; B-->D; C-->D-->OUT; E-->F-->G-->OUT

Here is what the example looks like on GitHub

Glances is a system monitor with a ton of features, including docker processes.

I have started using portainer to look at running docker processes; it’s a great heavy-weight docker process monitor. glances works as a great lightweight monitor to just give you the essentials (Name, Status, CPU%, MEM, /MAX, IOR/s, IOW/s, Rx/s, Tx/s, Command).

You will need to install glances to use the glances webui. We can still use pipx to manage our virtual environment for us so that we do not need to do so manually or run the risk of globally installed package dependency hell.

pipx install glances
pipx inject glances "glances[docker]"

You will be presented with this success message.

...

Glances has a pretty incredible webui to view system processes and information like htop, or task manager for windows.

The nice thing about the webui is that it can be accessed from a remote system. This would be super nice on something like a raspberry pi, or a vm running in the cloud. It’s also less intimidating and easier to search if you are not a terminal junkie.

You will need to install glances to use the glances webui. We can still use pipx to manage our virtual environment for us so that we do not need to do so manually or run the risk of globally installed package dependency hell.

pipx install glances
pipx inject glances "glances[web]"

You will be presented with this success message.

...

Glances is a fully featured system monitoring tool written in python. Out of the box it’s quite similar to htop, but has quite a few more features, and can be run without installing anything other than pipx, which you should already have installed if you do anything with python.

pipx run glances

Once you run this you will be in a tui application similar to htop. You can kill processes with k, use left and right arrows to change the sorting column, and up and down to select different processes.

Python requirements text files can in fact depend on each other, because you can pass pip install arguments right into your requirements.txt file. The trick is to prefix the file name with a -r flag, just like you would if you were installing it with pip install.

Let’s create two requirements files in a new directory to play with.

mkdir requirements-nest
cd requirements-nest
touch requirements.txt requirements_dev.txt

Then add the following to each requirements file.

# requirements.txt
kedro[pandas.ParquetDataSet]

# requirements_dev.txt
-r requirements.txt
ipython

Installing #

Installing requirements_dev.txt will install both ipython and pandas since it includes the base requirements file.

...

Reading eventbridge rules from the command line can be a total drag. Pipe it into visidata to make it a breeze.

I just love when I start thinking through how to parse a bunch of json at the command line, maybe building out my own custom cli, and then the solution is as simple as piping it into visidata, which is a fantastic tui application that has a ton of vim-like keybindings and data features.

I often run shell commands from python with Popen, but not often enough do I set up error handling for these subprocesses. It’s not too hard, but it can be a bit awkward if you don’t do it enough.

import subprocess
from subprocess import Popen

# this will run the shell command `cat me` and capture stdout and stderr
proc = Popen(["cat", "me"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# this will wait for the process to finish.
proc.wait()

reading from stderr #

To get the stderr we must get it from the proc, read it, and decode the bytestring. Note that the stderr stream can only be read once, so if you want to do more than just read it you will need to store a copy of the result.

proc.stderr.read().decode()

Better Exception #

Now that we can read the stderr we can make better error tracking for the user so they can see what to do to resolve the issue rather than blindly failing.
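One way this could look is sketched below. The `CommandError` class and the `run` helper are hypothetical names for illustration, not the original post's code. The sketch also swaps `proc.wait()` for `proc.communicate()`, which reads both pipes while waiting and so avoids the deadlock that `wait()` can hit when a pipe fills up. A small Python child process stands in for a failing command like `cat me` so the example runs anywhere.

```python
import subprocess
import sys
from subprocess import Popen


class CommandError(RuntimeError):
    """Hypothetical exception that carries a failed command's stderr."""


def run(cmd):
    """Run cmd, return decoded stdout, and raise CommandError on a non-zero exit."""
    proc = Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() reads stdout and stderr to the end and waits for exit
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        raise CommandError(
            f"{cmd!r} exited with code {proc.returncode}:\n{stderr.decode().strip()}"
        )
    return stdout.decode()


# A child process that always fails, standing in for a command like `cat me`.
failing = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('no such file'); sys.exit(2)",
]

try:
    run(failing)
except CommandError as error:
    # The message now includes what the command itself reported.
    message = str(error)
    print(message)
```

Because the decoded stderr is stored in the exception message, it can be logged or shown to the user without reading the stream a second time.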

In looking for a way to automatically generate descriptions for pages I stumbled into a markdown ast in python. It allows me to go over the markdown page and get only paragraph text. This will ignore headings, blockquotes, and code fences.

import commonmark
import frontmatter

post = frontmatter.load("post.md")
parser = commonmark.Parser()
ast = parser.parse(post.content)

paragraphs = ''
for node in ast.walker():
    if node[0].t == "paragraph":
        paragraphs += " "
        paragraphs += node[0].first_child.literal

It’s also super fast. Previously I was rendering to html and using beautifulsoup to get only the paragraphs. Using the commonmark ast was about 5x faster on my site.

When I originally wrote this post, I did not realize that commonmark duplicates nodes. I still do not understand why, but I have had success deduplicating them based on the source position of the node with the snippet below.

For an embarrassingly long time, until today, I have been wrapping my dict gets with key errors in python. I’m sure I’ve read it in code a bunch of times, but just brushed over why you would use get. That is, until I read a bunch of PRs from my buddy Nic and noticed that he never gets things with brackets and always with .get. This turns out to be so much cleaner for creating a default case than try except.

Let’s consider this example for prices of supplies. Here we set a variable of prices as a dictionary of items and their price.

prices = {'pen': 1.2, 'pencil': 0.3, 'eraser': 2.3}

Except KeyError #

What I would always do is try to get the key, and if it failed on KeyError, I would set the value (paper_price in this case) to a default value.

try:
    paper_price = prices['paper']
except KeyError:
    paper_price = None

.get #

What I noticed Nic does is to use get. This feels just so much cleaner that it’s a one liner and feels much easier to read and...
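A minimal sketch of the .get version of the same lookup follows. The `0.0` fallback is an arbitrary value chosen for illustration.

```python
prices = {"pen": 1.2, "pencil": 0.3, "eraser": 2.3}

# A missing key returns None instead of raising KeyError.
paper_price = prices.get("paper")

# The second argument sets the default to return for a missing key.
paper_price_or_zero = prices.get("paper", 0.0)

# An existing key ignores the default and returns the stored value.
pen_price = prices.get("pen", 0.0)

print(paper_price, paper_price_or_zero, pen_price)
```

Note that .get never adds the key to the dictionary, so `prices` is unchanged after these lookups.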

...

BeautifulSoup is a DOM-like library for python. It’s quite useful for manipulating html. Here is an example that uses find_all to get html headings. I stole the regex from stack overflow, but who doesn’t?

sample.html

Let’s make a sample.html file with the following contents. It mainly has some headings, <h1> and <h2> tags that I want to be able to find.

<!DOCTYPE html>
<html lang="en">
  <body>
    <h1>hello</h1>
    <p>this is a paragraph</p>
    <h2>second heading</h2>
    <p>this is also a paragraph</p>
    <h2>third heading</h2>
    <p>this is the last paragraph</p>
  </body>
</html>

Get the headings with BeautifulSoup #

Let’s import our packages, read in our sample.html using pathlib and find all headings...

...

I keep my nodes short and sweet. They do one thing and do it well. I turn almost every DataFrame transformation into its own node. It makes it much easier to pull catalog entries than firing up the pipeline, running it, and starting a debugger. For this reason many of my nodes can be built from inline lambdas.

Here are two examples. The first one, lambda x: x, is sometimes referred to as an identity function. This is super common to use in the early phases of a project. It lets you follow standard layering conventions without skipping a layer or overthinking whether you should have the layer or not, and it leaves a good placeholder to fill in later when you need it.

Many times I just want to get the data in as fast as possible, learn about it, then go back and tidy it up.

from kedro.pipeline import node

# identity node: passes the data through unchanged
my_first_node = node(
    func=lambda x: x,
    inputs='raw_cars',
    outputs='int_cars',
    tags=['int',]
)

# inline lambda node: select columns and filter rows
my_second_node = node(
    func=lambda cars: cars[['mpg', 'cyl', 'disp',]].query('disp>200'),
    inputs='raw_cars',
    outputs='int_cars',
    tags=['pri',]
)

Note: try not to take the idea...

As you work on your kedro projects you are bound to need to add more dependencies to the project eventually. Kedro uses a fantastic command pip-compile under the hood to ensure that everyone is on the same version of packages at all times, and able to easily upgrade them. It might be a bit different workflow than what you have seen, let’s take a look at it.

Before you start mucking around with any changes to dependencies, make sure that your git status is clean. I’d even recommend starting a new branch for this, and if you are working on a team, potentially submit this as its own PR for clarity.

git status
git checkout main
git checkout -b add-rich-dependency

requirements.in #

New requirements get added to a requirements.in file. If you need to specify an exact version, or a minimum version you can do that, but if all versions generally work you can leave it open.

# requirements.in
rich

Here I added the popular rich package to my requirements.in file. Since I am ok with the latest version I am not going to pin anything,...

...

I am a huge believer in practicing your craft. Professional athletes spend most of their time honing their skills and making themselves better. In engineering, many spend nearly zero time practicing. I am not saying that you need to spend all your free time practicing, but a few minutes trying new things can go a long way in how you understand what you are doing and make a huge impact on your long term productivity.

What is Kedro

practice building pipelines with #kedro today

Go to your playground directory, and if you don’t have one, make one.

...

I have added a hotkey to my copier template setup to quickly access all my templates at any time from tmux. At any point I can hit <c-b><c-b>, that’s holding control and hitting bb, and I will get a popup list of all of my templates directory names. It’s an fzf list, which means that I can fuzzy search through it for the template I want, or arrow key to the one I want if I am feeling insane. I even set it up so that the preview is a list of the files that come with the template in tree view.

bind-key c-b popup -E -w 80% -d '#{pane_current_path}' "\
    pipx run copier copy ~/.copier-templates/`ls ~/.copier-templates |\
    fzf --header $(pwd) --preview='tree ~/.copier-templates/{} |\
    lolcat'` . \
    "

I’ve had this on my systems for a few weeks now and I am constantly using it for my tils, blogs, and my .envrc file that goes into all of my projects to make sure that I have a

I often pop into my blog from neovim with the intent to look at just a single series of posts, til, gratitude, or just see today’s posts. Markata has a great way of mapping over posts and returning their path that is designed exactly for this use case.

Markata listing out posts from the command line

To tie these into a Telescope picker you add the command as the find_command, and comma separate the words of the command, with no spaces. I also added --sort,date,--reverse in there so that the newest posts are closest to the cursor.

nnoremap geit <cmd>Telescope find_files find_command=markata,list,--map,path,--filter,date==today<cr> nnoremap geil...

...