Archive

All published posts

2469 posts latest post 2026-05-08
Publishing rhythm
Apr 2026 | 47 posts
I recently discovered twint [1] by twintproject [2], and it’s truly impressive. An advanced Twitter scraping & OSINT tool written in Python that doesn’t use Twitter’s API, allowing you to scrape a user’s followers, following, Tweets and more while evading most API limitations. References: [1]: https://github.com/twintproject/twint [2]: https://github.com/twintproject
I like pytest-dev’s [1] project pluggy [2]. A minimalist production ready plugin system References: [1]: https://github.com/pytest-dev [2]: https://github.com/pytest-dev/pluggy
to-mc [1] has done a fantastic job with checksumdir [2]. Highly recommend taking a look. Simple package to compute a single deterministic hash of the file contents of a directory. References: [1]: https://github.com/to-mc [2]: https://github.com/to-mc/checksumdir
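To make "a single deterministic hash of the file contents of a directory" concrete, here is a minimal standard-library sketch of the idea. It is not checksumdir's implementation or API; the function name `directory_hash` and the choice of MD5 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def directory_hash(root: str) -> str:
    """Return one MD5 hex digest covering the contents of every file under root."""
    digest = hashlib.md5()
    # Sort the paths so the result does not depend on filesystem listing order.
    for path in sorted(p for p in Path(root).rglob("*") if p.is_file()):
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

Sorting the paths before hashing is what makes the result deterministic: the same files always feed the digest in the same order.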

Minimal Kedro Pipeline

How small can a minimum kedro pipeline ready to package be? I made one within 4 files that you can pip install. It’s only a total of 35 lines of python, 8 in setup.py and 27 in mini_kedro_pipeline.py. 📝 Note this is only a composable pipeline, not a full project, it does not contain a catalog or runner. Minimal Kedro Pipeline # [1] I have everything for this post hosted in this GitHub repo [2], you can fork it, clone it, or just follow along. Installation # [3] pip install git+https://github.com/WaylonWalker/mini-kedro-pipeline Caveats # [4] This repo represents the minimal amount of structure to build a kedro pipeline that can be shared across projects. It’s installable, and drops right into your hooks.py or run.py modules. It is not a runnable pipeline. At this point I think the config loader requires a logging config file. This is a sharable pipeline that can be used across many different projects. Usage # [5] # hooks.py import mini_kedro_project as mkp class Pro...

Markdown Cli

This is a post that may be a work in progress for a while. It’s a collection of thoughts on managing my blog, but could be translated into anything that is just a collection of markdown. Listing things # [1] - posts - tags - draft posts data # [2] - frontmatter - filepath - content - template - html [3] render content # [4] - Markdown.Markdown - support extensions frontmatter cleaning. # [5] - provide ways to hook in or clean up the frontmatter Markata.Markata methods # [6] - load - render - save Markata.Post methods # [7] - load - render - save Markata plugins # [8] - before_load - before_post_load - after_load - after_post_load - before_save - before_post_save - after_save - after_post_save Markata plugins # [9] - cleanse_frontmatter - html_feed - json_feed - rss_feed - save_posts CLI # [10] $ markata list tags python data $ markata [ { "title": "post title", "description": "this is a post", "filepath": "path_to.md", "content": "the ...

My Content Strategy For 2021

I am making another push in 2021 to get my content out in the world and meeting users where they are. See how I plan to execute. Platforms # [1] - waylonwalker.com - Twitter - DEV - hashnode - Medium - LinkedIn - Anchor Markdown # [2] My content is written in markdown, all markdown. I find that markdown does a really great job at getting out of the way and letting ideas flow onto the page. I am never fussing with fonts and formatting while physically writing posts. Not that I don’t spend way more time than I need to tweak these things on my own personal site where everything gets posted. Articles # [3] Much of what I create is inside of short articles that get posted to my personal site waylonwalker.com [4]. These will get cross-posted to DEV [5], hashnode [6], Medium [7]. I have made cross-posting a bit easier for myself by posting the markdown for each article next to the post on my personal site. Add .md to any post and there is the source. Should I be giving my art...
3 min read 💬 1

Quickly Edit Posts

Recently I automated starting new posts with a python script. Today I want to work on the next part, which is editing those posts quickly. Automating my Post Starter [1] Check out this post about setting up my posts with python 🐍 Enter Bash # [2] For the process of editing a post I just need to open the file in vim quickly. I don’t need much in the way of parsing and setting up the frontmatter. I think this is a simple job for a bash script and fzf. - change to the root of my blog - fuzzy find the post - open it with vim - change back to the directory I was in bash function # [3] For this I am going to go with a bash function. This is partly due to being able to track where I was and get back. Also the line with nvim will run fzf every time you source your ~/.alias file which is not what we want. Let’s set up the boilerplate. It’s going to create a function called ep "edit post", track our current directory, create a sub function _ep. Then call that function and cd back to where...

Gitui is a blazing fast terminal git interface

Gitui is a terminal-based git [1] user interface (TUI) that will change the way that you work with git. I have been a long-time user of the git cli, and it’s been hard to beat, mostly because there is nothing that keeps my fingers on the keyboard quite like it, except gitui which comes with some great ways to very quickly walk through a git project. installation # [2] Go to their [releases](https://github.com/extrawurst/gitui/releases) page, download the latest build, and pop it on your PATH. I have the following stuffed away in some install scripts to get the latest version. install latest release GITUI_VERSION=$(curl --silent https://github.com/extrawurst/gitui/releases/latest | tr -d '"' | sed 's/^.*tag\///g' | sed 's/>.*$//g' | sed 's/^v//') wget https://github.com/extrawurst/gitui/releases/download/v${GITUI_VERSION}/gitui-linux-musl.tar.gz -O- -q | sudo tar -zxf - -C /usr/bin/ run gitui # [3] It opens blazing fast. gitui Quick Commits # [4] Sometimes I edit a number of fi...
2 min read

Kedro - My Data Is Not A Table

In python data science/engineering most of our data is in the form of some sort of table, typically a DataFrame from a library like pandas, spark, or dask. DataFrames are the heart of most pipelines # [1] These containers for data contain many convenient methods to manipulate table like data structures. Sometimes we leverage other data types, namely vanilla types like lists and dicts, or even numpy data types. What is Kedro [2] unfamiliar with kedro, check out this post Sometimes datasets are not tables # [3] There are times when our data doesn’t fit nicely into a DataFrame. Lucky for us Kedro has pickle support out of the box. Pickle is a way to store any python object to disk. Beware that pickle files coming from an unknown source can run malicious code and are considered unsafe. For the most part though when you read and write your own pickle files they are a good tool to consider. See more about pickle [4] from python.org. Cataloging Pickle # [5] I may have a dictionary ...
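As a rough illustration of what pickle does for non-tabular data, the sketch below round-trips a nested dictionary through a file using only the standard library. It does not use Kedro's catalog; the `model_params` contents and the file name are made up for the example.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical non-tabular data: a nested dict that would not fit a DataFrame.
model_params = {"name": "baseline", "weights": [0.1, 0.2, 0.7], "meta": {"version": 1}}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model_params.pkl"
    # Serialize the object and write the bytes to disk.
    path.write_bytes(pickle.dumps(model_params))
    # Read the bytes back and rebuild the object.
    # Only unpickle files you wrote yourself or otherwise trust.
    restored = pickle.loads(path.read_bytes())
```

The restored object compares equal to the original, which is the property that makes pickle handy for passing arbitrary Python objects between pipeline steps.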
Check out asottile [1] and their project babi [2]. a text editor References: [1]: https://github.com/asottile [2]: https://github.com/asottile/babi

Quickly Change Conda Env With Fzf

Changing conda environments is a bit verbose, I use a function with fzf that both lists environments and selects the one I want in one go. Conda # [1] I have used conda as a virtual environment [2] tool for years now. I started using conda for its simplicity to install packages on windows, but now that has gotten so much better and it’s been years since I have run a conda install command. I’m sure that I could use a different environment manager, but it works for me and makes sense. What environment manager do you use for python? Conda environments are stored in a central location such as ~/miniconda3/envs/ and not with the project. They contain both the python interpreter and packages for that env. Conda create # [3] Conda environments are created with the conda create command. At this point, you will need to name your env and select the python version. conda create -n my_env python=3.8 After running this command you will have a directory ~/miniconda3/envs/my_env with a base...
3 min read

Vim Replace Visual Star

Replacing text based on what’s in the current search register is a quite handy tool that I use often. I believe I picked this tip up from Nick Janetakis, check out his YouTube channel for some amazing vim tips. https://www.youtube.com/watch?v=fP_ckZ30gbs If there is one thing that I like most about vim it’s the ability to hack on it and make it work well for you. Replacing text in vim # [1] Vim can often be a bit verbose, but that’s ok because we can hack on it, and make our own shortcuts and keybindings. For instance, finding and replacing text requires using a command at the vim command-line :. Replacing foo with bar looks like this :%s/foo/bar/g, the final g means all of the foos, not just the first one on the line. making it better # [2] I have a keybinding in my init.vim that will allow me to search for a pattern with the usual / character, page through them as normal with n and N, but when I press <C-R> it will populate the replace command for me so that all I need to do ...
2 min read 💬 3
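For readers who think in Python, the difference between :%s/foo/bar/ and :%s/foo/bar/g from the excerpt above can be sketched with re.sub applied line by line. This is only an analogy: Vim's regex dialect differs from Python's, and the `substitute` helper is a name invented for this example.

```python
import re


def substitute(text: str, pattern: str, replacement: str, every_match: bool) -> str:
    """Apply a substitution to each line, like Vim's :%s with or without the g flag."""
    # In re.sub, count=0 means "replace every match"; count=1 stops after the first.
    count = 0 if every_match else 1
    return "\n".join(
        re.sub(pattern, replacement, line, count=count) for line in text.split("\n")
    )


text = "foo foo\nfoo bar foo"
first_only = substitute(text, "foo", "bar", every_match=False)
all_matches = substitute(text, "foo", "bar", every_match=True)
```

Without the flag only the first foo on each line changes; with it every foo on every line changes.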

Minimal Python Package

What does it take to create an installable python package that can be hosted on pypi? What is the minimal python package # [1] - setup.py - my_module.py This post is somewhat inspired by the bottle framework, which is famously created as a single python module. Yes, a whole web framework is written in one file. Directory structure # [2] . ├── setup.py └── my_pipeline.py setup.py # [3] from setuptools import setup setup( name="", version="0.1.0", py_modules=["my_pipeline", ], install_requires=["kedro"], ) name # [4] The name of the package can contain any letters, numbers, “_”, or “-”. Even if it’s for internal/personal consumption only I usually check for discrepancy with pypi so that you don’t run into conflicts. Note that pypi treats “-” and “_” as the same thing, beware of name clashes version # [5] This is the version number of your package. Most packages follow semver [6]. At a high level it’s three numbers separated by a . that follow the format major.minor.patc...
2 min read
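The note above about pypi treating “-” and “_” as the same thing follows from the name normalization rule in PEP 503. A small sketch of that rule, useful for checking whether two candidate names would clash (the function name `normalize` is just a label chosen here):

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as PEP 503 describes.

    Runs of "-", "_", and "." collapse to a single "-", and the result is lowercased.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


# Two spellings that normalize to the same project name.
same_project = normalize("my_pipeline") == normalize("My-Pipeline")
```

If two names normalize to the same string, only one of them can exist on the index.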
Check out jameslittle230 [1] and their project stork [2]. 🔎 Impossibly fast web search, made for static sites. References: [1]: https://github.com/jameslittle230 [2]: https://github.com/jameslittle230/stork

If Tmux

I do much of my work from tmux, I love it so much that I want to set up some functionality that puts me in tmux even if I didn’t ask for it. Bash Function # [1] Bash function to check if the shell is in a tmux session. in_tmux () { if [ -n "$TMUX" ]; then return 0 else return 1 fi } Using the bash function # [2] I often open up vim to do some quick edits, but before I know it I have several splits open and I need access to another shell utility, but I forgot to start in tmux. This function makes sure that I start in tmux every time. Using in_tmux to ensure vim is opened in tmux. vim () { in_tmux \ && nvim \ || bash -c "\ tmux new-session -d;\ tmux send-keys nvim Space +GFiles C-m;\ tmux -2 attach-session -d; " } I am not quite sure if this is proper use of the && and ||, let me know if you have a better way to do one thing if in_tmux returns true and another if it returns false. References: [1]: #bash-function [2]: #using-the-bash-function
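On the && and || question: in a chain like a && b || c, the c branch also runs when b itself fails, so an explicit if/else is the safer way to pick one of two actions. Here is the same decision sketched in Python. The `editor_command` helper and the simplified `tmux new-session nvim` command are assumptions for illustration, not the post's exact behavior (it drops the +GFiles step).

```python
import os


def in_tmux(environ=None) -> bool:
    """Mirror the bash check: tmux sets $TMUX inside a session."""
    env = os.environ if environ is None else environ
    return bool(env.get("TMUX"))


def editor_command(environ=None) -> list:
    """Pick exactly one command to run; the else branch never runs after the if branch."""
    if in_tmux(environ):
        return ["nvim"]
    # Simplified stand-in: start a new tmux session that runs nvim directly.
    return ["tmux", "new-session", "nvim"]
```

Passing a dict as `environ` makes the check easy to try out without touching the real environment, for example `editor_command({})`.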
If you’re into interesting projects, don’t miss out on vim-commentary [1], created by tpope [2]. commentary.vim: comment stuff out References: [1]: https://github.com/tpope/vim-commentary [2]: https://github.com/tpope
I’m really excited about vim-fugitive [1], an amazing project by tpope [2]. It’s worth exploring! fugitive.vim: A Git [3] wrapper so awesome, it should be illegal References: [1]: https://github.com/tpope/vim-fugitive [2]: https://github.com/tpope [3]: /glossary/git/
The work on vim-surround [1] by tpope [2]. surround.vim: Delete/change/add parentheses/quotes/XML-tags/much more with ease References: [1]: https://github.com/tpope/vim-surround [2]: https://github.com/tpope
If you’re into interesting projects, don’t miss out on kedro-starters [1], created by kedro-org [2]. Templates for your Kedro projects. References: [1]: https://github.com/kedro-org/kedro-starters [2]: https://github.com/kedro-org

Save Vim Macro

If you are like me, you have created a macro or two that is pure glory, and you forget how you made it after a day or so, or you immediately want to store it away as a custom keybinding. As with most things with vim, it’s easy to do once you understand it. Creating a Macro # [1] One of the earliest things we all learn to do in vim is to create macros, custom sets of functionality stored in a register that can be replayed later. To create a macro, get into normal mode, then type q followed by a letter that you want to store the macro under. qq Note: a common throw-away macro register is q because it’s easy to hit qq from normal mode to start recording. Replaying a Macro # [2] Macros can be replayed using @ followed by the letter that you stored the macro under. @q Registers # [3] Registers are nothing more than a single character key mapping to a value of some text. As you yank, delete, or create macros in vim, it automatically stores text into these registers. When you hit...
3 min read 💬 3

Live Substitution In Neovim

Replacing text in vim can be quite frustrating especially since it doesn’t have live feedback to what is changing. Today I was watching Josh Branchaud’s Vim-Unalphabet series on YouTube and realized that his vim was doing this and I had to have it. https://twitter.com/_WaylonWalker/status/1346081617199198210 How to do it # [1] I had to do a bit of searching and found a great post from vimcasts [2] that shows exactly how to get the live search and replace highlighting using inccommand :h inccommand # [3] 'inccommand' 'icm' string (default "") global "nosplit": Shows the effects of a command incrementally, as you type. "split" : Also shows partial off-screen results in a preview window. Works for |:substitute|, |:smagic|, |:snomagic|. |hl-Substitute| If the preview is too slow (exceeds 'redrawtime') then 'inccommand' is automatically disabled until |Command-line-mode| is done. Add this to your config # [4] I believe that this is a neovim only feature, add it into your ~...
khzaw [1] has done a fantastic job with vim-conceal [2]. Highly recommend taking a look. A vim plugin making use of vim’s conceal feature for additional visual eyecandy. References: [1]: https://github.com/khzaw [2]: https://github.com/khzaw/vim-conceal

Newsboat

Web browsers are a black hole of productivity. I try to use them as little as possible when it is time to focus. I try to use help, ?, or ?? with ipython, or --help at the command line as much as possible. What about that time I am trying to see what my online friends are posting on their sites? I used to use google reader quite heavily before that was taken down. Newsboat # [1] I am going to give a terminal rss reader a try for a bit and see how that goes for me. I have really struggled to get into an rss reader since google reader died. installation # [2] I installed with the recommended snap for Ubuntu. sudo snap install newsboat Adding feeds # [3] super simple Running help for newsboat directed me towards their config files at the bottom. ❯ newsboat --help newsboat r2.22 usage: /snap/newsboat/3849/usr/local/bin/newsboat [-i <file>|-e] [-u <urlfile>] [-c <cachefile>] [-x <command> ...] [-h] -e, --export-to-opml export OPML feed to stdout -r, --refresh-on-start refresh f...

Large Refactor At The Command Line

As projects grow, patterns that worked early on break and we need to change things to make the project easier to work with, and more welcoming to new developers. git # [2] Before you start mucking up a project with wild commands at the terminal check that you have a super clean git status. We may make some mistakes and need a way to undo changes to 100s of files, and git makes it really easy if you start with a clean history. git status If we are ready to begin work we should see a response like this. On branch main nothing to commit, working tree clean It would also be wise to do this inside of a branch. The minute you try to do something wild in your working branch someone will walk by and ask you to do a five-minute task, but you’re deep in refactoring and haven’t left yourself a clean way back. git branch my-big-refactor grepr # [3] Time for the meat of this refactor replacing text across our project. I often will pop this bash function into my terminal session and tweak it as needed. This...
4 min read

Ipython-Config

I use my ipython terminal daily. It’s my go to way of running python most of the time. After you use it for a little bit you will probably want to set up a bit of your own configuration. install ipython # [1] Activate your virtual environment [2] of choice and pip install it. Any time you are running your project in a virtual environment, you will need to install ipython inside it to access those packages from ipython. pip install ipython You are using a virtual environment right? Virtual environments like venv or conda can save you a ton of pain down the road. profile_default # [3] When you install ipython you start out with no config at all. Running ipython profile create will start a new profile called profile_default that contains all of the default configuration. ipython profile create This command will create a directory ~/.ipython/profile_default multiple configurations # [4] You can run multiple configurations by naming them with ipython profile create [profile_name...
2 min read

Custom Ipython Prompt

I’ve grown tired of the standard ipython prompt as it doesn’t do much to give me any useful information. The default one gives out a line number that only seems to add anxiety as I am working on a simple problem and see that number grow to several hundred. I start to question my ability 🤦‍♂️. Configuration # [1] If you already have an ipython config you can move on otherwise check out this post on creating an ipython config. Ipython-Config [2] The Dream Prompt # [3] I want something similar to the starship prompt I am using in the shell. I want to be able to quickly see my python version, environment name, and git [4] branch. - python version - active environment - git branch [5] This is my zsh prompt I am using for inspiration Basic Prompt # [6] This is mostly boilerplate that I found from various google searches, but this gets me a basic green chevron as my prompt. from IPython.terminal.prompts import Prompts, Token class MyPrompt(Prompts): def in_prompt_tokens(self...
3 min read
I’m impressed by vim-tmux-runner [1] from christoomey [2]. Vim and tmux, sittin’ in a tree… References: [1]: https://github.com/christoomey/vim-tmux-runner [2]: https://github.com/christoomey
I like fkromer’s [1] project awesome-kedro [2]. No description available. References: [1]: https://github.com/fkromer [2]: https://github.com/fkromer/awesome-kedro
Just starred machfiles [1] by ChristianChiarulli [2]. It’s an exciting project with a lot to offer. The dotfiles you see in all my videos References: [1]: https://github.com/ChristianChiarulli/machfiles [2]: https://github.com/ChristianChiarulli
Check out LunarVim [1] and their project LunarVim [2]. 🌙 LunarVim is an IDE layer for Neovim. Completely free and community driven. References: [1]: https://github.com/LunarVim [2]: https://github.com/LunarVim/LunarVim
Looking for inspiration? joelhooks-com [1] by joelhooks [2]. playing with static pages References: [1]: https://github.com/joelhooks/joelhooks-com [2]: https://github.com/joelhooks

Automating my Post Starter

One thing we all dread is the mundane work of getting started, and all the hoops it takes to get going. This year I want to post more often and I am taking some steps towards making it easier for myself to just get started. When I start a new post I need to cd into my blog directory, start neovim in a markdown file with a clever name, copy some frontmatter boilerplate, update the post date, add tags, a description, and a cover. Todo List for starting a post # [1] - frontmatter template - Title - slug - tags - date - cover - description - create markdown file - open in neovim Let’s Automate this # [2] This ain’t no proper cli # [3] hot and fast As with many things running behind the scenes on this site, I am the one and only user, I have limited time, so this is going to be a bit hot and fast. Let’s create a file called new-post. start the script new-post #!python # new-post 👆 Works on my machine If this were something that had more users than me I would probably use some...
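The frontmatter part of the todo list above could look roughly like the sketch below. The rest of the script is cut off in this excerpt, so this is not the author's new-post script; the `slugify` and `frontmatter` helpers and the exact field layout are assumptions based only on the todo list.

```python
import re
from datetime import date


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def frontmatter(title: str, tags: list, today: date = None) -> str:
    """Build a frontmatter block with the fields from the todo list."""
    today = today or date.today()
    lines = [
        "---",
        f"title: {title}",
        f"slug: {slugify(title)}",
        f"tags: [{', '.join(tags)}]",
        f"date: {today.isoformat()}",
        "cover: ''",
        "description: ''",
        "---",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file named after the slug and opening that file in neovim would cover the last two items on the list.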

Windowing Python Lists

In python data science we often will reach for pandas a bit more than necessary. While pandas can save us so much there are times where there are alternatives that are much simpler. The itertools and more-itertools libraries are full of cases of this. This post is a walkthrough of me solving a problem with more-itertools rather than reaching for a for loop, or pandas. I am working on a one-line-link expander for my blog. I ended up doing it, just by modifying the markdown with python. I first split the post into lines with content.split('\n'), then look to see if the line appears to be just a link. One more safety net that I wanted to add was to check if there was whitespace around the line, this could not simply be done in a list comprehension by itself. I need just a bit of knowledge of the surrounding lines, enter more-itertools. simplified rendering function # [1] I have a function that will check to see if the line should be expanded, then render the correct template. First step is to ...
1 min read
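The "knowledge of the surrounding lines" idea from the excerpt above can be sketched with a three-line sliding window. This version uses only the standard library as a stand-in. more-itertools offers a `windowed` helper for this kind of window, but the excerpt does not show which function the post actually uses, and the helper names and sample content here are made up.

```python
def windows_of_three(lines):
    """Yield (previous, current, next) for each line, padding both ends with ''."""
    padded = ["", *lines, ""]
    return zip(padded, padded[1:], padded[2:])


def is_standalone_link(prev: str, line: str, nxt: str) -> bool:
    """A line counts as a one-line link if it starts with http and has blank neighbors."""
    return line.startswith("http") and prev.strip() == "" and nxt.strip() == ""


content = "intro\n\nhttps://example.com/post\n\nsee https://example.com inline"
lines = content.split("\n")
standalone = [
    line
    for prev, line, nxt in windows_of_three(lines)
    if is_standalone_link(prev, line, nxt)
]
```

Because each window carries the neighbors along, the whole check fits back into a single list comprehension.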
WaylonWalker [1] has done a fantastic job with devtainer [2]. Highly recommend taking a look. 🐳 (dotfiles) My personal development docker container base image References: [1]: https://github.com/WaylonWalker [2]: https://github.com/WaylonWalker/devtainer
WaylonWalker [1] has done a fantastic job with WaylonWalker [2]. Highly recommend taking a look. Learning in public References: [1]: https://github.com/WaylonWalker [2]: https://github.com/WaylonWalker/WaylonWalker
Check out aoc [1] by ThePrimeagen [2]. It’s a well-crafted project with great potential. 2020 References: [1]: https://github.com/ThePrimeagen/aoc [2]: https://github.com/ThePrimeagen
Check out ZaxR [1] and their project bulwark [2]. Bulwark is a package for convenient property-based testing of pandas dataframes. References: [1]: https://github.com/ZaxR [2]: https://github.com/ZaxR/bulwark
mariokostelac [1] has done a fantastic job with sagemaker-setup [2]. Highly recommend taking a look. Useful scripts for making AWS SageMaker better References: [1]: https://github.com/mariokostelac [2]: https://github.com/mariokostelac/sagemaker-setup
I like pypeaday’s [1] project aoc-2020 [2]. Advent of Code 2020 References: [1]: https://github.com/pypeaday [2]: https://github.com/pypeaday/aoc-2020
I’m really excited about auto-editor [1], an amazing project by WyattBlue [2]. It’s worth exploring! Auto-Editor: Efficient media analysis and rendering References: [1]: https://github.com/WyattBlue/auto-editor [2]: https://github.com/WyattBlue

Adding Audio to my blog posts

This is episode 1 of the Waylon Walker Audio experience, posts from waylonwalker.com [1] in audio form. So I have had this idea for a while to add audio to my blog posts. The idea partly comes from the aws blog, if you have ever been on their blog you will have noticed that they have a voiced by amazon polly section. What to Expect # [2] Honestly I don’t know, this is all new to me and I don’t have much to go off of. For now it’s a test that may or may not work out. I will say that the time that I have available for clean audio is a bit limited so expect these to come out in batches as I get time to go back and record. What Not to Expect # [3] One thing that makes the aws blog really hard to listen to is the robotic voice, I definitely don’t want that. This will be voiced by a real human, Me. At the same time written text doesn’t translate directly to audio well so don’t necessarily expect the audio to be word for word. Code blocks # [4] There are a lot of code block...
Check out yetudada [1] and their project yetudada [2]. No description available. References: [1]: https://github.com/yetudada [2]: https://github.com/yetudada/yetudada
I’m impressed by quickpython [1] from timothycrosley [2]. A retro interactive coding environment powered by Python and nostalgia References: [1]: https://github.com/timothycrosley/quickpython [2]: https://github.com/timothycrosley

gatsby-remark-embedder

Inspired by discourse’s link expansion I am rolling out expansions for one line links on the blog waylonwalker [1]. I was able to find a gatsby plugin gatsby-remark-embedder [2] that expands one line links for social cards for popular platforms like twitter and YouTube through a response from Kyle Mathews to my tweet. https://twitter.com/kylemathews/status/1329817928666005504 Use Cases # [3] This covers a couple of use cases I have with very little effort. - Twitter - YouTube install # [4] npm i gatsby-remark-embedder gatsby-plugin-twitter This was super quick and simple to set up, the only thing that was extra was to install the gatsby-plugin-twitter plugin as well as the gatsby-remark-embedder. enable # [5] // In your gatsby-config.js module.exports = { // Find the 'plugins' array plugins: [ `gatsby-plugin-twitter`, { resolve: `gatsby-transformer-remark`, options: { plugins: [ { resolve: `gatsby-remark-embedder`, options: { customTransformers: [ // Your custom t...

Expand One Line Links

I wanted a super simple way to cross-link blog posts that requires as little effort as possible, yet still looks good in vanilla markdown in GitHub. I have been using a snippet that puts HTML [1] into the markdown. While this works, it’s more manual/difficult for me, does not look the best, and does not read well as Goals for new card # [2] The new card should be fully automated to expand with title, description, and cover image. Bonus if I am able to attach a comment behind it. - fully automated - card expansion - Title - description - cover image Old Card # [3] If you can call it a card 🤣. This card was just an image wrapped in an anchor tag and a paragraph tag. I found this was the most consistent way to get an image narrower and centered in both GitHub and dev.to. <p style='text-align: center'> <a href='https://waylonwalker.com/notes/eight-years-cat/'> <img style='width:500px; max-width:80%; margin: auto;' src="https://images.waylonwalker.com/eight-years-cat.png" al...
astronomer [1] has done a fantastic job with dag-factory [2]. Highly recommend taking a look. Dynamically generate Apache Airflow DAGs from YAML configuration files References: [1]: https://github.com/astronomer [2]: https://github.com/astronomer/dag-factory
orchest [1] by orchest [2] is a game-changer in its space. Excited to see how it evolves. Build data pipelines, the easy way 🛠️ References: [1]: https://github.com/orchest/orchest [2]: https://github.com/orchest

Find and Replace in the Terminal.

grepr # [1] grepr() { grep -iRl "$1" | xargs sed -i "s/$1/$2/g"; } grepd # [2] grepd() { grep -iRl "$1" | xargs sed -i "/^$1/d"; } CocSearch # [3] :CocSearch published: false -g *.md References: [1]: #grepr [2]: #grepd [3]: #cocsearch
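For comparison, here is a rough Python counterpart to grepr: find every file under a directory that contains a pattern and rewrite it in place. It is a sketch under a few assumptions rather than a drop-in replacement. It uses Python regex syntax instead of sed's, matches case-sensitively throughout (the shell version greps with -i but substitutes case-sensitively), and the `glob` parameter is an addition for narrowing the file set.

```python
import re
from pathlib import Path


def grepr(root: str, pattern: str, replacement: str, glob: str = "*") -> list:
    """Replace pattern in every text file under root that contains it.

    Returns the list of paths that were rewritten.
    """
    changed = []
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip files that are not valid UTF-8 text
        new_text = re.sub(pattern, replacement, text)
        if new_text != text:
            path.write_text(new_text, encoding="utf-8")
            changed.append(path)
    return changed
```

As with the shell version, run it on a clean git working tree so that `git diff` shows exactly what changed and `git checkout .` can undo it.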
gvanrossum [1] has done a fantastic job with patma [2]. Highly recommend taking a look. Pattern Matching References: [1]: https://github.com/gvanrossum [2]: https://github.com/gvanrossum/patma

Resume Tips

- customize for the job - Why are you a good fit? - What will you bring to the role? - Give real outcomes - give real experience - Stop tech vomiting - if you link to GitHub - Make a profile readme - Guide me to your best work - have some activity - if you link to LinkedIn - Provide some benefit that is not on your resume - Have a logical flow of experience (don’t make me hunt for past experience) - Keep it under 2 pages - Who you know. - Reference real experience - Deployed 12 data pipelines with over 500 nodes to process 200GB of data at a Fortune 100 company - vs - Knowledge of Data Engineering methodology with python EC2 - Don’t be so fluffy
1 min read