Published

All published posts

2540 posts, latest post 2026-06-16
Publishing rhythm
May 2026 | 58 posts

Background Tasks in Python for Data Science

This post is intended as an extension/update of background tasks in python [1]. I started using background the week that Kenneth Reitz released it. It takes away so much boilerplate from running background tasks that I use it in more places than I probably should. After taking a look at that post today, I wanted to put a better data science example in here to help folks get started. Before we get into it, I want to give a shout-out to Kenneth Reitz for making this so easy. Kenneth is a Python god for all that he has given to the community in so many w...
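The excerpt does not show the background package's API. As a rough sketch of the same idea, here is how a couple of slow data science steps could be pushed into the background with the standard library's `concurrent.futures`, used here as a stand-in rather than the package the post is about. The `summarize` function and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def summarize(name, values):
    """Hypothetical stand-in for a slow data science step."""
    return name, mean(values)


# Hypothetical sample data; a real workload might load files or query a database.
datasets = {"a": [1, 2, 3], "b": [10, 20, 30]}

with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns a Future right away, so each task runs in the background.
    futures = [pool.submit(summarize, name, values) for name, values in datasets.items()]
    # result() blocks until the corresponding task has finished.
    results = dict(future.result() for future in futures)

print(results)
```

Threads suit I/O-bound steps such as reading files or calling services; CPU-heavy work would more likely want a process pool.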
If you’re into interesting projects, don’t miss out on starship [1], created by starship [2]. ☄🌌️ The minimal, blazing-fast, and infinitely customizable prompt for any shell! References: [1]: https://github.com/starship/starship [2]: https://github.com/starship
alttch [1] has done a fantastic job with rapidtables [2]. Highly recommend taking a look. Super fast list of dicts to pre-formatted tables conversion library for Python 2/3 References: [1]: https://github.com/alttch [2]: https://github.com/alttch/rapidtables

📝 Bash Notes

Bash is super powerful.

File System Full # [1]

Show remaining space on drives

    df -h

Show largest files in current directory

    du . -h --max-depth=1

Move files then symlink them

    mkdir /mnt/mounted_drive
    mv ~/bigdir /mnt/mounted_drive
    ln -s /mnt/mounted_drive/bigdir ~/bigdir

Fuzzy One Liners # [2]

    a() {source activate "$(conda info --envs | fzf | awk '{print $

edit in vim

    vf() { fzf | xargs -r -I % $EDITOR % ;}

cat a file

    vf() { fzf | xargs -r -I % $EDITOR % ;}

bash execute

    bf() { bash "$(fzf)"; }

git [3] add

    gadd() { git status -s | fzf -m | awk '{print $2}' | xargs git add && git status -s; }

git reset

    greset() { git status -s | fzf -m | awk '{print $2}' | xargs git reset && git status -s; }

Kill a process

    fkill() { kill $(ps aux | fzf | awk '{print($2)}'); }

Finding things # [4]

Files # [5]

fd-find [6] is amazing for finding files; it even respects your .gitignore file 😲. Install with apt install fd-find.

    fd md
    ag -g python
    find . -name "*.md"

++Vanilla Bonus Content # [7] ** sh...

Autoreload in Ipython

I have used %autoreload for several years now with great success and 🔥 rapid reloads. It allows me to move super fast when developing libraries and modules. They have made some great updates this year that allow class modules to be updated automatically.

What I like about autoreload # [1]

🔥 Blazing Fast
💥 Keeps me in the comfort of my text editor
👏 Allows me to use Jupyter when I need
👟 Extremely Reliable

One of the biggest benefits that I find is that it shortens the distance between my module/library code and test code inside of a terminal/notebook. Now I primarily use Jupyter notebooks for the presentation aspect. I develop code from the comfort of my editor with all of the tools I have set up, and run the functions in a notebook to get the output. From there I might do some aggregations or plots, but the 🥩 meat of development is done outside of Jupyter.

Enabling Autoreload # [2]

📐 config This is a sh...
3 min read
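To make the reload idea concrete, here is a small sketch of what reloading an edited module looks like in plain Python, using the standard library's `importlib.reload`. This is a manual stand-in for illustration, not how IPython's autoreload is implemented, and the module name `demo_mod` is hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip bytecode caching so the reload below always recompiles from source.
sys.dont_write_bytecode = True

# Write a throwaway module (hypothetical name: demo_mod) to a temp directory.
workdir = Path(tempfile.mkdtemp())
module_path = workdir / "demo_mod.py"
module_path.write_text("def answer():\n    return 1\n")

sys.path.insert(0, str(workdir))
demo_mod = importlib.import_module("demo_mod")
before = demo_mod.answer()

# Simulate saving an edit from the text editor.
module_path.write_text("def answer():\n    return 20\n")

# Reload by hand; autoreload's job is to remove this manual step.
importlib.invalidate_caches()
importlib.reload(demo_mod)
after = demo_mod.answer()

print(before, after)
```

The manual `reload` call is the step that gets tedious during development, which is the distance between editor and notebook that the excerpt describes shortening.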
If you’re into interesting projects, don’t miss out on psutil [1], created by giampaolo [2]. Cross-platform lib for process and system monitoring in Python References: [1]: https://github.com/giampaolo/psutil [2]: https://github.com/giampaolo
If you’re into interesting projects, don’t miss out on promote-open-source-project [1], created by zenika-open-source [2]. 📄 How to promote my open source project? References: [1]: https://github.com/zenika-open-source/promote-open-source-project [2]: https://github.com/zenika-open-source
Check out watchtower [1] by kislyuk [2]. It’s a well-crafted project with great potential. Python CloudWatch Logging: Log Analytics and Application Intelligence References: [1]: https://github.com/kislyuk/watchtower [2]: https://github.com/kislyuk
I recently discovered arrow [1] by apache [2], and it’s truly impressive. Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics References: [1]: https://github.com/apache/arrow [2]: https://github.com/apache
Just starred shell-functools [1] by sharkdp [2]. It’s an exciting project with a lot to offer. Functional programming tools for the shell References: [1]: https://github.com/sharkdp/shell-functools [2]: https://github.com/sharkdp

Keyboard Driven VSCode

Throw that mouse away, it's time to set up some keyboard shortcuts. These shortcuts were the baseline for switching from tmux/vim to vscode. Most posts I was able to find gave great tips on replacing vim, but very few focused on the hackability of tmux. tmux allows me to rapidly fire up a workspace and create new windows and splits. Then when I switch tasks I can leave that workspace open and jump right back in later exactly where I left off. There is nothing quite like it. The shortcuts listed here make the transition a bit better. The worst thing I found when using vscode at first was that there was no way to switch between the terminal and editor without the mouse. This first set of keybindings solves that issue.

!!! see-also I have an updated article on my tmux workflow: How I navigate tmux in 2021 [1]

Alt+[hjkl] # [2]

navigation ⬅ jump to left split alt+h ⬇ j...
Looking for inspiration? Jupyter-Atom-Dark-Theme [1] by burglarbenson [2]. A dark theme for Jupyter Lab References: [1]: https://github.com/burglarbenson/Jupyter-Atom-Dark-Theme [2]: https://github.com/burglarbenson
tarpas [1] has done a fantastic job with pytest-testmon [2]. Highly recommend taking a look. Selects tests affected by changed files. Executes the right tests first. Continuous test runner when used with pytest-watch. References: [1]: https://github.com/tarpas [2]: https://github.com/tarpas/pytest-testmon
If you’re into interesting projects, don’t miss out on vim-flog [1], created by rbong [2]. A blazingly fast, stunningly beautiful, exceptionally powerful git [3] branch viewer for Vim/Neovim. References: [1]: https://github.com/rbong/vim-flog [2]: https://github.com/rbong [3]: /glossary/git/
I like mcfunley’s [1] project pugsql [2]. A HugSQL-inspired database library for Python References: [1]: https://github.com/mcfunley [2]: https://github.com/mcfunley/pugsql
I like ggreer’s [1] project the_silver_searcher [2]. A code-searching tool similar to ack, but faster. References: [1]: https://github.com/ggreer [2]: https://github.com/ggreer/the_silver_searcher

Realistic Git Workflow

My git [1] workflow based on real life. It's not always clean and simple. Sometimes things get messy.

The Clean Path # [2]

[3] pull 👉 branch 👉 format 👉 work 👉 add 👉 commit 👉 pull 👉 rebase 👉 push

Pull # [4]

As complicated as that seems, it is pretty straightforward. When you sit down to work, the first thing you do is pull down the team's latest working “develop” branch from git.

    git checkout develop
    git pull

Branch # [5]

Next, create a new branch with a name that will remind you of what you are working on. For your own sanity, choose something descriptive. It is easy to get too many similar branches going and forget which branch is which.

    git checkout -b ingest_product_id_table

Format # [6]

If you know which existing files you will be editing before you start work, it is a good idea to format them in a commit early on to keep your working commits separate from formatting. This will make it easier for reviewers to distinguish your changes from formatting fixes. ...
7 min read
Just starred kedro [1] by kedro-org [2]. It’s an exciting project with a lot to offer. Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular. References: [1]: https://github.com/kedro-org/kedro [2]: https://github.com/kedro-org
Check out forestryio [1] and their project forestry.io [2]. Forestry.io website References: [1]: https://github.com/forestryio [2]: https://github.com/forestryio/forestry.io
Check out maildown [1] by chris104957 [2]. It’s a well-crafted project with great potential. A super simple CLI for sending emails References: [1]: https://github.com/chris104957/maildown [2]: https://github.com/chris104957