Published
All published posts
Just starred python-interrogate-check by JackMcKew. It’s an exciting project with a lot to offer.
GitHub Action for use with python package interrogate
How to find things in your kedro catalog
kedro 0.16.2 just dropped last week with a long-awaited feature… catalog search! I went as far as monkey patching this into each of my projects. I jump between a few really big projects that have tons of datasets. Being able to quickly search for what I need is so useful.
The kedro data catalog is a key component of the kedro framework. It handles all data loading and saving for you. It is configurable and hackable. Having all your data connections listed in one place makes it so easy to pick your project up and move it to a completely new environment. That sweet imperative loading style saves so much read/write overhead. I can load all my data with a single command whether it’s in Amazon S3, Google Cloud Platform, or a local file.
Just like with most of these articles, I am going to create a conda environment so that I don’t break any existing projects and scaffold up a toy project to learn from.
...
Check out davidesantangelo and their project datoji.
A tiny JSON storage service. Create, Read, Update, Delete and Search JSON data.
My first eight years as a working professional.
This day 8 years ago I started my first day as a Mechanical Engineer. I am so grateful for this journey that I have been able to have. There is no way that I could have planned this journey from the beginning.
My initial career plans were down a completely different path. I have been very flexible in taking on a new career path. I have been eager to learn new things and respond to life changes that I never would have imagined.
Very severe chronic health issues from my family restricted my ability to travel to the facilities I served as a Mechanical Engineer. I was able to stay strong and make it work. But in the meantime, I was learning new skills that enabled me to be more effective remotely.
...
How Kedro handles your inputs
Passing inputs into kedro is a key concept. Passing a single catalog key as input is straightforward, but passing a list or dictionary of catalog entries can be a bit confusing.
Check out this post for a review of how *args and **kwargs work in Python.
understanding python *args and **kwargs
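As a rough illustration of the three input styles, here is a minimal sketch in plain Python. The `run_node` helper and the dictionary standing in for the catalog are hypothetical and are not kedro's API; they only mimic the idea that a single key becomes one argument, a list is unpacked like *args, and a dictionary is unpacked like **kwargs.

```python
def run_node(func, inputs, catalog):
    """Hypothetical helper (not kedro's API) that imitates input dispatch."""
    if isinstance(inputs, str):
        # A single catalog key: load it and pass it as the only argument.
        return func(catalog[inputs])
    if isinstance(inputs, list):
        # A list of keys: load each one and unpack them positionally (*args).
        return func(*[catalog[name] for name in inputs])
    if isinstance(inputs, dict):
        # A dict of {argument name: catalog key}: unpack as keywords (**kwargs).
        return func(**{arg: catalog[name] for arg, name in inputs.items()})
    raise TypeError("inputs must be a str, list, or dict")


# A plain dict stands in for the data catalog in this sketch.
catalog = {"cars": [1, 2], "trucks": [3]}


def count(data):
    return len(data)


def combine(a, b):
    return a + b


single = run_node(count, "cars", catalog)
from_list = run_node(combine, ["cars", "trucks"], catalog)
from_dict = run_node(combine, {"a": "trucks", "b": "cars"}, catalog)
```

With a list, the order of the keys decides which argument each dataset lands in; with a dictionary, the argument names decide, so the order no longer matters.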
...
I came across kedro-great from tamsanh, and it’s packed with great features and ideas.
The easiest way to integrate Kedro and Great Expectations
I came across awesome-public-datasets from awesomedata, and it’s packed with great features and ideas.
A topic-centric list of HQ open datasets.
Master No More
It’s been a long time coming. We use some very harsh language within tech so often that we become numb to it. It’s time to do my very small part in this movement and purge this language from my active repos, starting with this blog right here.
Large Refactor At The Command Line
This post follows my method of refactoring code bases from the command line; read more about that in this article.
...
Refactoring your blog urls
I just did a quick refactoring of my JAMStack blog urls. Some didn’t fit with my style, some had _ that I wanted to switch to -, and others were ridiculously long. I’ve been using forestry as my CMS, I write many of my posts there, and sometimes it picks some crazy file names (based on my titles). It was time to refactor.
Large Refactor At The Command Line
When refactorings like this get really big, I often need to do a project-wide find and replace. I usually do this right from the...
...
understanding python *args and **kwargs
Python *args and **kwargs are super useful tools that, when used properly, can make your code much simpler and easier to maintain. Large manual conversions from a dataset to function arguments can be packed and unpacked into lists or dictionaries. Beware though, this power can lead to some really unreadable/unusable code if done wrong.
*args is some magical syntax that will collect positional function arguments into a tuple, or unpack a list into individual arguments.
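A small sketch of both directions, using made-up function names (`total` and `describe` are illustrative only). Inside the function, `args` arrives as a tuple and `kwargs` as a dictionary; at the call site, `*` and `**` unpack a list or dictionary back into individual arguments.

```python
def total(*args):
    # args is a tuple holding every positional argument that was passed.
    return sum(args)


def describe(**kwargs):
    # kwargs is a dict mapping each keyword name to its value.
    return ", ".join(f"{key}={value}" for key, value in sorted(kwargs.items()))


numbers = [1, 2, 3]
packed = total(1, 2, 3)      # separate arguments are collected into args
unpacked = total(*numbers)   # the list is unpacked into separate arguments

settings = {"sep": ",", "header": True}
summary = describe(**settings)  # the dict is unpacked into keyword arguments
```

Both calls to `total` receive the same three arguments; only the way they are written at the call site differs.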
...
I recently discovered pytest-sugar by Teemu, and it’s truly impressive.
a plugin for py.test that changes the default look and feel of py.test (e.g. progressbar, show tests that fail instantly)
the-hub by ari-hacks is a game-changer in its space. Excited to see how it evolves.
📈📊 A hub where users can experiment with graphing and Python in the browser (https://pyodide-experiment.herokuapp.com/)
pre-commit is awesome
I recently discovered the ✨ awesomeness that is pre-commit. I steered away from it for so long because it seemed like a big daunting thing to set up, but really it’s easy. It will automatically run checks for you. In some cases, it will even automatically fix them for you. Out of the box, it will do things like automatically trim extra whitespace, fix file endings, and ensure file sizes are not too large for git.
...
Building kedro.dev
Follow along on the journey as I build out kedro.dev.
I have really enjoyed my own personal journey as I have started to build all of my data pipeline projects with the kedro framework. I want to start building a place to share resources with the community. I want to see this community grow and flourish. They say in front end web development if you are not using a framework you end up building one. That’s exactly what I was doing before I started using kedro. I want to build out a set of resources that this community can learn from and start to use the framework at their own pace without needing to develop their own from scratch.
I am looking into the front end frameworks to see how they welcome their community. Much of my inspiration comes from them, bringing lessons learned to data.
...
The work on desert by python-desert.
Deserialize to objects while staying DRY
I recently discovered kedro-wings by tamsanh, and it’s truly impressive.
Kedro Wings automatically creates catalog entries to simplify Kedro pipeline writing. See the video here: https://www.youtube.com/watch?v=p4ELo1tqbYY
Check out kedro-streaming-twitter-pipeline by dataengineerone. It’s a well-crafted project with great potential.
No description available.