-
💭 hotel_bookings.csv
Here's my thought on 💭 hotel_bookings.csv nice dataset to use for example / test projects. I'm using it to play with duckdb currently. !!! note This post is…
-
💭 Choosing color palettes — seaborn 0.13.2 documen...
Here's my thought on 💭 Choosing color palettes — seaborn 0.13.2 documentation Good overview of seaborn color palettes. They have all sorts of different…
-
💭 Proper handling of None in WHERE condition · Iss...
Here's my thought on 💭 Proper handling of None in WHERE condition · Issue #109 · fast... SQLModel models ship with an , and that you can use to compare to…
-
💭 Aaron Francis on X: "📣 We're excited to announce...
Here's my thought on 💭 Aaron Francis on X: "📣 We're excited to announce Mastering Pos... Aaron Francis is a database master, pumped for this dude and all…
-
💭 Database Remote-Copy Tool For SQLite (draft)
Here's my thought on 💭 Database Remote-Copy Tool For SQLite (draft) Simon shared a really cool new utility tool for sqlite inspired by rsync. It checks…
-
💭 Sqlite-jiff
Here's my thought on 💭 Sqlite-jiff Sqlite is getting rust extensions now, and datetimes make it totally worth it if they work well and fast, two things…
-
💭 sql - How can I list the tables in a SQLite data...
Here's my thought on 💭 sql - How can I list the tables in a SQLite database file that... I learned about the sqlite_master table from this stack overflow…
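A minimal sketch of the sqlite_master lookup mentioned above, assuming Python's built-in sqlite3 module. The `list_tables` helper and the `posts` and `tags` tables are hypothetical names made up for illustration.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables recorded in sqlite_master."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical in-memory database with two example tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT)")

tables = list_tables(conn)
print(tables)
```

The same query works from the sqlite3 shell, where `.tables` is a shortcut for roughly the same lookup.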
-
💭 nalgeon/redka: Redis re-implemented with SQLite
Here's my thought on 💭 nalgeon/redka: Redis re-implemented with SQLite Redka is a sick new redis-compatible api that uses sqlite as its backend datastore. It…
-
sqlite vacuum
Today I learned how to VACUUM a sqlite database and cut its size in about half. It's a database that I have had running for quite a while and has some decent traffic on it. Why is it important to do a VACUUM? In short, it's because the database file gets fragmented as data is updated. On delete, rows are removed from the database and their pages are marked as available for reuse, but the space is not returned to the filesystem. To VACUUM a database, run the following sql command. You can do it right from
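As a rough illustration of the effect described above, here is a sketch that assumes Python's built-in sqlite3 module and a throwaway database file. The `events` table, the row counts, and the file path are all hypothetical; this is not the database from the post.

```python
import os
import sqlite3
import tempfile

# Throwaway database file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "example.db")
conn = sqlite3.connect(path)

# Fill a hypothetical table with enough rows to occupy many pages.
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [("x" * 1000,) for _ in range(5000)],
)
conn.commit()

# Deleting rows frees pages inside the file, but the file does not shrink.
conn.execute("DELETE FROM events WHERE id <= 2500")
conn.commit()
size_before = os.path.getsize(path)

# VACUUM rebuilds the database file and returns the free pages to the OS.
# It must run outside of an open transaction.
conn.execute("VACUUM")
size_after = os.path.getsize(path)
conn.close()

print(f"before: {size_before} bytes, after: {size_after} bytes")
```

How much space comes back depends on how many pages were free, so the "about half" from the post is specific to that database.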
-
💭 sql - SQLite: COUNT slow on big tables - Stack O...
Here's my thought on 💭 sql - SQLite: COUNT slow on big tables - Stack Overflow Another interesting option for slow count queries in sqlite. If you haven't…
-
💭 Optimizing SQLite for servers
Here's my thought on 💭 Optimizing SQLite for servers Very interesting article by Sylvain, suggested by Simon Willison. Definitely some things that I want to…
-
💭 learning strawberry
Here's my thought on 💭 learning strawberry !!! note This post is a thought . It's a short note that I make about someone else's content online. Learn more…
-
💭 searching my thoughts locally
Here's my thought on 💭 searching my thoughts locally First I need to fetch my thoughts from the api, and put it in a local sqlite database using . Now that…
-
💭 Creating One-To-Many Relationships in Flask-SQLA...
Here's my thought on 💭 Creating One-To-Many Relationships in Flask-SQLAlchemy - YouTube Great example from Anthony showing how easy it is to practice…
-
💭 python - Concepts of backref and back_populate i...
Here's my thought on 💭 python - Concepts of backref and back_populate in SQLalchemy? ... Today I came across some sqlalchemy models that created some…
-
💭 Read a Range of Data - LIMIT and OFFSET - SQLMod...
Here's my thought on 💭 Read a Range of Data - LIMIT and OFFSET - SQLModel Today I was running some sqlmodel queries through the sqlalchemy orm. Admittedly…
-
💭 Open source, not open contribution with Ben John...
Here's my thought on 💭 Open source, not open contribution with Ben Johnson (Changelog... Ben Johnson was on the Changelog a few years back covering his work…
-
💭 DjangoCon Europe 2023 | Use SQLite in production...
Here's my thought on 💭 DjangoCon Europe 2023 | Use SQLite in production - YouTube Very inspiring talk, TLDR, you probably don't need a database server.…
-
Set up minio bucket entrypoint
I recently set up minio object storage in my homelab for litestream sqlite backups. The litestream quickstart made it easy to get everything up and running on localhost, but I hit a wall when dns was involved to pull it from a different machine. Here is what I got to work. First I had to configure the Key ID and Secret Access Key generated in the minio ui. Then set the s3 signature_version to s3v4. Now when I have minio running on https://my-minio-endpoint.com I can use the aws cli to acces
-
💭 benbjohnson/litestream: Streaming replication fo...
Here's my thought on 💭 benbjohnson/litestream: Streaming replication for SQLite. install Install is fast using installer, no compilation, just copy the…
-
why-is-postgres-default
Serious question. No one ever got fired for choosing PostgreSQL. But why? It's the most loved db, right? Right? Maybe it's time to rethink it. Don't get me wrong, if I need a relational db as a service, PostgreSQL is going to be my first choice, but why do I need to run a separate application for it? Tutorials use sqlite. Why is that? Because there is nothing else to stand up. Nothing else to maintain. And you probably already have it installed on just about anything that has a battery. SQLite ru
-
💭 Why I Built Litestream - Litestream
Here's my thought on 💭 Why I Built Litestream - Litestream As applications scale to the edge, to put compute as close to the user as possible, database…
-
💭 I'm All-In on Server-Side SQLite · The Fly Blog
Here's my thought on 💭 I'm All-In on Server-Side SQLite · The Fly Blog SQLite is the next big database trend, with more horizontal scaling, close to user…
-
💭 LiteFS Cloud: Distributed SQLite with Managed Ba...
Here's my thought on 💭 LiteFS Cloud: Distributed SQLite with Managed Backups · The Fl... Fly.io's solution to sqlite managed backups. I definitely want to…
-
💭 SQLite FTS5 Extension
Here's my thought on 💭 SQLite FTS5 Extension sqlite has 3 different tokenizers, . These can be used with sqlite-utils. And with the python api. !!! note…
-
💭 sqlite_utils Python library - sqlite-utils
Here's my thought on 💭 sqlite_utils Python library - sqlite-utils sqlite-utils is primarily a cli tool for sqlite operations such as enabling full text…
-
💭 simonw/datasette-render-markdown: Datasette plug...
Here's my thought on 💭 simonw/datasette-render-markdown: Datasette plugin for renderi... datasette really does everything doesn't it! !!! note This post is…
-
💭 `ValueError: Constraint must have a name` in ale...
Here's my thought on 💭 `ValueError: Constraint must have a name` in alembic 1.10.0 · ... After a nasty time with alembic upgrades, thoughts is about to get a new users table. This may have come…
-
💭 Use Alembic Check to check for possible upgrades
Here's my thought on 💭 Use Alembic Check to check for possible upgrades Since using alembic I have been just running out a new revision checking its content…
-
💭 DuckDB vs. MotherDuck — should you switch to the...
Here's my thought on 💭 DuckDB vs. MotherDuck — should you switch to the cloud version... duckdb is a new in-process database that has been making its rounds…
-
💭 s3-tree · PyPI
Here's my thought on 💭 s3-tree · PyPI Super useful way to show a tree view of an s3 bucket's structure! !!! note This post is a thought . It's a short…
-
💭 python - SQLAlchemy ORDER BY DESCENDING? - Stack...
Here's my thought on 💭 python - SQLAlchemy ORDER BY DESCENDING? - Stack Overflow How to sort results from a sqlalchemy based orm. I needed this to enable…
-
💭 kndndrj/nvim-dbee: Interactive database client f...
Here's my thought on 💭 kndndrj/nvim-dbee: Interactive database client for neovim A neovim database client that I need to check out. !!! note This post is a…
-
💭 sqlite-utils now supports plugins
Here's my thought on 💭 sqlite-utils now supports plugins As the title states sqlite-utils now supports plugins. I dug in just a bit and Simon implemented…
-
💭 Column INSERT/UPDATE Defaults — SQLAlchemy 1.4 D...
Here's my thought on 💭 Column INSERT/UPDATE Defaults — SQLAlchemy 1.4 Documentation sqlalchemy server_defaults end up as defaults in the database when new…
-
💭 Harlequin SQL IDE - DuckDB
Here's my thought on 💭 Harlequin SQL IDE - DuckDB Harlequin is a pretty sweet example of what textual can be used to create. It's a terminal-based sql ide…
-
💭 Python API - DuckDB
Here's my thought on 💭 Python API - DuckDB To persist data in duckdb you need to first make a connection to a duckdb database. Then work off of the…
-
💭 SQL on Pandas - DuckDB
Here's my thought on 💭 SQL on Pandas - DuckDB duckdb can just query any pandas dataframe that is in memory. I tried running it against a list of objects and…
-
💭 Filter Data - WHERE - SQLModel
Here's my thought on 💭 Filter Data - WHERE - SQLModel When fetching pydantic models from the database with sqlmodel, and you cannot select your item by id,…
-
💭 Full-text search - Datasette documentation
Here's my thought on 💭 Full-text search - Datasette documentation Enable full-text search in sqlite using sqlite-utils. !!! note This post is a thought .…
-
💭 sqlite-utils command-line tool - sqlite-utils
Here's my thought on 💭 sqlite-utils command-line tool - sqlite-utils I want to like jq, but I think Simon is selling me on sqlite, maybe it's just me but this…
-
💭 sqlite-utils command-line tool - sqlite-utils
Here's my thought on 💭 sqlite-utils command-line tool - sqlite-utils insert a json array directly into sqlite with sqlite-utils. !!! note This post is…
-
JUT | Read Notebooks in the Terminal
Trying to read a .ipynb file without starting a jupyter server? jut has you covered. https://youtu.be/t8AvImnwor0 watch the video version of this post on YouTube install is packaged and available on pypi so installing is as easy as pip installing it. installing jut with pip ! This is my first time including snippets of the video in the article like this, let me know what you think! examples running jut examples what are all the commands available for jut? Take a look at the help of the cli
-
Minimal Kedro Pipeline
How small can a minimum kedro pipeline ready to package be? I made one within 4 files that you can pip install. It's only a total of 35 lines of python, 8 in and 27 in . 📝 Note this is only a composable pipeline, not a full project, it does not contain a catalog or runner. Minimal Kedro Pipeline I have everything for this post hosted in this github repo, you can fork it, clone it, or just follow along. Installation Caveats This repo represents the minimal amount of structure to build a ked
-
Kedro - My Data Is Not A Table
In python data science/engineering most of our data is in the form of some sort of table, typically a DataFrame from a library like pandas, spark, or dask. DataFrames are the heart of most pipelines These containers for data contain many convenient methods to manipulate table like data structures. Sometimes we leverage other data types, namely vanilla types like lists and dicts, or even numpy data types. [[ what-is-kedro ]] unfamiliar with kedro, check out this post Sometimes datasets are not t
-
Gracefully adopt kedro, the catalog
Why use kedro catalog? While using the catalog alone will not reap all of the benefits of the framework, it does get you and your project ready for the full framework eventually. For me the full benefit of the catalog comes when you combine it with the pipeline and don't even touch read/write steps at all. Taking a step into kedro by adopting the catalog first will give you a way to organize all of your data loads in one place, and stop manually writing read/write code, which can be different fo
-
How to find things in your kedro catalog
kedro 0.16.2 just dropped last week with a long-awaited feature... catalog search! I went as far as monkey patching this into each of my projects. I jump between a few really big projects that have tons of datasets. Being able to quickly search for what I need is so useful. The Catalog The kedro data catalog is a key component to the kedro framework. It handles all data loading and saving for you. It is configurable and hackable. Having all your data connections listed in one place
-
How Kedro handles your inputs
Passing inputs into kedro is a key concept. How it accepts a single catalog key as input is straightforward, but passing a list or dictionary of catalog entries can be a bit confusing. *args/**kwargs review Check out this post for a review of how these work in python. [[ python-args-kwargs ]] python args and kwargs article by @_waylonwalker All Kedro inputs are catalog Entries When kedro runs your pipeline it uses the catalog to imperatively load your data, mea
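To make the list and dictionary cases concrete, here is a plain-Python stand-in. It does not import kedro and is not kedro's implementation. The `catalog` dict, the `call_node` helper, and the dataset names are hypothetical. The sketch only shows the idea that a single key becomes one argument, a list is unpacked like `*args`, and a dictionary is unpacked like `**kwargs`.

```python
# Hypothetical catalog: dataset name -> already loaded data.
catalog = {
    "raw_sales": [1, 2, 3],
    "raw_returns": [1],
}


def call_node(func, inputs):
    """Load each catalog entry named in `inputs` and pass it to `func`."""
    if isinstance(inputs, str):
        # A single catalog key becomes a single positional argument.
        return func(catalog[inputs])
    if isinstance(inputs, (list, tuple)):
        # A list of keys is unpacked positionally, like *args.
        return func(*[catalog[key] for key in inputs])
    if isinstance(inputs, dict):
        # A dict maps argument name -> catalog key, like **kwargs.
        return func(**{arg: catalog[key] for arg, key in inputs.items()})
    raise TypeError(f"unsupported inputs type: {type(inputs).__name__}")


def total(sales):
    return sum(sales)


def net(sales, returns):
    return sum(sales) - sum(returns)


single = call_node(total, "raw_sales")
positional = call_node(net, ["raw_sales", "raw_returns"])
keyword = call_node(net, {"returns": "raw_returns", "sales": "raw_sales"})
print(single, positional, keyword)
```

With the dictionary form, the order of the entries no longer matters, because each dataset is matched to the function argument by name.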
-
Create Custom Kedro Dataset
Kedro provides an efficient way to build out data catalogs with their yaml api. It allows you to be very declarative about loading and saving your data. For the most part you just need to tell Kedro what connector to use and its filepath. When running Kedro takes care of all of the read/write, you just reference the catalog key. But what is happening behind the scenes Under the hood there is an that each connector inherits from. It sets up a lot of the behind the scenes structure for us so
-
What is YOUR Advice for New Data Scientists
Learn the business Learn Git Your code does not need to be amazing Keep Learning Learn Git You don't have to start out as a git wizard with the cleanest possible commit history. At first don't let yourself get too wrapped up in it, the most important part is that you make commits. You will find needs for more advanced stuff later. Get comfortable with this, then learn how to , , , etc... Your code does not need to be amazing Get the job done. Keep it in small bite size pieces. Make readable
-
Filtering Pandas
query Good for method chaining, i.e. adding more methods or filters without assigning a new variable. masking general purpose, this is probably the most common method you see in training/examples isin capable of including multiple strings to include contains Good For partial matches MASKS anything that we put inside of square brackets can be set as a variable then passed in. Operators & - and ~ - not | - or AVAILABLE and NAME AVAILABLE or NAME AVAILABLE and not NAME
-
Clean up Your Data Science with Named Tuples
If you are a regular listener of TalkPython or PythonBytes you have heard Michael Kennedy talk about Named Tuples many times, but what are they and how do they fit into my data science workflow? Example As you graduate your scripts into modules and libraries you might start to notice that you need to pass a lot of data around to all of the functions that you have created. For example if you are running some analysis utilizing , , and data. You may need to calculate total revenue, inventory
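A small sketch of the idea, using `collections.namedtuple` from the standard library. The `Data` container, its `sales` and `inventory` fields, and the sample rows are hypothetical stand-ins, since the dataset names are not shown in the excerpt above.

```python
from collections import namedtuple

# One named container instead of passing each dataset separately.
Data = namedtuple("Data", ["sales", "inventory"])


def total_revenue(data):
    """Sum units * price over the hypothetical sales rows."""
    return sum(row["units"] * row["price"] for row in data.sales)


def total_inventory(data):
    """Sum the on-hand quantity over the hypothetical inventory rows."""
    return sum(row["on_hand"] for row in data.inventory)


data = Data(
    sales=[{"units": 2, "price": 5.0}, {"units": 1, "price": 10.0}],
    inventory=[{"on_hand": 7}, {"on_hand": 3}],
)

revenue = total_revenue(data)
inventory = total_inventory(data)
print(revenue, inventory)
```

Every function takes the same single argument and reads the fields it needs by name, so adding another dataset later only changes the `Data` definition.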
-
Background Tasks in Python for Data Science
This post is intended as an extension/update from background tasks in python. I started using the week that Kenneth Reitz released it. It takes away so much boilerplate from running background tasks that I use it in more places than I probably should. After taking a look at that post today, I wanted to put a better data science example in here to help folks get started.