Find all Headings with BeautifulSoup
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Date: February 1, 2022

BeautifulSoup is a DOM-like library for Python. It’s quite useful for manipulating HTML. Here is an example that uses find_all to find every HTML heading. I stole the regex from Stack Overflow, but who doesn’t.

Make an example
───────────────

sample.html

Let’s make a sample.html file with the following contents. It mainly has some heading and paragraph tags; the headings are what I want to be able to find.

[code]

<h1>hello</h1>
<p>this is a paragraph</p>
<h2>second heading</h2>
<p>this is also a paragraph</p>
<h2>third heading</h2>
<p>this is the last paragraph</p>
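
To follow along without creating the file by hand, the sample can also be written from Python. This is only a sketch: the heading levels (one h1 followed by two h2 tags) are an assumption, and SAMPLE_HTML is a name made up for this example.

```python
from pathlib import Path

# Assumed markup: the heading levels (h1, h2, h2) are a guess for illustration.
SAMPLE_HTML = """\
<h1>hello</h1>
<p>this is a paragraph</p>
<h2>second heading</h2>
<p>this is also a paragraph</p>
<h2>third heading</h2>
<p>this is the last paragraph</p>
"""

# Write the file into the current working directory,
# which is where Path('sample.html') will look for it later.
Path("sample.html").write_text(SAMPLE_HTML)
```

Run it from the same directory you plan to read sample.html from.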

Get the headings with BeautifulSoup
───────────────────────────────────

Let’s import our packages, read in our sample.html using pathlib, and find all headings using BeautifulSoup.

[code]
import re
from pathlib import Path

from bs4 import BeautifulSoup

# Parse sample.html with the lxml parser (requires the lxml package).
soup = BeautifulSoup(Path('sample.html').read_text(), features="lxml")
# A compiled regex as the name filter matches only the tags h1 through h6.
headings = soup.find_all(re.compile("^h[1-6]$"))

And what we get is a list of bs4.element.Tag objects.

[code]
>>> print(headings)
[

<h1>hello</h1>,
<h2>second heading</h2>,
<h2>third heading</h2>

]

I recently added a heading_link plugin to markata. You might notice the 🔗 next to each heading on this page; those are powered by this exact technique.
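
As a rough sketch of how heading links like these could be built on top of find_all: this is not markata’s actual plugin code. The add_heading_links helper, the slug rule, and the choice of the built-in html.parser are all assumptions made for illustration.

```python
import re

from bs4 import BeautifulSoup

# Same heading filter as above: matches the tag names h1 through h6.
HEADING = re.compile("^h[1-6]$")


def add_heading_links(html: str) -> str:
    """Hypothetical helper: give each heading an id and append a link to it."""
    # html.parser ships with Python, so this sketch does not need lxml.
    soup = BeautifulSoup(html, "html.parser")
    for heading in soup.find_all(HEADING):
        # Assumed slug rule: lowercase the text and turn every run of
        # non-alphanumeric characters into a single "-".
        slug = re.sub(r"[^a-z0-9]+", "-", heading.get_text().lower()).strip("-")
        heading["id"] = slug
        # Build <a href="#slug">🔗</a> and place it at the end of the heading.
        link = soup.new_tag("a", href=f"#{slug}")
        link.string = "🔗"
        heading.append(link)
    return str(soup)


print(add_heading_links("<h1>hello</h1><p>text</p><h2>second heading</h2>"))
```

Each heading ends up with an id derived from its text and a trailing anchor that points at that id, while the paragraphs are left untouched.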