There are several ways to find HTML elements by class in Python, and the method you choose depends on the library you are using. Two of the most popular libraries that support selecting HTML elements by class are BeautifulSoup and Selenium.
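You can use the find or find_all methods of BeautifulSoup and pass in a class_ argument to match elements with a particular class (the trailing underscore is needed because class is a reserved word in Python). find returns the first matching element, while find_all returns every match. Here is what that looks like: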
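import requests
from bs4 import BeautifulSoup

# Download the page's HTML
html = requests.get("https://scrapingbee.com").text
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, "html.parser")
# Find the first h1 element that has the class "mb-33"
print(soup.find("h1", class_="mb-33"))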
# Output: <h1 class="mb-33">Tired of getting blocked while scraping the web?</h1>
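To make the difference between find and find_all easier to see without depending on a live page, here is a minimal sketch that runs against an inline HTML string. The markup and the class names note and warning are made up for this example and are not taken from the ScrapingBee site. It also shows select, BeautifulSoup's CSS selector method, as one way to require several classes at once.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example
html = """
<div>
  <p class="note">First</p>
  <p class="note warning">Second</p>
  <p class="warning">Third</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find returns only the first match (or None if nothing matches)
first_note = soup.find("p", class_="note")

# find_all returns every match; class_ matches any one of an element's classes
notes = [p.get_text() for p in soup.find_all("p", class_="note")]

# A CSS selector can require several classes on the same element
both = [p.get_text() for p in soup.select("p.note.warning")]

print(first_note.get_text())  # First
print(notes)                  # ['First', 'Second']
print(both)                   # ['Second']
```

Note that the second paragraph matches class_="note" even though it carries two classes, because BeautifulSoup compares the value against each class on the element individually.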
Alternatively, you can use XPath selectors in Selenium to do the same thing. Here is some sample code:
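from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

DRIVER_PATH = '/path/to/chromedriver'
# Selenium 4 takes the driver path through a Service object
# (the executable_path argument was deprecated and later removed)
driver = webdriver.Chrome(service=Service(DRIVER_PATH))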
# Open ScrapingBee's website
driver.get("https://www.scrapingbee.com")
# Get the first h1 element using find_element
h1 = driver.find_element(By.XPATH, "//h1[contains(@class, 'mb-33')]")
print(h1.text)
# Output: 'Tired of getting blocked while scraping the web?'
# Close the browser when done
driver.quit()
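One caveat with contains(@class, 'mb-33') is that it is a substring test, so it would also match an element whose class is, say, mb-333. A common XPath 1.0 idiom pads the class attribute with spaces and looks for the class name as a whole token. The sketch below builds such an expression; the helper name xpath_for_class is hypothetical and not part of Selenium, and it assumes the class name contains no single quotes.

```python
def xpath_for_class(tag: str, class_name: str) -> str:
    """Build an XPath matching `tag` elements that have `class_name`
    as a whole class token, not merely as a substring.

    Hypothetical helper for illustration; assumes class_name has no quotes.
    """
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


expression = xpath_for_class("h1", "mb-33")
print(expression)
# //h1[contains(concat(' ', normalize-space(@class), ' '), ' mb-33 ')]
```

The resulting string can be passed to driver.find_element(By.XPATH, expression) in place of the simpler expression above. If you only need to match on a single class and do not need the rest of XPath, Selenium's By.CLASS_NAME locator is a shorter option.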