You can use Selenium to scrape data from specific elements of a web page. Let's reuse the example from our previous post: How to web scrape with python selenium?
There, we used Python code like this (with Selenium) to wait for the content to load by adding a fixed delay:
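from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
import time
options = Options()
options.add_argument("--headless") # Run Chrome without opening a window
# Selenium 4 takes the driver path through a Service object instead of executable_path
driver = webdriver.Chrome(options=options, service=Service(executable_path="PATH_TO_CHROMEDRIVER")) # Setting up the Chrome driver
driver.get("https://demo.scrapingbee.com/content_loads_after_5s.html")
time.sleep(6) # Sleep for 6 seconds so the delayed content has time to load
print(driver.page_source) # Print the full HTML of the rendered page
driver.quit() # Close the browser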
This gave us the following result:
<!DOCTYPE html>
<html>
...
<div id="content">This is content</div>
...
</html>
Now we can improve the code to extract only the content itself instead of printing the page's whole HTML. The page still loads in full; we simply locate the element by its id and read its text. To do that, we can run this code:
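from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
options = Options()
options.add_argument("--headless") # Run Chrome without opening a window
# Selenium 4 takes the driver path through a Service object instead of executable_path
driver = webdriver.Chrome(options=options, service=Service(executable_path="PATH_TO_CHROMEDRIVER")) # Setting up the Chrome driver
driver.get("https://demo.scrapingbee.com/content_loads_after_5s.html")
time.sleep(6) # Sleep for 6 seconds so the delayed content has time to load
element = driver.find_element(By.ID, 'content') # Locate the <div id="content"> element
print(element.text) # Print only the element's visible text
driver.quit() # Close the browser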
This time the output is only the element's text rather than the page's HTML code:
This is content
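A fixed `time.sleep(6)` always waits the full six seconds, and it fails if the content takes longer than that. As a minimal sketch of an alternative, the helper below polls for the element until it has text or a timeout expires. The function name `wait_for_text` and its parameters are hypothetical, chosen for illustration. The helper only assumes that `driver` exposes Selenium's `find_element(by, value)` method.

```python
import time


def wait_for_text(driver, element_id, timeout=10.0, poll_interval=0.5):
    """Poll the page until the element with `element_id` has text, then return it.

    `driver` is any object with Selenium's `find_element(by, value)` method.
    Raises TimeoutError if no text shows up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # "id" is the string value behind Selenium's By.ID locator constant.
            text = driver.find_element("id", element_id).text
        except Exception:
            # Broad on purpose for this sketch: with a real driver, a missing
            # element raises Selenium's NoSuchElementException.
            text = ""
        if text:
            return text
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no text in #{element_id} after {timeout} seconds")
        time.sleep(poll_interval)
```

With the driver from the snippet above, `print(wait_for_text(driver, "content"))` could stand in for the `time.sleep(6)` and `find_element` lines. Selenium also ships its own explicit-wait helper, `WebDriverWait` in `selenium.webdriver.support.ui`, which is likely the better choice outside of a quick sketch like this one.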
For more information about Python & Selenium, check out this in-depth blog article: Web Scraping using Selenium and Python