You can extract text from an HTML page with any of the popular HTML parsing libraries in Python. Here is an example that uses BeautifulSoup's get_text() method:
from bs4 import BeautifulSoup
soup = BeautifulSoup("""
<body>
<h1 class="product">Product Details</h1>
<div class="details">
<div>Remaining Stock</div>
<div>5</div>
</div>
</body>
""", "html.parser")
body = soup.find('body')
body_text = body.get_text()
print(body_text)
It will produce the following output (blank lines omitted):
Product Details
Remaining Stock
5
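The raw get_text() result also carries the newlines that sit between the tags in the source HTML. As a minimal sketch, the separator and strip arguments of get_text() can be used to get one clean line per text fragment. The html and clean_text names are only illustrative, and the built-in "html.parser" backend is an assumption made here so that no extra parser needs to be installed:

```python
from bs4 import BeautifulSoup

html = """
<body>
<h1 class="product">Product Details</h1>
<div class="details">
<div>Remaining Stock</div>
<div>5</div>
</div>
</body>
"""

# Name the parser explicitly so BeautifulSoup does not have to guess one.
soup = BeautifulSoup(html, "html.parser")

# strip=True trims every text fragment and drops the whitespace-only ones;
# separator is the string placed between the fragments that remain.
clean_text = soup.body.get_text(separator="\n", strip=True)
print(clean_text)
```

This prints the three text fragments on three consecutive lines, without the leading, trailing, and doubled newlines of the plain get_text() call.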
Selenium also offers something similar. You can use the .text property of a WebElement to extract the visible text from it.