MissingSchema occurs when you don't provide the complete URL to requests. This often means you skipped http:// or https:// and/or provided a relative URL.
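As a rough illustration of which inputs are likely to trigger the error, the sketch below uses only the standard library to check whether a string carries an http or https scheme. The helper name has_http_scheme and the sample URLs are assumptions made for this example; they are not part of requests.

```python
from urllib.parse import urlparse


def has_http_scheme(url: str) -> bool:
    """Return True if the URL starts with an http:// or https:// scheme.

    Hypothetical helper for illustration only; requests performs its own
    validation and raises MissingSchema when the scheme is absent.
    """
    return urlparse(url).scheme in ("http", "https")


# Sample inputs (assumed for this example): one complete URL,
# one with the scheme skipped, and one relative path.
candidates = [
    "https://scrapingbee.com",
    "scrapingbee.com/blog",
    "/path/to/resource",
]

for candidate in candidates:
    print(candidate, has_http_scheme(candidate))
```

Only the first sample passes the check; the other two are the kind of input that would need to be completed before being passed to requests.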
You can fix this error by using the urljoin function from the urllib.parse module to build an absolute URL before making a remote request. The solution will look something like this:
from urllib.parse import urljoin
import requests
url = "https://scrapingbee.com"
relative_url = "/path/to/resource"
final_url = urljoin(url, relative_url)
response = requests.get(final_url)  # returns a Response object; the HTML is in response.text
urljoin will merge two URLs only if the second argument is a relative path. If the second argument is itself an absolute URL, it is returned unchanged. For example, the following sample code will print https://scrapingbee.com:
from urllib.parse import urljoin
first_url = "https://google.com"
second_url = "https://scrapingbee.com"
final_url = urljoin(first_url, second_url)
print(final_url)
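The way urljoin resolves a relative path also depends on trailing and leading slashes, which can be surprising when scraped links are joined to a page URL. The following is a small sketch of that behavior; the /blog/ base and the post path are made-up values used only for illustration.

```python
from urllib.parse import urljoin

# Assumed example base URL; note the trailing slash.
base = "https://scrapingbee.com/blog/"

# A plain relative path is appended to the base "directory".
joined_relative = urljoin(base, "post")

# A path with a leading slash replaces the whole path of the base.
joined_rooted = urljoin(base, "/post")

# Without a trailing slash on the base, the last segment is replaced.
joined_no_slash = urljoin("https://scrapingbee.com/blog", "post")

# An absolute second argument wins outright.
joined_absolute = urljoin(base, "https://google.com")

print(joined_relative)  # https://scrapingbee.com/blog/post
print(joined_rooted)    # https://scrapingbee.com/post
print(joined_no_slash)  # https://scrapingbee.com/post
print(joined_absolute)  # https://google.com
```

In practice, passing the URL of the page the link was found on as the first argument tends to give the same result a browser would resolve.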