If you've ever needed to grab files in bulk, you know the pain of clicking download links one by one. That's where combining Python and wget shines. Instead of re-implementing HTTP requests yourself, you can call the battle-tested wget tool straight from a Python script and let it handle the heavy lifting.
In this guide, we'll set up wget, explain how to run it from Python using subprocess, and walk through three copy-paste commands that cover almost everything you'll ever need: downloading a file, saving it with a custom name or folder, and resuming interrupted transfers. Let's get started!

Quick answer (TL;DR)
If you just need the basics, here's the fast track.
1. Install wget
# Ubuntu/Debian
sudo apt-get update && sudo apt-get install -y wget
# macOS
brew install wget
# Windows (choco)
choco install wget
2. Write a Python script to use wget with subprocess
import subprocess, os
# URL to download
url = "https://getsamplefiles.com/download/zip/sample-1.zip"
# Here are a few common approaches:
## 1. Basic download
subprocess.run(["wget", url])
## 2. Custom filename + folder
os.makedirs("downloads", exist_ok=True)
subprocess.run(["wget", "-O", "downloads/myfile.zip", url])
## 3. Resume interrupted download
subprocess.run(["wget", "--continue", url])
That's it:
- Grab the file with the default name.
- Save it as downloads/myfile.zip.
- Resume if the file is already partially downloaded.
Why use wget?
wget is a cross-platform command-line tool for downloading files over HTTP, HTTPS, and FTP. It's been around for decades, and developers still reach for it because it's reliable, flexible, and dead simple to use.
Two features make it stand out:
- Robustness — if your connection drops, wget can pick up right where it left off instead of starting over from zero.
- Recursive downloads — with the right flags, wget can follow links on a page and slurp down entire directory trees, not just single files.
That combo makes it perfect for scripting and automation, especially when you pair it with Python.
Installing and setting up wget for Python scripts
Before we start automating downloads, there are two pieces you need in place:
- Python — where we'll write the automation logic.
- wget — the star of the show, a command-line tool for fetching files over HTTP, HTTPS, and FTP.
Since wget is an external program and not a Python package, we'll call it from Python using the built-in subprocess module. Think of it like typing the same command in your terminal, just wrapped in Python code. This keeps things simple: no need to reinvent downloading when wget already does the heavy lifting.
Setting up Python 3
First, check you're on Python 3.10 or later:
python --version
pip --version
(Some systems use python3 and pip3 instead.)
To avoid version headaches, I really recommend managing projects with uv or a similar solution like Poetry. To prepare a new uv project, simply run:
uv init python-wget-demo
cd python-wget-demo
uv run python --version
Create a main.py file in your project:
def main():
    # our code goes here
    pass

if __name__ == "__main__":
    main()
You can run this script easily with or without uv:
python main.py
uv run python main.py
Nice, let's move on!
Python subprocess module
Python comes with the subprocess module by default. It's the standard way to execute external commands from your code. For example:
import subprocess

def main():
    subprocess.run(["ping", "127.0.0.1"])

# For brevity I'll skip this boilerplate in later examples
if __name__ == "__main__":
    main()
Running this script gives you something like:
Pinging 127.0.0.1 with 32 bytes of data:
Reply from 127.0.0.1: bytes=32 time<1ms TTL=64
Well, it's definitely not the most useful program in the world, as it simply pings your own computer (on Windows, ping stops after four replies; on Linux it runs until you press Ctrl+C). But at least it shows that Python can successfully launch system commands through subprocess.
⚠️ Quick tip from someone who's been around IT for a while (yep, that's me). Be careful with scripts that call subprocess, especially if they seem obscure (regardless of the language or technology). It's a direct bridge to your shell, so if someone hides a malicious command inside, your computer will happily execute it. A friend of mine once ran a Node.js app he found online, and it quietly executed commands that scraped his browser data and crypto wallet files. No warnings, no popups — and his money was gone.
So the rule of thumb: never run unknown code that uses something similar to subprocess.run() unless you trust the source and understand what the command is doing. Other languages have the same sharp edge; in Node.js, for example, it's child_process.exec(). Okay, paranoid mode off.
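While we're on the topic of subprocess safety, one sharp edge in your own scripts is shell=True combined with untrusted input. Passing arguments as a list (as we do throughout this guide) sidesteps shell injection entirely. A quick illustrative sketch, with a hypothetical user-supplied filename:
import subprocess

filename = "file.zip; rm -rf ~"  # imagine this arrived from user input
# BAD: the whole string is handed to the shell, so the injected command would run:
# subprocess.run(f"wget -O {filename} https://example.com/file.zip", shell=True)
# GOOD: each list element is passed literally as one argument, no shell involved:
subprocess.run(["wget", "-O", filename, "https://example.com/file.zip"])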
Installing wget on Linux, macOS, and Windows
Next, install wget itself. Use the command for your OS:
Ubuntu / Debian:
sudo apt-get update && sudo apt-get install -y wget
Fedora:
sudo dnf install -y wget
macOS (Homebrew):
brew install wget
Windows (choco):
choco install wget
(wget should be available in other package managers for Windows as well.)
Verifying wget installation
Confirm that wget is installed and available on your PATH:
wget --version
Sample output might look like:
GNU Wget 1.21.4 built on linux-gnu.
On Unix-like systems, check the exit code:
echo $?
# 0
Exit code 0 means the command succeeded. If you see a different number, it usually indicates an error in installation or PATH setup.
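If you want your Python script to perform the same check before shelling out, the standard library's shutil.which() tells you whether wget is on the PATH:
import shutil

# shutil.which returns the full path to the binary, or None if it's not on PATH
if shutil.which("wget") is None:
    raise SystemExit("wget not found; install it first (see the commands above)")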
💡 Tip: New to scraping in general? Check out Python Web Scraping: Full Tutorial With Examples (2025). It covers the fundamentals before you dive into automating downloads with wget.
3 simple Python wget commands explained
With everything installed, let's get wget to work. (Okay, that was a weird pun.) Anyway, in practice most automation scripts only need a handful of patterns, and you can cover 90% of real-world cases with three simple commands:
- Download a file using subprocess.run(["wget", url])
- Save with a custom filename using the -O flag, or choose a folder using -P
- Resume downloads with -c
We'll go through them one by one with copy-paste Python snippets you can drop straight into your project. After that, we'll look at a few extra wget tricks so you've got more than just the basics covered.
1. Download a file using subprocess.run
The simplest way to use wget is to grab a single file. If you were doing this directly in your terminal, the command would look like:
wget http://example.com/archive.zip
It fetches a ZIP archive and saves it to the current folder.
To run the same command from Python, wrap it with subprocess.run():
import subprocess

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    result = subprocess.run(["wget", url])
    if result.returncode == 0:
        print("Download completed successfully.")
    else:
        print(f"Download failed with exit code {result.returncode}")
A few important things to note here:
Stdout and stderr
Every program you run on your computer can print text to two different "channels":
- stdout ("standard output") is for normal messages, like progress bars or results.
- stderr ("standard error") is for error messages, like "file not found" or "connection failed."
Think of it like the Witcher's two swords: one for monsters, one for humans. Both are important, just for different kinds of battles. So, when you run wget in a terminal, both channels are automatically displayed on screen. That's why you see progress updates or error messages right away, without needing extra Python code to capture them.
If you do want to capture these messages inside Python (for logging or debugging), you can add parameters like capture_output=True or stderr=subprocess.PIPE. But for now, letting them print directly is simplest.
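For example, here's a minimal sketch of capturing that output for logging. One detail worth knowing: wget prints its progress and status messages to stderr, not stdout, so that's where the interesting text ends up:
import subprocess

url = "https://getsamplefiles.com/download/zip/sample-1.zip"
# capture_output=True collects both channels instead of printing them;
# text=True decodes the bytes into str for us
result = subprocess.run(["wget", url], capture_output=True, text=True)
print(result.stderr)  # wget's progress and status messages live on stderr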
Exit code
Every time a program finishes running, it gives the operating system a small number called an exit code. In fact, we've already seen it a few moments ago when installing wget. By convention:
- 0 means success (everything worked as expected).
- Any other number means some kind of failure.
For example, 4 might mean a network failure, 8 could mean a server error, etc.
In Python, you get this number via result.returncode. Checking it lets your script decide what to do if a download fails, instead of silently moving on.
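As a sketch, you could map a few of the exit codes documented in the wget manual to friendlier messages (treat the exact meanings as something to verify against your wget version):
import subprocess

# A few exit codes from the wget manual (verify against your version)
WGET_EXIT_CODES = {
    0: "success",
    3: "file I/O error",
    4: "network failure",
    5: "SSL verification failure",
    8: "server issued an error response",
}

result = subprocess.run(["wget", "https://getsamplefiles.com/download/zip/sample-1.zip"])
meaning = WGET_EXIT_CODES.get(result.returncode, "unknown error")
print(f"wget exited with {result.returncode}: {meaning}")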
Large files
One of the reasons wget is still popular: it streams data straight to disk as it downloads. That means if you grab a 2GB file, it doesn't try to hold the entire thing in RAM. It writes chunks to disk as they arrive.
In Python's requests or urllib libraries, you'd have to handle this logic yourself, but with wget it's built in. So whether it's a small test ZIP or a huge dataset, the command works the same way.
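For comparison, here's roughly what streaming-to-disk looks like when you do it yourself with the requests library (a sketch; assumes the requests package is installed). All of this bookkeeping is what wget gives you for free:
import requests

url = "https://getsamplefiles.com/download/zip/sample-1.zip"
with requests.get(url, stream=True, timeout=30) as response:
    response.raise_for_status()
    with open("sample-1.zip", "wb") as f:
        # stream=True + iter_content writes chunks as they arrive,
        # so the whole file never sits in RAM
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)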
Here's the sample output for our small program:
--2025-09-23 16:53:24-- https://getsamplefiles.com/download/zip/sample-1.zip
Resolving getsamplefiles.com (getsamplefiles.com)... 104.21.9.51, 172.67.141.191
Connecting to getsamplefiles.com (getsamplefiles.com)|104.21.9.51|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1099121 (1.0M) [application/zip]
Saving to: ‘sample-1.zip’
sample-1.zip 100%[=================================================>] 1.05M 6.64MB/s in 0.2s
2025-09-23 16:53:25 (6.64 MB/s) - ‘sample-1.zip’ saved [1099121/1099121]
Download completed successfully.
That's all it takes to get a basic Python wget download running in your script. Now let's step it up and see how to control the filename rather than relying on whatever the server decides to use.
2. Save with a custom filename using -O or choose a folder with -P
By default, wget saves the file with whatever name the server provides and drops it into the current directory. That's fine for quick tests, but in real scripts you'll usually want more control over where the file goes and what it's called.
There are two different flags for that, but keep in mind they don't combine:
- -O <file>: forces a specific filename.
- -P <dir>: chooses the folder but keeps the server's filename.
Let's see them in action.
⚠️ Note: If you try to use -O and -P together, wget ignores -P. If you really want both a custom name and folder, just give -O a full path like downloads/dataset.zip.
Using -O for a custom name
On the command line:
wget -O dataset.zip https://getsamplefiles.com/download/zip/sample-1.zip
In Python:
import subprocess

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    custom_name = "dataset.zip"
    result = subprocess.run(["wget", "-O", custom_name, url])
    if result.returncode == 0:
        print(f"Saved file as {custom_name}")
    else:
        print("Download failed")
The idea is the same as in the script above; we've just added a new command-line parameter. Here wget will always save the file with that exact name in the current working directory.
Note on the overwrite behavior:
If a file with the same name already exists, wget will overwrite it without warning. To avoid silent overwrites, you can append a timestamp or suffix in Python:
import subprocess, time

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    timestamp = int(time.time())
    custom_name = f"dataset_{timestamp}.zip"
    subprocess.run(["wget", "-O", custom_name, url])
    print(f"Saved as {custom_name}")
It's a pretty common trick. Another option is to check with os.path.exists() before downloading, as in the sketch below.
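Here's what that alternative might look like: a minimal sketch that simply skips the download when the file is already on disk:
import subprocess, os

url = "https://getsamplefiles.com/download/zip/sample-1.zip"
custom_name = "dataset.zip"
if os.path.exists(custom_name):
    print(f"{custom_name} already exists, skipping download.")
else:
    subprocess.run(["wget", "-O", custom_name, url])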
Using -P for a custom folder
On the command line:
wget -P downloads https://getsamplefiles.com/download/zip/sample-1.zip
This saves the file into a downloads/ folder, but keeps the original filename (sample-1.zip).
In Python:
import subprocess, os

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    directory = "downloads"
    os.makedirs(directory, exist_ok=True)
    result = subprocess.run(["wget", "-P", directory, url])
    if result.returncode == 0:
        print(f"Saved into {directory}/")
    else:
        print("Download failed")
Why os.makedirs?
Strictly speaking, you might not always need that line. Some builds of wget (especially on Windows) will create the folder for you. But on some OSes, wget -P <dir> might fail if the directory doesn't already exist.
So, calling os.makedirs(directory, exist_ok=True) might feel like overkill, but it keeps the script portable and a bit safer.
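If you prefer pathlib, the equivalent one-liner reads nicely too:
from pathlib import Path

# parents=True creates intermediate folders; exist_ok=True avoids an error if it's already there
Path("downloads").mkdir(parents=True, exist_ok=True)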
3. Resume interrupted downloads with -c / --continue
Large downloads are fragile: a flaky Wi-Fi connection or a dropped VPN can kill them halfway. Instead of starting from scratch, wget can pick up where it left off using the -c (or --continue) flag.
# start download, then interrupt with Ctrl+C
wget https://getsamplefiles.com/download/zip/sample-5.zip
# resume where it left off
wget --continue https://getsamplefiles.com/download/zip/sample-5.zip
In the terminal you should see:
HTTP request sent, awaiting response... 206 Partial Content
Length: 8402383 (8.0M), 5982607 (5.7M) remaining [application/zip]
If a partial file already exists, wget checks its size and requests the missing bytes. This only works if the server advertises support for range requests (the Accept-Ranges: bytes header). If not, wget will re-download the whole file.
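If you'd like to check resume support up front, you can send a HEAD request and inspect the header yourself. A sketch using only the standard library:
import urllib.request

url = "https://getsamplefiles.com/download/zip/sample-5.zip"
request = urllib.request.Request(url, method="HEAD")
with urllib.request.urlopen(request) as response:
    # "bytes" means the server accepts range requests, so wget -c can resume
    print(response.headers.get("Accept-Ranges", "not advertised"))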
Now let's check the Python example:
import subprocess

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    result = subprocess.run(["wget", "--continue", url])
    if result.returncode == 0:
        print("Download completed (resumed if partial file existed).")
    else:
        print("Download failed or server does not support resume.")
Retrying automatically
For unreliable connections, you can wrap the wget --continue call in a small retry loop. Here's a simple version that tries up to 3 times with a pause in between:
import subprocess, time

def download_with_retry(url, retries=3, delay=5):
    for attempt in range(1, retries + 1):
        print(f"Attempt {attempt}...")
        result = subprocess.run(["wget", "--continue", url])
        if result.returncode == 0:
            print("Download completed.")
            return True
        else:
            print(f"Failed (exit {result.returncode}). Retrying in {delay}s...")
            time.sleep(delay)
    return False

if __name__ == "__main__":
    download_with_retry("https://getsamplefiles.com/download/zip/sample-1.zip")
This way, even if your network drops out, the script keeps retrying (up to the limit you set) until the download completes.
Boosting Python wget downloads with optional flags
The three core commands cover most cases, but wget also ships with a ton of extra options. You don't need all of them — just a handful that improve stability and speed in automation scripts.
Here are the ones worth knowing.
| Flag / Option | Purpose | Example |
| --- | --- | --- |
| --limit-rate=500k | Throttle download speed | wget --limit-rate=500k url |
| --tries=5 | Retry failed downloads | wget --tries=5 url |
| -q | Quiet mode (suppress output) | wget -q url |
| -b | Run in background | wget -b url |
| --no-check-certificate | Ignore SSL cert errors (use cautiously) | wget --no-check-certificate url |
| --timeout=10 | Set network timeout in seconds | wget --timeout=10 url |
Limit download speed with --limit-rate
Throttle bandwidth so downloads don't saturate your connection.
subprocess.run(["wget", "--limit-rate=500k", url])
Retry automatically with --tries
Have wget reattempt failed downloads without extra Python code.
subprocess.run(["wget", "--tries=5", url])
Quiet mode with -q
Suppress progress output if you want a clean script log.
subprocess.run(["wget", "-q", url])
Background mode with -b
The -b flag tells wget to detach and keep running in the background, just like if you had started it in a separate shell. Your Python script, however, still waits for the subprocess.run() call to finish. That means the Python process won't regain control until wget has been launched successfully and handed off to the background.
subprocess.run(["wget", "-b", url])
If your goal is to let Python move on without waiting, you'd need to use something like subprocess.Popen() instead of subprocess.run():
# Launch wget in the background and return immediately
process = subprocess.Popen(["wget", "-b", "https://getsamplefiles.com/download/zip/sample-1.zip"])
print("Python script keeps running while wget works in the background...")
Ignore SSL certificate errors with --no-check-certificate
Useful when pulling from servers with misconfigured HTTPS (though avoid unless you have to).
subprocess.run(["wget", "--no-check-certificate", url])
Set timeout with --timeout
Prevent downloads from hanging forever on bad connections.
subprocess.run(["wget", "--timeout=10", url])
Combine flags for real-world automation
Many wget flags can be mixed and matched:
subprocess.run([
    "wget",
    "--tries=3",
    "--timeout=10",
    "--limit-rate=1m",
    url
])
This balances stability (retries, timeouts) with resource control (rate limiting).
When wget hits its limits, try ScrapingBee
If you're doing a lot of automated downloads, you'll eventually run into roadblocks wget can't handle well on its own:
- Pages that need JavaScript rendering (dynamic content, load-on-scroll, buttons)
- Sites with aggressive anti-bot / rate limiting / geoblocking
- Needing rotating proxies so you don't get blocked just because you hit one IP too hard
- Complex workflows: extracting structured data, taking screenshots, combining pages
That's where ScrapingBee comes in. Instead of managing headless browsers, proxy pools, and retries yourself, you send an HTTP API request + settings, and it handles:
- JS rendering (so dynamic content appears as in a browser)
- Rotating & premium proxies to avoid bans/blocking
- Extraction rules (CSS/XPath), screenshots, even geotargeting or stealth proxy mode
If you want to see whether ScrapingBee could solve your more complex cases, check out our pricing page. We offer 1,000 free API credits just to try. No credit card needed!
Bonus: How do I download internets? Recursive downloads
One of wget's most powerful features is recursive retrieval. Instead of just grabbing a single file, wget can crawl through an HTML page, follow its links, and download everything it touches (images, CSS, scripts, or other pages) until it hits the depth limit you set.
Yes, that means you can literally try to download the internet to your PC. You might laugh, but there are folks out there who really do archive the web: we're talking exabytes of data on tape storage. Your laptop might not be up for the same job, but the feature is there if you want to mirror a site or keep an offline snapshot.
Key options for recursion
- -r or --recursive — enable recursive retrieval.
- -l <depth> or --level=<depth> — set the maximum recursion depth. The default is 5. Use 0 or inf for unlimited depth.
- -p or --page-requisites — download all resources needed to properly display a page (images, CSS, JavaScript).
- -k or --convert-links — rewrite the links in downloaded files so they work for local viewing.
Example: mirror a site locally
wget -r -l 3 -p -k https://www.scrapingbee.com/
This command:
- Downloads the main page at scrapingbee.com.
- Follows links up to depth 3.
- Pulls in page requisites like images, CSS, and JS.
- Converts links so you can open the saved pages in your browser offline.
Note: be careful with -r on large sites; you might accidentally try to mirror half the internet.
Now let's check the Python example:
import subprocess

def main():
    url = "https://www.scrapingbee.com/"
    subprocess.run(["wget", "-r", "-l", "3", "-p", "-k", url])
Recursive mode makes wget act more like a crawler. It's great for offline mirrors, documentation snapshots, or saving pages with all their assets. But if you need fine-grained control, like executing JavaScript or bypassing anti-bot systems, you'll quickly run into limitations. That's when services like ScrapingBee come into play.
Frequently asked questions
What is "Python wget" and why is it useful for file downloads?
It's shorthand for calling the wget command-line tool from Python using subprocess. You get wget's reliability (resuming, retries, streaming to disk) without re-implementing networking in Python.
import subprocess
subprocess.run(["wget", "https://example.com/file.zip"])
How do I install wget for use with Python scripts?
Install the OS binary; Python just calls it.
# Ubuntu/Debian
sudo apt-get update && sudo apt-get install -y wget
# Fedora
sudo dnf install -y wget
# macOS
brew install wget
# Windows (choco)
choco install wget
# Verify installation
wget --version
Can I customize the filename when downloading with Python wget?
Yes. Use -O <file> (overwrites if it exists).
import subprocess
subprocess.run(["wget", "-O", "dataset.zip",
                "https://getsamplefiles.com/download/zip/sample-1.zip"])
Want unique names? Add a timestamp:
import subprocess, time
name = f"dataset_{int(time.time())}.zip"
subprocess.run(["wget", "-O", name,
                "https://getsamplefiles.com/download/zip/sample-1.zip"])
Related: need to translate a curl command you already have? Try the Curl Converter.
How can I download files to a specific folder using Python wget?
Two options, so pick one.
Keep the server's filename, choose the folder with -P:
import subprocess, os
os.makedirs("downloads", exist_ok=True) # for the sake of completeness
subprocess.run(["wget", "-P", "downloads",
                "https://getsamplefiles.com/download/zip/sample-1.zip"])
Set both folder and custom name by giving a full path to -O:
import subprocess, os
os.makedirs("downloads", exist_ok=True)
subprocess.run(["wget", "-O", "downloads/myfile.zip",
                "https://getsamplefiles.com/download/zip/sample-1.zip"])
Note: -O overrides -P. If you want both, pass a full path to -O.
Is it possible to resume interrupted downloads with Python wget?
Yes. Use -c / --continue (works when the server supports range requests).
import subprocess
subprocess.run(["wget", "--continue",
                "https://getsamplefiles.com/download/zip/sample-1.zip"])
Combine it with --tries to let wget retry automatically if the connection drops.
import subprocess
url = "https://getsamplefiles.com/download/zip/sample-1.zip"
subprocess.run([
    "wget",
    "--continue",  # resume partial downloads
    "--tries=3",   # retry up to 3 times on failure
    url
])
Can I limit download speed with Python wget?
Yes. Use the --limit-rate flag to throttle bandwidth so downloads don't hog your connection.
import subprocess
url = "https://getsamplefiles.com/download/zip/sample-1.zip"
subprocess.run(["wget", "--limit-rate=500k", url])
In this example, 500k means about 500 KB/s. You can also use m for megabytes per second (e.g., --limit-rate=2m).
Can I download multiple files at once?
Yes. You've got two common approaches.
Use a file list: create a text file with one URL per line, then run:
import subprocess
subprocess.run(["wget", "-i", "urls.txt"])
Use a loop in Python: handy if you're generating URLs dynamically:
import subprocess
urls = [
    "https://getsamplefiles.com/download/zip/sample-1.zip",
    "https://getsamplefiles.com/download/zip/sample-2.zip"
]
for u in urls:
    subprocess.run(["wget", u])
Conclusion
Using Python with wget is one of the simplest ways to automate file downloads. With just a few lines of code you can fetch files, save them with custom names, organize them into folders, or even resume interrupted transfers. Add a couple of optional flags and you've got a reliable tool for everyday automation.
That said, wget is still just a downloader. If your project goes beyond fetching static files, you'll probably need different tools:
- For downloading files over more protocols (like SFTP or wider HTTP features), check out cURL with Python.
- To extract only certain parts of a page (e.g., product titles, prices) instead of saving the whole file, use Requests with Beautiful Soup.
- To interact with pages that require clicks, scrolling, or JavaScript rendering, Selenium is the go-to choice.
- And if you'd rather skip managing browsers, proxies, or bot detection entirely, consider a hosted service like ScrapingBee.
Thank you for staying with me, and until next time.