
Python wget: Automate file downloads with 3 simple commands

24 September 2025 | 17 min read

If you've ever needed to grab files in bulk, you know the pain of clicking download links one by one. That's where combining Python and wget shines. Instead of re-implementing HTTP requests yourself, you can call the battle-tested wget tool straight from a Python script and let it handle the heavy lifting.

In this guide, we'll set up wget, explain how to run it from Python using subprocess, and walk through three copy-paste commands that cover almost everything you'll ever need: downloading a file, saving it with a custom name or folder, and resuming interrupted transfers. Let's get started!


Quick answer (TL;DR)

If you just need the basics, here's the fast track.

1. Install wget

# Ubuntu/Debian
sudo apt-get update && sudo apt-get install -y wget

# macOS
brew install wget

# Windows (choco)
choco install wget

2. Write a Python script to use wget with subprocess

import subprocess, os

# URL to download
url = "https://getsamplefiles.com/download/zip/sample-1.zip"

# Here are a few common approaches:

## 1. Basic download
subprocess.run(["wget", url])

## 2. Custom filename + folder
os.makedirs("downloads", exist_ok=True)
subprocess.run(["wget", "-O", "downloads/myfile.zip", url])

## 3. Resume interrupted download
subprocess.run(["wget", "--continue", url])

That's it:

  • Grab the file with the default name.
  • Save it as downloads/myfile.zip.
  • Resume if the file is already partially downloaded.

Why use wget?

wget is a cross-platform command-line tool for downloading files over HTTP, HTTPS, and FTP. It's been around for decades, and developers still reach for it because it's reliable, flexible, and dead simple to use.

Two features make it stand out:

  • Robustness — if your connection drops, wget can pick up right where it left off instead of starting over from zero.
  • Recursive downloads — with the right flags, wget can follow links on a page and slurp down entire directory trees, not just single files.

That combo makes it perfect for scripting and automation, especially when you pair it with Python.

Installing and setting up wget for Python scripts

Before we start automating downloads, there are two pieces you need in place:

  • Python — where we'll write the automation logic.
  • wget — the star of the show, a command-line tool for fetching files over HTTP, HTTPS, and FTP.

Since wget is an external program and not a Python package, we'll call it from Python using the built-in subprocess module. Think of it like typing the same command in your terminal, just wrapped in Python code. This keeps things simple: no need to reinvent downloading when wget already does the heavy lifting.

Setting up Python 3

First, check you're on Python 3.10 or later:

python --version
pip --version

(Some systems use python3 and pip3 instead.)

To avoid version headaches, I really recommend managing projects with uv or a similar solution like Poetry. To prepare a new uv project, simply run:

uv init python-wget-demo
cd python-wget-demo
uv run python --version

Create a main.py file in your project:

def main():
    pass  # our code goes here


if __name__ == "__main__":
    main()

You can run this script easily with or without uv:

python main.py

uv run python main.py

Nice, let's move on!

Python subprocess module

Python comes with the subprocess module by default. It's the standard way to execute external commands from your code. For example:

import subprocess

def main():
    subprocess.run(["ping", "127.0.0.1"])

# For brevity I'll skip this boilerplate in later examples
if __name__ == "__main__":
    main()

Running this script gives you something like:

Pinging 127.0.0.1 with 32 bytes of data:
Reply from 127.0.0.1: bytes=32 time<1ms TTL=64

Well, it's definitely not the most useful program in the world, since it simply pings your own computer. But at least it shows that Python can successfully launch system commands through subprocess.

⚠️ Quick tip from someone who's been around IT for a while (yep, that's me). Be careful with scripts that call subprocess, especially if they seem obscure (regardless of the language or technology). It's a direct bridge to your shell, so if someone hides a malicious command inside, your computer will happily execute it. A friend of mine once ran a Node.js app he found online, and it quietly executed commands that scraped his browser data and crypto wallet files. No warnings, no popups — and his money was gone.

So the rule of thumb: never run unknown code that uses something similar to subprocess.run() unless you trust the source and understand what the command is doing. Other languages have the same sharp edge; in Node.js, for example, it's child_process.exec(). Okay, paranoid mode off.
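While we're on the subject, one habit that limits the blast radius in your own scripts: pass the command as a list of arguments, never as a single string with shell=True. With a list, shell metacharacters in the input are not interpreted. A minimal sketch (the malicious URL is invented for illustration):

import subprocess

url = "https://example.com/file.zip; rm -rf ~"  # imagine this came from untrusted input

# Safe: the whole string is passed to wget as one argument,
# so "; rm -rf ~" is never executed (the download just fails)
subprocess.run(["wget", url])

# Dangerous: the string goes through the shell, and "; rm -rf ~"
# would run as a second command. Don't do this.
# subprocess.run(f"wget {url}", shell=True)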

Installing wget on Linux, macOS, and Windows

Next, install wget itself. Use the command for your OS:

Ubuntu / Debian:

sudo apt-get update && sudo apt-get install -y wget

Fedora:

sudo dnf install -y wget

macOS (Homebrew):

brew install wget

Windows (choco):

choco install wget

(wget is also available through other Windows package managers, such as Scoop or winget.)

Verifying wget installation

Confirm that wget is installed and available on your PATH:

wget --version

Sample output might look like:

GNU Wget 1.21.4 built on linux-gnu.

On Unix-like systems, check the exit code:

echo $?
# 0

Exit code 0 means the command succeeded. If you see a different number, it usually indicates an error in installation or PATH setup.

💡 Tip: New to scraping in general? Check out Python Web Scraping: Full Tutorial With Examples (2025). It covers the fundamentals before you dive into automating downloads with wget.

3 simple Python wget commands explained

With everything installed, let's get wget to work. Okay, that was a weird pun. Anyways. In practice, most automation scripts only need a handful of patterns, and you can cover 90% of real-world cases with three simple commands:

  • Download a file using subprocess.run(["wget", url])
  • Save with a custom filename using the -O flag or choose a folder using -P
  • Resume downloads with -c

We'll go through them one by one with copy-paste Python snippets you can drop straight into your project. After that, we'll look at a few extra wget tricks so you've got more than just the basics covered.

1. Download a file using subprocess.run

The simplest way to use wget is to grab a single file. If you were doing this directly in your terminal, the command would look like:

wget http://example.com/archive.zip

It fetches a ZIP archive and saves it to the current folder.

To run the same command from Python, wrap it with subprocess.run():

import subprocess

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    result = subprocess.run(["wget", url])

    if result.returncode == 0:
        print("Download completed successfully.")
    else:
        print(f"Download failed with exit code {result.returncode}")

A few important things to note here:

Stdout and stderr

Every program you run on your computer can print text to two different "channels":

  • stdout ("standard output") is for normal messages, like progress bars or results.
  • stderr ("standard error") is for error messages, like "file not found" or "connection failed."

Think of it like the Witcher's two swords: one for monsters, one for humans. Both are important, just for different kinds of battles. So, when you run wget in a terminal, both channels are automatically displayed on screen. That's why you see progress updates or error messages right away, without needing extra Python code to capture them.

If you do want to capture these messages inside Python (for logging or debugging), you can add parameters like capture_output=True or stderr=subprocess.PIPE. But for now, letting them print directly is simplest.
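For example, here's a minimal sketch that captures both channels with capture_output=True (note that wget writes its progress and status messages to stderr, not stdout):

import subprocess

url = "https://getsamplefiles.com/download/zip/sample-1.zip"
result = subprocess.run(["wget", url], capture_output=True, text=True)

print("exit code:", result.returncode)
print("wget said:", result.stderr[-200:])  # last 200 chars of wget's messages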

Exit code

Every time a program finishes running, it gives the operating system a small number called an exit code. In fact, we already saw one a few moments ago when verifying the wget installation. By convention:

  • 0 means success (everything worked as expected).
  • Any other number means some kind of failure.

For wget specifically, 4 means a network failure, 8 means the server issued an error response, and so on.

In Python, you get this number via result.returncode. Checking it lets your script decide what to do if a download fails, instead of silently moving on.
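If you want friendlier error messages, you can map the exit codes documented in the GNU wget manual to human-readable text. A small sketch:

import subprocess

# Exit codes as documented in the GNU wget manual
WGET_EXIT_CODES = {
    0: "Success",
    1: "Generic error",
    2: "Command-line parse error",
    3: "File I/O error",
    4: "Network failure",
    5: "SSL verification failure",
    6: "Authentication failure",
    7: "Protocol error",
    8: "Server issued an error response (e.g., 404)",
}

result = subprocess.run(["wget", "https://getsamplefiles.com/download/zip/sample-1.zip"])
print(WGET_EXIT_CODES.get(result.returncode, f"Unknown exit code {result.returncode}"))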

Large files

One of the reasons wget is still popular: it streams data straight to disk as it downloads. That means if you grab a 2GB file, it doesn't try to hold the entire thing in RAM. It writes chunks to disk as they arrive.

In Python's requests or urllib libraries, you'd have to handle this logic yourself, but with wget it's built in. So whether it's a small test ZIP or a huge dataset, the command works the same way.
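For comparison, here's roughly what that streaming logic looks like if you write it yourself with the third-party requests library. This is just a sketch to show what wget handles for you out of the box:

import requests

url = "https://getsamplefiles.com/download/zip/sample-1.zip"

# stream=True keeps the body out of memory; we copy it to disk in chunks
with requests.get(url, stream=True, timeout=30) as response:
    response.raise_for_status()
    with open("sample-1.zip", "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)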

Back to wget: here's the sample output of our small download program:

--2025-09-23 16:53:24--  https://getsamplefiles.com/download/zip/sample-1.zip
Resolving getsamplefiles.com (getsamplefiles.com)... 104.21.9.51, 172.67.141.191
Connecting to getsamplefiles.com (getsamplefiles.com)|104.21.9.51|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1099121 (1.0M) [application/zip]
Saving to: ‘sample-1.zip’

sample-1.zip                  100%[=================================================>]   1.05M  6.64MB/s    in 0.2s

2025-09-23 16:53:25 (6.64 MB/s) - ‘sample-1.zip’ saved [1099121/1099121]

Download completed successfully.

That's all it takes to get a basic Python wget download running in your script. Now let's step it up and see how to control the filename rather than relying on whatever the server decides to use.

2. Save with a custom filename using -O or choose folder with -P

By default, wget saves the file with whatever name the server provides and drops it into the current directory. That's fine for quick tests, but in real scripts you'll usually want more control over where the file goes and what it's called.

There are two different flags for that, but keep in mind they don't combine:

  • -O <file>: forces a specific filename.
  • -P <dir>: chooses the folder but keeps the server's filename.

Let's see them in action.

⚠️ Note: If you try to use -O and -P together, wget ignores -P. If you really want both custom name and folder, just give -O a full path like downloads/dataset.zip.

Using -O for a custom name

On the command line:

wget -O dataset.zip https://getsamplefiles.com/download/zip/sample-1.zip

In Python:

import subprocess

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    custom_name = "dataset.zip"

    result = subprocess.run(["wget", "-O", custom_name, url])

    if result.returncode == 0:
        print(f"Saved file as {custom_name}")
    else:
        print("Download failed")

The idea is the same as in the script above; we've just added a new command-line parameter. Here wget will always save the file with that exact name in the current working directory.

Note on the overwrite behavior:

If a file with the same name already exists, wget will overwrite it without warning. To avoid silent overwrites, you can append a timestamp or suffix in Python:

import subprocess, time

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    timestamp = int(time.time())
    custom_name = f"dataset_{timestamp}.zip"

    subprocess.run(["wget", "-O", custom_name, url])
    print(f"Saved as {custom_name}")

It's a pretty common trick. Another option is to check with os.path.exists() before downloading.
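
Here's a minimal sketch of that check, skipping the download instead of overwriting:

import subprocess, os

url = "https://getsamplefiles.com/download/zip/sample-1.zip"
custom_name = "dataset.zip"

if os.path.exists(custom_name):
    print(f"{custom_name} already exists, skipping download.")
else:
    subprocess.run(["wget", "-O", custom_name, url])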

Using -P for a custom folder

On the command line:

wget -P downloads https://getsamplefiles.com/download/zip/sample-1.zip

This saves the file into a downloads/ folder, but keeps the original filename (sample-1.zip).

In Python:

import subprocess, os

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    directory = "downloads"
    os.makedirs(directory, exist_ok=True)

    result = subprocess.run(["wget", "-P", directory, url])

    if result.returncode == 0:
        print(f"Saved into {directory}/")
    else:
        print("Download failed")

Why os.makedirs?

Strictly speaking, you might not always need that line. Some builds of wget (especially on Windows) will create the folder for you. But on some OSes, wget -P <dir> might fail if the directory doesn't already exist.

So, calling os.makedirs(directory, exist_ok=True) might feel like overkill, but it keeps the script portable and a bit safer.

3. Resume interrupted downloads with -c / --continue

Large downloads are fragile: a flaky Wi-Fi connection or a dropped VPN can kill them halfway. Instead of starting from scratch, wget can pick up where it left off using the -c (or --continue) flag.

# start download, then interrupt with Ctrl+C
wget https://getsamplefiles.com/download/zip/sample-5.zip

# resume where it left off
wget --continue https://getsamplefiles.com/download/zip/sample-5.zip

In the terminal you should see:

HTTP request sent, awaiting response... 206 Partial Content
Length: 8402383 (8.0M), 5982607 (5.7M) remaining [application/zip]

If a partial file already exists, wget checks its size and requests the missing bytes. This only works if the server advertises support for range requests (Accept-Ranges: bytes header). If not, wget will re-download the whole file.
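
If you want to know up front whether resuming will work, you can probe the server with a HEAD request from Python's standard library and check the Accept-Ranges header. A minimal sketch:

import urllib.request

url = "https://getsamplefiles.com/download/zip/sample-5.zip"

request = urllib.request.Request(url, method="HEAD")
with urllib.request.urlopen(request, timeout=10) as response:
    accept_ranges = response.headers.get("Accept-Ranges", "")

if accept_ranges == "bytes":
    print("Server supports range requests, so wget --continue can resume.")
else:
    print("No range support advertised; wget will start over from byte zero.")

(Some servers support ranges without advertising the header on HEAD requests, so treat this as a hint rather than a guarantee.)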

Now let's look at the Python version:

import subprocess

def main():
    url = "https://getsamplefiles.com/download/zip/sample-1.zip"
    result = subprocess.run(["wget", "--continue", url])

    if result.returncode == 0:
        print("Download completed (resumed if partial file existed).")
    else:
        print("Download failed or server does not support resume.")

Retrying automatically

For unreliable connections, you can wrap the wget --continue call in a small retry loop. Here's a simple version that tries up to 3 times with a pause in between:

import subprocess, time

def download_with_retry(url, retries=3, delay=5):
    for attempt in range(1, retries + 1):
        print(f"Attempt {attempt}...")
        result = subprocess.run(["wget", "--continue", url])
        if result.returncode == 0:
            print("Download completed.")
            return True
        else:
            print(f"Failed (exit {result.returncode}). Retrying in {delay}s...")
            time.sleep(delay)
    return False

if __name__ == "__main__":
    download_with_retry("https://getsamplefiles.com/download/zip/sample-1.zip")

This way, even if your network drops out, the script keeps retrying until the download is complete.

Boosting Python wget downloads with optional flags

The three core commands cover most cases, but wget also ships with a ton of extra options. You don't need all of them — just a handful that improve stability and speed in automation scripts.

Here are the ones worth knowing.

  • --limit-rate=500k — throttle download speed (wget --limit-rate=500k url)
  • --tries=5 — retry failed downloads (wget --tries=5 url)
  • -q — quiet mode, suppress output (wget -q url)
  • -b — run in background (wget -b url)
  • --no-check-certificate — ignore SSL certificate errors, use cautiously (wget --no-check-certificate url)
  • --timeout=10 — set the network timeout in seconds (wget --timeout=10 url)

Limit download speed with --limit-rate

Throttle bandwidth so downloads don't saturate your connection.

subprocess.run(["wget", "--limit-rate=500k", url])

Retry automatically with --tries

Have wget reattempt failed downloads without extra Python code.

subprocess.run(["wget", "--tries=5", url])

Quiet mode with -q

Suppress progress output if you want a clean script log.

subprocess.run(["wget", "-q", url])

Background mode with -b

The -b flag tells wget to detach and keep running in the background, just like if you had started it in a separate shell. The subprocess.run() call returns almost immediately, as soon as wget has successfully launched and detached; from that point on, wget writes its progress to a wget-log file in the current directory.

subprocess.run(["wget", "-b", url])

If you simply want Python itself to move on without blocking, use subprocess.Popen() instead of subprocess.run(). Popen doesn't wait at all, so the -b flag becomes unnecessary:

# Launch wget as a child process and return immediately
process = subprocess.Popen(["wget", "https://getsamplefiles.com/download/zip/sample-1.zip"])

print("Python script keeps running while wget works in the background...")

Ignore SSL certificate errors with --no-check-certificate

Useful when pulling from servers with misconfigured HTTPS (though avoid unless you have to).

subprocess.run(["wget", "--no-check-certificate", url])

Set timeout with --timeout

Prevent downloads from hanging forever on bad connections.

subprocess.run(["wget", "--timeout=10", url])

Combine flags for real-world automation

Many wget flags can be mixed and matched:

subprocess.run([
    "wget",
    "--tries=3",
    "--timeout=10",
    "--limit-rate=1m",
    url
])

This balances stability (retries, timeouts) with resource control (rate limiting).

When wget hits its limits, try ScrapingBee

If you're doing a lot of automated downloads, you'll eventually run into roadblocks wget can't handle well on its own:

  • Pages that need JavaScript rendering (dynamic content, load-on-scroll, buttons)
  • Sites with aggressive anti-bot / rate limiting / geoblocking
  • Needing rotating proxies so you don't get blocked just because you hit one IP too hard
  • Complex workflows: extracting structured data, taking screenshots, combining pages

That's where ScrapingBee comes in. Instead of managing headless browsers, proxy pools, and retries yourself, you send an HTTP API request + settings, and it handles:

  • JS rendering (so dynamic content appears as in a browser)
  • Rotating & premium proxies to avoid bans/blocking
  • Extraction rules (CSS/XPath), screenshots, even geotargeting or stealth proxy mode

If you want to see whether ScrapingBee could solve your more complex cases, check out our pricing page. We offer 1,000 free API credits just to try. No credit card needed!
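
To give you a taste, here's a minimal sketch of fetching a JavaScript-rendered page through the ScrapingBee HTML API with the requests library (substitute your own API key from the dashboard):

import requests

response = requests.get(
    "https://app.scrapingbee.com/api/v1/",
    params={
        "api_key": "YOUR_API_KEY",      # your key from the ScrapingBee dashboard
        "url": "https://example.com/",  # the page you actually want
        "render_js": "true",            # run JavaScript before returning the HTML
    },
    timeout=60,
)

print(response.status_code)
html = response.text  # fully rendered HTML, ready to parse or save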

Bonus: How do I download internets? Recursive downloads

One of wget's most powerful features is recursive retrieval. Instead of just grabbing a single file, wget can crawl through an HTML page, follow its links, and download everything it touches (images, CSS, scripts, or other pages) until it hits the depth limit you set.

Yes, that means you can literally try to download the internet to your PC. You might laugh, but there are folks out there who really do archive the web: we're talking petabytes of data, much of it on tape storage. Your laptop might not be up for the same job, but the feature is there if you want to mirror a site or keep an offline snapshot.

Key options for recursion

  • -r or --recursive — enable recursive retrieval.
  • -l <depth> or --level=<depth> — set the maximum recursion depth. Default is 5. Use 0 or inf for unlimited depth.
  • -p or --page-requisites — download all resources needed to properly display a page (images, CSS, JavaScript).
  • -k or --convert-links — rewrite the links in downloaded files so they work for local viewing.

Example: mirror a site locally

wget -r -l 3 -p -k https://www.scrapingbee.com/

This command:

  • Downloads the main page at scrapingbee.com.
  • Follows links up to depth 3.
  • Pulls in page requisites like images, CSS, and JS.
  • Converts links so you can open the saved pages in your browser offline.

Note: be careful with -r on large sites; you might accidentally try to mirror half the internet.

Now let's check the Python example:

import subprocess

def main():
    url = "https://www.scrapingbee.com/"
    subprocess.run(["wget", "-r", "-l", "3", "-p", "-k", url])

Recursive mode makes wget act more like a crawler. It's great for offline mirrors, documentation snapshots, or saving pages with all their assets. But if you need fine-grained control, like executing JavaScript or bypassing anti-bot systems, you'll quickly run into limitations. That's when services like ScrapingBee come into play.

Frequently asked questions

What is "Python wget" and why is it useful for file downloads?

It's shorthand for calling the wget command-line tool from Python using subprocess. You get wget's reliability (resuming, retries, streaming to disk) without re-implementing networking in Python.

import subprocess
subprocess.run(["wget", "https://example.com/file.zip"])

How do I install wget for use with Python scripts?

Install the OS binary; Python just calls it.

# Ubuntu/Debian
sudo apt-get update && sudo apt-get install -y wget

# Fedora
sudo dnf install -y wget

# macOS
brew install wget

# Windows (choco)
choco install wget

# Verify installation
wget --version

Can I customize the filename when downloading with Python wget?

Yes. Use -O <file> (overwrites if it exists).

import subprocess
subprocess.run(["wget", "-O", "dataset.zip",
                "https://getsamplefiles.com/download/zip/sample-1.zip"])

Want unique names? Add a timestamp:

import subprocess, time
name = f"dataset_{int(time.time())}.zip"
subprocess.run(["wget", "-O", name,
                "https://getsamplefiles.com/download/zip/sample-1.zip"])

Related: need to translate a curl command you already have? Try the Curl Converter.

How can I download files to a specific folder using Python wget?

Two options, so pick one.

Keep the server's filename, choose the folder with -P:

import subprocess, os
os.makedirs("downloads", exist_ok=True) # for the sake of completeness
subprocess.run(["wget", "-P", "downloads",
                "https://getsamplefiles.com/download/zip/sample-1.zip"])

Set both folder and custom name by giving a full path to -O:

import subprocess, os
os.makedirs("downloads", exist_ok=True)
subprocess.run(["wget", "-O", "downloads/myfile.zip",
                "https://getsamplefiles.com/download/zip/sample-1.zip"])

Note: -O overrides -P. If you want both, pass a full path to -O.

Is it possible to resume interrupted downloads with Python wget?

Yes. Use -c / --continue (works when the server supports range requests).

import subprocess
subprocess.run(["wget", "--continue",
                "https://getsamplefiles.com/download/zip/sample-1.zip"])

Combine it with --tries to let wget retry automatically if the connection drops.

import subprocess

url = "https://getsamplefiles.com/download/zip/sample-1.zip"
subprocess.run([
    "wget",
    "--continue",   # resume partial downloads
    "--tries=3",    # retry up to 3 times on failure
    url
])

Can I limit download speed with Python wget?

Yes. Use the --limit-rate flag to throttle bandwidth so downloads don't hog your connection.

import subprocess

url = "https://getsamplefiles.com/download/zip/sample-1.zip"
subprocess.run(["wget", "--limit-rate=500k", url])

In this example, 500k means about 500 KB/s. You can also use m for megabytes per second (e.g., --limit-rate=2m).

Can I download multiple files at once?

Yes. You've got two common approaches.

Use a file list: create a text file with one URL per line, then run:

import subprocess
subprocess.run(["wget", "-i", "urls.txt"])

Use a loop in Python: handy if you're generating URLs dynamically:

import subprocess

urls = [
    "https://getsamplefiles.com/download/zip/sample-1.zip",
    "https://getsamplefiles.com/download/zip/sample-2.zip"
]

for u in urls:
    subprocess.run(["wget", u])

Conclusion

Using Python with wget is one of the simplest ways to automate file downloads. With just a few lines of code you can fetch files, save them with custom names, organize them into folders, or even resume interrupted transfers. Add a couple of optional flags and you've got a reliable tool for everyday automation.

That said, wget is still just a downloader. If your project goes beyond fetching static files, you'll probably need different tools:

  • For downloading files over more protocols (like SFTP or wider HTTP features), check out cURL with Python.
  • To extract only certain parts of a page (e.g., product titles, prices) instead of saving the whole file, use Requests with Beautiful Soup.
  • To interact with pages that require clicks, scrolling, or JavaScript rendering, Selenium is the go-to choice.
  • And if you'd rather skip managing browsers, proxies, or bot detection entirely, consider a hosted service like ScrapingBee.

Thank you for staying with me, and until next time.

Roel Peters

Roel Peters is a data generalist with a special interest in making business cases using structured data and simulation techniques.