How to handle infinite scroll pages in Go

Nowadays, most websites use different techniques to reduce the load and the amount of data served to their clients’ devices. One of these techniques is infinite scroll.

In this tutorial, we will see how we can scrape infinite scroll web pages using a js_scenario, specifically the scroll_y and scroll_x features. We will use this page as a demo. Only 9 boxes are loaded when we first open the page, but as soon as we scroll to the bottom, 9 more are loaded, and that keeps happening each time we reach the end of the page.
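To give an idea of what such a scenario looks like before we dive in, here is the payload we will end up sending later in this tutorial, stored in a Go raw string. The 1080-pixel scroll distance and the 500 millisecond waits are simply the values we picked for this demo:

// The js_scenario payload: scroll down one viewport height, wait for the
// new boxes to load, and repeat once more.
jsScenario := `{
    "instructions": [
        { "scroll_y": 1080 },
        { "wait": 500 },
        { "scroll_y": 1080 },
        { "wait": 500 }
    ]
}`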

First let’s make a request without the scroll_y parameter and see what the result looks like. We will use this code:

package main

import (
    "fmt"
    "io/ioutil"
    "net/http"
    "net/url"
    "os"
)

func get_request(api_key string, user_url string) (*http.Response, error) {
    // Create client
    client := &http.Client{}

    my_url := url.QueryEscape(user_url) // Encoding the URL
    // Create request
    req, err := http.NewRequest("GET", "https://app.scrapingbee.com/api/v1/?api_key="+api_key+"&url="+my_url, nil) // Create the request
    if err != nil {
        fmt.Println(err)
        return nil, err
    }

    // Fetch Request
    resp, err := client.Do(req)
    if err != nil {
        fmt.Println(err)
        return nil, err
    }

    return resp, nil // Return the response
}

func save_page_to_html(file_path string, webpage string) {
    api_key := "YOUR-API-KEY"
    request, err := get_request(api_key, webpage)
    if err != nil {
        fmt.Println(err)
        return
    }
    defer request.Body.Close() // Close the response body when we are done
    // Read Response Body
    respBody, err := ioutil.ReadAll(request.Body)
    if err != nil {
        fmt.Println(err)
        return
    }

    file, err := os.Create(file_path)
    if err != nil {
        fmt.Println(err)
        return
    }
    l, err := file.WriteString(string(respBody)) // Write content to the file.
    if err != nil {
        fmt.Println(err)
        file.Close()
        return
    }
    fmt.Println(file_path, "has been saved successfully,", l, "bytes written")
    err = file.Close()
    if err != nil {
        fmt.Println(err)
        return
    }
}

func main() {
    save_page_to_html("infinite.html", "https://demo.scrapingbee.com/infinite_scroll.html")
}

And the result contains only the first 9 pre-loaded blocks. So for websites that use infinite scroll, you will not be able to extract all the information you need without scroll_y.
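If you would rather verify this programmatically than eyeball the saved file, here is a minimal sketch that counts how many boxes ended up in infinite.html. The class="block" marker is an assumption about the demo page's markup, not something confirmed in this tutorial, so inspect the saved HTML and adjust it to whatever the boxes actually use:

package main

import (
    "fmt"
    "os"
    "strings"
)

func main() {
    // Read the page we saved with save_page_to_html.
    data, err := os.ReadFile("infinite.html")
    if err != nil {
        fmt.Println(err)
        return
    }
    // Hypothetical marker: count occurrences of the boxes' class attribute.
    count := strings.Count(string(data), `class="block"`)
    fmt.Println("boxes found:", count) // expect 9 without scroll_y
}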

The code below will scroll to the bottom of the page and wait 500 milliseconds, repeat that once more, and then save the result in an HTML file.

package main

import (
    "fmt"
    "io/ioutil"
    "net/http"
    "net/url"
    "os"
)

func get_request(api_key string, user_url string) (*http.Response, error) {
    // Create client
    client := &http.Client{}

    my_url := url.QueryEscape(user_url) // Encoding the URL
    // Adding a JavaScript scenario and encoding it
    js_scenario := url.QueryEscape(`{"instructions": [{ "scroll_y": 1080 }, {"wait": 500}, { "scroll_y": 1080 }, {"wait": 500}]}`)
    // Create request
    req, err := http.NewRequest("GET", "https://app.scrapingbee.com/api/v1/?api_key="+api_key+"&url="+my_url+"&js_scenario="+js_scenario, nil) // Create the request
    if err != nil {
        fmt.Println(err)
        return nil, err
    }

    // Fetch Request
    resp, err := client.Do(req)
    if err != nil {
        fmt.Println(err)
        return nil, err
    }

    return resp, nil // Return the response
}

func save_page_to_html(file_path string, webpage string) {
    api_key := "YOUR-API-KEY"
    request, err := get_request(api_key, webpage)
    if err != nil {
        fmt.Println(err)
        return
    }
    defer request.Body.Close() // Close the response body when we are done
    // Read Response Body
    respBody, err := ioutil.ReadAll(request.Body)
    if err != nil {
        fmt.Println(err)
        return
    }

    file, err := os.Create(file_path)
    if err != nil {
        fmt.Println(err)
        return
    }
    l, err := file.WriteString(string(respBody)) // Write content to the file.
    if err != nil {
        fmt.Println(err)
        file.Close()
        return
    }
    fmt.Println(file_path, "has been saved successfully,", l, "bytes written")
    err = file.Close()
    if err != nil {
        fmt.Println(err)
        return
    }
}

func main() {
    save_page_to_html("infinite.html", "https://demo.scrapingbee.com/infinite_scroll.html")
}

And this time we managed to scrape 18 blocks. We can go even further and scrape more blocks by adding more scroll_y and wait instructions to the scenario.
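One way to do that, sketched below rather than taken from the code above, is to build the js_scenario string programmatically so the scroll/wait pair can be repeated as many times as you like. buildScrollScenario is a hypothetical helper, and the 1080-pixel scroll step is just the viewport height we assumed in this demo:

package main

import (
    "encoding/json"
    "fmt"
)

// buildScrollScenario builds a js_scenario payload that scrolls down `times`
// times, waiting `waitMs` milliseconds after each scroll.
func buildScrollScenario(times, waitMs int) (string, error) {
    instructions := make([]map[string]int, 0, 2*times)
    for i := 0; i < times; i++ {
        instructions = append(instructions,
            map[string]int{"scroll_y": 1080}, // scroll down one viewport height
            map[string]int{"wait": waitMs},   // give the new boxes time to load
        )
    }
    payload, err := json.Marshal(map[string]interface{}{"instructions": instructions})
    return string(payload), err
}

func main() {
    scenario, err := buildScrollScenario(4, 500)
    if err != nil {
        fmt.Println(err)
        return
    }
    // Pass url.QueryEscape(scenario) as the js_scenario parameter in get_request.
    fmt.Println(scenario)
}

Swapping the hard-coded js_scenario string in get_request for the output of this helper lets you tune how far the page is scrolled without touching the rest of the code.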
