Data can be found online in various formats, but one of the most popular is the table, because it displays information in a structured, well-organized layout. So it is very important to be able to extract data from tables with ease.
And this is one of the most important features of ScrapingBee's data extraction tool: you can scrape data from tables without having to do any post-processing of the HTML response. We can use this feature by specifying a table's CSS selector within a set of extract_rules, and let ScrapingBee do the rest!
In this example, we're going to scrape NASDAQ's top 100 stock prices from this demo page (https://demo.scrapingbee.com/table_content.html).
The CSS selector of the table that contains the information we need is .table.
So, our code will look like this:
const scrapingbee = require('scrapingbee'); // Import ScrapingBee's SDK

async function scrape_table(url) {
    const client = new scrapingbee.ScrapingBeeClient('YOUR-API-KEY'); // New ScrapingBee client
    const response = await client.get({
        url: url,
        params: { // Parameters:
            'extract_rules': {
                "table_json": {
                    "selector": ".table",
                    "output": "table_json" // Extracting data in JSON representation
                },
                "table_array": {
                    "selector": ".table",
                    "output": "table_array" // Extracting data in Array representation
                },
            }
        }
    });
    return response;
}

scrape_table("https://demo.scrapingbee.com/table_content.html").then(function (response) {
    const decoder = new TextDecoder();
    const text = decoder.decode(response.data); // Decode the raw response body into a string
    console.log(text);
}).catch((e) => {
    // e.response is only set when the API itself answered with an error
    console.log('A problem occurred: ' + (e.response ? e.response.data : e.message));
});
And the result will look like this:
{"table_json": [{"SYMBOL ": "AMD", "NAME ": "Advanced Micro Devices Inc", "PRICE ": "94.82", "CHANGE ": "-3.98", "%CHANGE ": "-4.03"},...], "table_array": [["AMD", "Advanced Micro Devices Inc", "94.82", "-3.98", "-4.03"],...]}
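Since the response is plain JSON text, it can be parsed and tidied up in a few lines. The sketch below is only an illustration: it works on a shortened copy of the result shown above rather than a live response, and the helpers trimKeys and parseTable are hypothetical names, not part of ScrapingBee's SDK. It assumes the header keys carry trailing spaces, as they do in the sample result, and that the PRICE column holds numeric strings.

```javascript
// Shortened copy of the result shown above (one row per representation).
const sampleText = '{"table_json": [{"SYMBOL ": "AMD", "NAME ": "Advanced Micro Devices Inc", "PRICE ": "94.82", "CHANGE ": "-3.98", "%CHANGE ": "-4.03"}], "table_array": [["AMD", "Advanced Micro Devices Inc", "94.82", "-3.98", "-4.03"]]}';

// Hypothetical helper: strips surrounding whitespace from every header key of a row.
function trimKeys(row) {
    return Object.fromEntries(
        Object.entries(row).map(([key, value]) => [key.trim(), value])
    );
}

// Hypothetical helper: parses the response text and returns cleaned table_json rows.
function parseTable(text) {
    const parsed = JSON.parse(text);
    return parsed.table_json.map(trimKeys).map((row) => ({
        ...row,
        PRICE: Number(row.PRICE), // Assumes PRICE is a numeric string
    }));
}

const rows = parseTable(sampleText);
console.log(rows[0].SYMBOL, rows[0].PRICE);
```

In the scraping code above, the same helper could be applied to the decoded text instead of sampleText.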
You can find more details about the differences between the JSON representation and the array representation in our Data Extraction documentation page.