Yes, you can use XPath selectors in DOM Crawler. Here is some sample code that uses Guzzle to load the ScrapingBee website and then uses DOM Crawler's filterXPath method to extract and print the text content of the h1 tag:
use Symfony\Component\DomCrawler\Crawler;
use GuzzleHttp\Client;
// Create a client to make the HTTP request
$client = new Client();
$response = $client->get('https://www.scrapingbee.com/');
$html = (string) $response->getBody();
// Load the HTML document
$crawler = new Crawler($html);
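// Find the first h1 element on the page. The parentheses make [1] apply to
// the whole result set; //h1[1] would match every h1 that is the first h1 child of its parent.
$h1 = $crawler->filterXPath('(//h1)[1]');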
// Get the text content of the h1 element
$text = $h1->text();
// Print the text content
echo $text;
// Output:
// "Tired of getting blocked while scraping the web?"
If you do not want to use Guzzle, you can pass an HTML string directly to the Crawler constructor, as in this sample code:
use Symfony\Component\DomCrawler\Crawler;
$html = <<<EOD
<!DOCTYPE html>
<html>
<head>
<title>Example Page</title>
</head>
<body>
<h1>Hello, world!</h1>
<p>This is an example page.</p>
</body>
</html>
EOD;
// Load the HTML document
$crawler = new Crawler($html);
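// Find the first h1 element on the page. The parentheses make [1] apply to
// the whole result set; //h1[1] would match every h1 that is the first h1 child of its parent.
$h1 = $crawler->filterXPath('(//h1)[1]');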
// Get the text content of the h1 element
$text = $h1->text();
// Print the text content
echo $text;
// Output:
// "Hello, world!"
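The same filterXPath method also works when an expression matches several nodes or none at all. The following is a minimal sketch of that, not a definitive implementation. It assumes DOM Crawler was installed with Composer, so that vendor/autoload.php exists. The HTML string, the /docs link, and the variable names are made up for illustration.

```php
<?php
// Assumption: symfony/dom-crawler was installed with Composer.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical sample document, used only for this illustration
$html = <<<EOD
<!DOCTYPE html>
<html>
<body>
<h1>Hello, world!</h1>
<p>First paragraph.</p>
<p>Second paragraph with a <a href="/docs">link</a>.</p>
</body>
</html>
EOD;

$crawler = new Crawler($html);

// each() runs the callback once per matched node and returns an array of the results
$paragraphs = $crawler->filterXPath('//p')->each(function (Crawler $node) {
    return $node->text();
});
print_r($paragraphs);

// attr() reads an attribute from the first matched node
$href = $crawler->filterXPath('//a[@href]')->attr('href');
echo $href, PHP_EOL;

// text() throws an InvalidArgumentException when nothing matched,
// so check count() before reading from a selection that may be empty
$h2 = $crawler->filterXPath('//h2');
echo $h2->count() > 0 ? $h2->text() : 'No h2 found', PHP_EOL;
```

Checking count() first keeps a missing element from stopping the script, which helps when scraping pages whose markup may change.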