Scraping Google Hot Trends

Everybody likes hot content. That’s why Google Trends is so popular among blackhats: it tells you what’s hot and what’s not. So let’s scrape!

Why care about trends?

Traffic. If you know the keywords people use when they search, you can bring more traffic to your websites. If you know the hot keywords, you can bring HUGE traffic. For those who just don’t get it:

Google Hot Trends = Hot keywords

Google Hot Trends RSS scraper

Pluses

  • Google doesn’t mind RSS scraping.
  • Easy to automate.

Minuses

  • You can’t specify a date.
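<?php
// Fetch the hourly Hot Trends feed
$page = file_get_contents(
    'http://www.google.com/trends/hottrends/atom/hourly');
if ($page === false) {
    die('Could not fetch the feed');
}

// Grab every link: group 1 is the href, group 2 is the anchor text
preg_match_all('(<a href="(.+)">(.*)</a>)siU', $page, $matches);

// Job’s done!
// $matches[1] array contains all URLs, and
// $matches[2] array contains all anchors
?>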
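
The raw $matches arrays can still contain duplicates, nested tags and HTML entities. Here is a minimal sketch of a cleanup step, assuming you want plain keyword => URL pairs. The extractTrends() name and the sample markup are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper (not from the original script): run the same regex
// and return clean, de-duplicated keyword => URL pairs.
function extractTrends($page) {
    preg_match_all('(<a href="(.+)">(.*)</a>)siU', $page, $matches);

    $trends = array();
    foreach ($matches[2] as $i => $anchor) {
        // Anchors may carry nested tags or entities; reduce them to plain text
        $keyword = trim(html_entity_decode(strip_tags($anchor), ENT_QUOTES, 'UTF-8'));
        if ($keyword !== '' && !isset($trends[$keyword])) {
            // Keep the first URL seen for each keyword
            $trends[$keyword] = html_entity_decode($matches[1][$i], ENT_QUOTES, 'UTF-8');
        }
    }
    return $trends;
}

// Made-up sample markup, only to show the shape of the result
$sample = '<li><a href="http://example.com/?q=new+year">new year</a></li>'
        . '<li><a href="http://example.com/?q=fireworks&amp;x=1"><b>fireworks</b></a></li>';
print_r(extractTrends($sample));
```

Keying the array by keyword is what drops the duplicates, so a keyword that shows up twice keeps only its first URL.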

Scraping Google Hot Trends website

Pluses

  • You can specify dates.
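
The date travels as a plain date=YYYY-MM-DD query parameter, as in the New Year’s Eve example in this section, so covering several days is just a loop over URLs. A rough sketch, assuming the URL format stays the same for every day. The hotTrendsUrls() helper is hypothetical.

```php
<?php
// Hypothetical helper: build one Hot Trends URL per day, both ends included.
// Assumes the date=YYYY-MM-DD parameter works the same way for every day.
function hotTrendsUrls($from, $to) {
    $urls = array();
    $day = new DateTime($from);
    $end = new DateTime($to);
    while ($day <= $end) {
        $urls[] = 'http://www.google.com/trends/hottrends?sa=X&date='
                . $day->format('Y-m-d');
        $day->modify('+1 day');
    }
    return $urls;
}

// One URL for each day from December 30 to January 1
foreach (hotTrendsUrls('2008-12-30', '2009-01-01') as $url) {
    echo $url, "\n";
}
```

Each URL can then be fetched and parsed the same way as a single day.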

Minuses

  • Google does mind website scraping, so go through a proxy and be ready for a captcha.

<?php
// Scraping New Year’s Eve
$result = getPage(
    '[proxy IP]:[port]',
    'http://www.google.com/trends/hottrends?sa=X&date=2008-12-31',
    'http://www.google.com/trends/hottrends?sa=X',
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8',
    1,
    5);
 
if (empty($result['ERR'])) {    
    preg_match_all(
        '(<td class=num>\d+\.</td>.*<td><a href="(.*)">(.*)</a></td>)siU',
        $result['EXE'], $matches);
 
    // some URL tuning here: the hrefs are relative, so prepend the host
    foreach ($matches[1] as $i => $href) {
        $matches[1][$i] = 'http://www.google.com' . $href;
    }
 
    // Job's done! 
    // $matches[1] array contains all URLs, and 
    // $matches[2] array contains all anchors
} else {
    // WTF? Captcha or network problems? 
    // ...
}
?>
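
The getPage() helper called above is not defined in this post. Here is a minimal sketch of what it could look like, assuming it wraps PHP’s cURL extension and returns an array with the page in 'EXE' and the error text in 'ERR'. The meaning of the last two arguments (include response headers, timeout in seconds) is a guess based on the call above.

```php
<?php
// Assumed reconstruction of getPage(); the real helper is not shown here.
// $proxy   'host:port' of an HTTP proxy
// $url     page to fetch
// $referer value for the Referer header
// $agent   value for the User-Agent header
// $header  1 to include response headers in the output (assumed meaning)
// $timeout timeout in seconds (assumed meaning)
function getPage($proxy, $url, $referer, $agent, $header, $timeout) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_HEADER, (bool) $header);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);

    $result = array();
    $result['EXE'] = curl_exec($ch);   // response text, or false on failure
    $result['ERR'] = curl_error($ch);  // empty string when the transfer succeeded
    return $result;
}
```

Note that cURL only reports transport errors. A captcha page comes back as a normal response, so checking curl_getinfo($ch, CURLINFO_HTTP_CODE) or the page content may also be worthwhile.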

Take care!
