Scraping Google Hot Trends

Everybody likes hot content. That’s why Google Trends is so popular among blackhats: it says what’s hot and what’s not. So let’s scrape!

Why care about trends?

Traffic. If you know the keywords people use when they search, you can bring more traffic to your websites. If you know the hot keywords, you can bring HUGE traffic. For those who just don’t get it:

Google Hot Trends = Hot keywords

Google Hot Trends RSS scraper

Pluses

  • Google doesn’t mind RSS scraping.
  • Easy to automate.

Minuses

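  • You can’t specify a date.

<?php
// Fetch the hourly Hot Trends Atom feed
$page = file_get_contents(
    'http://www.google.com/trends/hottrends/atom/hourly');
if ($page === false) {
    die('Could not fetch the feed');
}

// Grab every link: group 1 is the URL, group 2 is the anchor text
preg_match_all('(<a href="(.+)">(.*)</a>)siU', $page, $matches);
 
// Job’s done!
// $matches[1] array contains all URLs, and 
// $matches[2] array contains all anchors
?>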
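
To see what the regex pulls out without hitting Google, here is a minimal sketch that runs the same pattern over a made-up fragment. The extractLinks() helper and the sample markup are my assumptions for illustration only; the real feed markup may differ.

```php
<?php
// Hypothetical helper: returns array(url => anchor text) for every link in $html.
function extractLinks($html)
{
    $links = array();
    // Same pattern as in the snippet above, written with "#" delimiters.
    // The U flag makes the quantifiers lazy, s lets the dot match newlines.
    if (preg_match_all('#<a href="(.+)">(.*)</a>#siU', $html, $m, PREG_SET_ORDER)) {
        foreach ($m as $match) {
            // Decode entities such as &amp; and strip stray tags from the anchor.
            $url    = html_entity_decode($match[1], ENT_QUOTES, 'UTF-8');
            $anchor = trim(strip_tags(html_entity_decode($match[2], ENT_QUOTES, 'UTF-8')));
            $links[$url] = $anchor; // keyed by URL, so duplicates collapse
        }
    }
    return $links;
}

// Made-up sample, not real feed output.
$sample = '<ol><li><a href="http://example.com/?q=new+year&amp;x=1">new year</a></li>'
        . '<li><a href="http://example.com/?q=fireworks">Fireworks</a></li></ol>';

$links = extractLinks($sample);
foreach ($links as $url => $anchor) {
    echo $anchor . ' => ' . $url . "\n";
}
```

Keying the result by URL is just one way to drop duplicate links; keep two parallel arrays instead if you need the original order with repeats.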

Scraping Google Hot Trends website

Pluses

  • You can specify dates.

Minuses

<?php
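// Scraping New Year’s Eve
// getPage() is a fetch helper that is not defined in this post; it is called
// with a proxy, the URL, a referer, a user agent and two numeric options.
$result = getPage(
    '[proxy IP]:[port]',
    'http://www.google.com/trends/hottrends?sa=X&date=2008-12-31',
    'http://www.google.com/trends/hottrends?sa=X',
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8',
    1,
    5);
 
// 'ERR' is empty when the request succeeded; 'EXE' holds the page HTML
if (empty($result['ERR'])) {
    // Each row: a numbered cell, then a cell with the keyword link
    preg_match_all(
        '(<td class=num>\d+\.</td>.*<td><a href="(.*)">(.*)</a></td>)siU',
        $result['EXE'], $matches);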
 
    // some URL tuning here…
    for ($i = 0; $i < count($matches[1]); $i++) {
        $matches[1][$i] = 'http://www.google.com' . $matches[1][$i];
    }
 
    // Job's done! 
    // $matches[1] array contains all URLs, and 
    // $matches[2] array contains all anchors
} else {
    // WTF? Captcha or network problems? 
    // ...
}
?>
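
The getPage() helper called above is not defined in this post. Below is a minimal sketch of what such a helper might look like, assuming PHP's cURL extension is available. Only the 'ERR' and 'EXE' keys come from the code above; treating the fifth argument as a "include response headers" flag and the sixth as a timeout in seconds is my guess. hotTrendsUrl() is a hypothetical extra that builds the dated URL so you can loop over several days.

```php
<?php
// Hypothetical stand-in for getPage(); the original helper is not shown in the post.
// Returns array('EXE' => response body, 'ERR' => error message, or '' on success).
function getPage($proxy, $url, $referer, $agent, $header, $timeout)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);    // return the body instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, (bool) $header);   // assumption: 5th argument toggles response headers
    curl_setopt($ch, CURLOPT_TIMEOUT, (int) $timeout);  // assumption: 6th argument is a timeout in seconds
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, (int) $timeout);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    if ($proxy !== '') {
        curl_setopt($ch, CURLOPT_PROXY, $proxy);        // "host:port"
    }

    $body = curl_exec($ch);
    return array(
        'EXE' => ($body === false) ? '' : $body,
        'ERR' => ($body === false) ? curl_error($ch) : '',
    );
}

// Hypothetical helper: builds the Hot Trends URL for a given day.
function hotTrendsUrl($date)
{
    return 'http://www.google.com/trends/hottrends?'
        . http_build_query(array('sa' => 'X', 'date' => date('Y-m-d', strtotime($date))));
}
```

With this sketch, passing an empty string as the proxy sends the request directly, and hotTrendsUrl('2008-12-31') gives the same URL as the New Year’s Eve example.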

Take care!

3 comments.

  1. [...] Here is some info to make this easy. This is Google Trends: http://google.com/trends/hottrends The RSS site to scrape is here: http://www.google.com/trends/hottrends/atom/*ourly (replace the * with an h as scriptlance won’t let me use that word!)It is the *ourly feed for Google Trends. And here is some simple code to do the scrape: http://www.fromzerotoseo.com/scraping-google-hot-trends/ [...]

  2. Darwin Studios is ready to pay $75 for Google Hot trends scraper http://www.scriptlance.com/projects/1242054657.shtml

    Bidding Ends 5/15/2009 at 11:10 EST. Go ahead blackhats! :)

  3. Hey, what the mean of scraping?
