A Yahoo SERP scraper is a little more difficult to implement than a Google SERP scraper. The Yahoo guys are mad about redirects (former blackhats?): result links go through a redirect, so you have to extract the real URLs from them. But nothing can stop you from scraping 😉

Scraper code example

First time here? Read about scraping websites with PHP cURL under proxy. You will find the getPage source code there.

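$result = getPage(
    '[proxy IP]:[port]', // get a proxy from somewhere
    'http://search.yahoo.com/search?p=[query]', // search URL (assumed; the original line is missing)
    'http://www.yahoo.com/', // referer (assumed)
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8',
    1,  // include response headers (assumed)
    5); // timeout in seconds (assumed)
// NOTE: the argument order above is an assumption; match it to your getPage.

if (empty($result['ERR'])) {
    preg_match_all('(<h3><a class.*href="(.*)".*>(.*)</a>)siU',
        $result['EXE'], $matches);
    for ($i = 0; $i < count($matches[1]); $i++) {
        // decode url
        $matches[1][$i] = urldecode($matches[1][$i]);
        // get rid of redirect: assumes the target URL follows "**"
        // in the decoded redirect link
        preg_match_all('~\*\*(https?://.*)$~si',
            $matches[1][$i], $urls);
        if (!empty($urls[1][0])) {
            $matches[1][$i] = $urls[1][0];
        }
    }
    // strip tags
    for ($i = 0; $i < count($matches[2]); $i++) {
        $matches[2][$i] = strip_tags($matches[2][$i]);
    }
    // Job’s done!
    // $matches[1] contains URLs
    // $matches[2] contains anchors
    // …
} else {
    // Something went wrong...
}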
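
If you want to test the redirect cleanup on its own, here is a minimal sketch of it as a separate function. cleanYahooUrl is a hypothetical helper name, not part of the original scraper, and the sample links are made up. It assumes that a Yahoo redirect link carries the percent-encoded target URL after a "**" marker, so check that against the links you actually get back.

```php
<?php
// Hypothetical helper, not part of the original scraper.
// Assumption: the redirect link holds the target URL, percent-encoded,
// after a "**" marker.
function cleanYahooUrl(string $href): string
{
    // Decode %3a, %3f and friends first, as the scraper does.
    $decoded = urldecode($href);
    // Keep everything after the "**" marker if it looks like a URL.
    if (preg_match('~\*\*(https?://.+)$~si', $decoded, $m)) {
        return $m[1];
    }
    // No marker found: return the decoded link unchanged.
    return $decoded;
}

// Made-up sample links, for illustration only.
$samples = [
    'http://rds.yahoo.com/_ylt=abc123/SIG=xyz/EXP=1/**http%3a//www.example.com/page%3fid=1',
    'http://www.example.org/direct',
];
foreach ($samples as $href) {
    echo cleanYahooUrl($href), "\n";
}
```

Links without the marker come back decoded but otherwise untouched, so one odd result does not break the whole loop.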

P.S.: Some URLs can still be unreadable (…). Don’t panic 🙂 There’s a workaround.


Take care

