Scraping Bing SERP

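Bing's results pages can be scraped like any other search engine's: fetch the page through a proxy, then pull the result URLs and anchors out with a regular expression.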

$result = getPage(
    '[proxy IP]:[port]', // get a proxy from somewhere
    'http://www.bing.com/search?q=twitter', // search URL to fetch
    'http://www.bing.com/',                 // referer
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8', // user agent
    1,
    5);

if (empty($result['ERR'])) {

    // Capture each result's URL (group 1) and anchor HTML (group 2).
    // U makes every .* ungreedy, s lets . match newlines, i ignores case.
    preg_match_all(
        '(<div class="sb_tlst">.*<h3>.*<a href="(.*)".*>(.*)</a>.*</h3>.*</div>)siU',
        $result['EXE'], $matches);

    // Strip markup such as <strong> from the anchor text.
    $matches[2] = array_map('strip_tags', $matches[2]);

    // Job's done!
    // $matches[1] array contains all URLs, and
    // $matches[2] array contains all anchors
    // ...
} else {
    // The request failed: $result['ERR'] is not empty.
    // ...
}

The getPage function comes from the post "Scraping websites with PHP cURL under proxy"; grab it from there.
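To see what the pattern captures without sending a request to Bing, it can be run against a hand-written fragment. The markup below is an assumption that only imitates the structure the regex expects (a div with class sb_tlst wrapping an h3 and a link); Bing's real HTML may differ and changes over time, so treat this as a sketch for testing the pattern, not as a description of the live page.

```php
<?php
// Hypothetical sample markup, written by hand to match the pattern's
// expectations. It is NOT copied from a real Bing results page.
$html = '<div class="sb_tlst"><h3><a href="http://twitter.com/" h="ID=SERP">'
      . '<strong>Twitter</strong></a></h3></div>'
      . '<div class="sb_tlst"><h3><a href="http://en.wikipedia.org/wiki/Twitter">'
      . '<strong>Twitter</strong> - Wikipedia</a></h3></div>';

// Same pattern as in the snippet above: group 1 is the href value,
// group 2 is the HTML between <a ...> and </a>.
preg_match_all(
    '(<div class="sb_tlst">.*<h3>.*<a href="(.*)".*>(.*)</a>.*</h3>.*</div>)siU',
    $html, $matches);

$urls    = $matches[1];
$anchors = array_map('strip_tags', $matches[2]); // drop <strong> etc.

// Pair each URL with its anchor text for inspection.
print_r(array_combine($urls, $anchors));
```

Because every quantifier is ungreedy, each match stops at the first closing tag it can, which is what keeps two adjacent results from being merged into one match.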

13 comments.

  1. Hi, Thanks for the tutorial code. If I want to put the call in a loop to scrape more than the 1st page results, is there a way of throttling the calls to appear more natural?

  2. Winalot, experiment with a random timeout between calls.

  3. Hi, Has bing.com changed the HTML of its results? The regex does not seem to be working anymore.

  4. Hi Winalot, I’ve updated the regex for you. Bing can’t escape us.

  5. Thanks! Keep up the good work.

  6. Hi seozero, Have you had any luck scraping eBay search results? Since they removed their XML feeds I’ve been looking for a way to scrape search results, especially for sold items to see trends etc. Can you work your scraping magic on those? Thanks!

  7. Winalot, I’ve never scraped eBay before, but you made me think of it.

    Btw, have you looked at http://developer.ebay.com/products/research/ and http://developer.researchadvanced.com/pages/developers_area/ebay_research_api/api_call_reference.html ?

    I think the API can meet your needs.

  8. Hi seozero, Thanks for your reply.

    I’m part of the eBay developer network and use their sales API quite a bit.

    The problem is the market API is not free, see http://developer.ebay.com/programs/marketdata/ and the free version you mentioned above only returns a summary.

    Therefore I thought I’d just hit the eBay listings themselves!

  9. Thanks a lot for sharing.

  10. Have you done anything with YouTube? I’m working on something right now :)

  11. @Jay, no. YouTube is on my to-do list :)

  12. I’ll share mine with you when I’m done with it :)

    When you get a chance, can you drop me an email? I’d like to show you something that you can use with your scraped content :)

  13. Sent you an email.
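On the throttling question from the first two comments, here is a minimal sketch of looping over several result pages with a random pause between requests. Everything named here is hypothetical: bingPageUrl, randomPause and fetchPages are illustration helpers, the "first" offset parameter and 10 results per page are assumptions about Bing's URL scheme that should be checked against the live site, and the 2 to 6 second range is just a starting point to experiment with.

```php
<?php
// Hypothetical helper: URL of the Nth results page.
// ASSUMPTION: Bing pages via a 1-based "first" offset, 10 results per page.
function bingPageUrl($query, $page, $perPage = 10)
{
    $first = ($page - 1) * $perPage + 1;
    return 'http://www.bing.com/search?q=' . urlencode($query) . '&first=' . $first;
}

// Sleep for a random duration between $minSeconds and $maxSeconds.
// Returns the pause length in microseconds.
function randomPause($minSeconds, $maxSeconds)
{
    $micro = mt_rand((int) ($minSeconds * 1000000), (int) ($maxSeconds * 1000000));
    usleep($micro);
    return $micro;
}

// Fetch $pages result pages, pausing randomly before every page but the first.
// $fetch is any callable that takes a URL and returns the fetched result.
function fetchPages($query, $pages, $fetch, $minSeconds = 2.0, $maxSeconds = 6.0)
{
    $results = array();
    for ($page = 1; $page <= $pages; $page++) {
        if ($page > 1) {
            randomPause($minSeconds, $maxSeconds);
        }
        $results[$page] = call_user_func($fetch, bingPageUrl($query, $page));
    }
    return $results;
}

// Usage with the post's getPage (left commented out, as it needs a proxy):
// $pages = fetchPages('twitter', 3, function ($url) {
//     return getPage('[proxy IP]:[port]', $url, 'http://www.bing.com/',
//         'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8',
//         1, 5);
// });
```

Passing the fetcher in as a callable keeps the pacing logic separate from getPage, so the delay range can be tuned or tested without touching the network.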
