23 Apr 2009
Live Search, Ask and Cuil SERP scraping
The SERP scraping saga continues. This time I'll give you only the required regexps and URLs, so there's no need to copy and paste code from the other scraping posts. Feel free to improve the regexps and comment.
Live Search
- URL - http://search.live.com/results.aspx?q=[keyword]
- regexp - "(<h3><a href=\"(.*)\".*>(.*)</a></h3>)siU"
Ask
- URL - http://www.ask.com/web?q=[keyword]
- regexp - "(<tr>.*<td>.*<a id=\"r\d+_t\" href=\"(.*)\".*>(.*)</a>.*</td>.*</tr>)siU"
Cuil
- URL - http://www.cuil.com/search?q=[keyword]
- regexp - "(<h2 class=\"t\"><a.*href=\"(.*)\".*>(.*)</a></h2>)siU"
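For readers who want to try the patterns outside PHP, here is a minimal Python sketch of how the URLs and regexps above could be wired together. It is an illustration under assumptions, not code from the other scraping posts: the names SERP_URLS, SERP_PATTERNS, serp_url and extract_results are made up for this example, the outer "(" and ")" are treated as PCRE delimiters rather than a capture group, and the "U" (ungreedy) flag is rewritten as lazy quantifiers because Python's re module has no such flag. The engines' markup may well have changed, so the patterns may no longer match live pages.

```python
import re
from urllib.parse import quote_plus

# URL templates from the post; [keyword] is written as {keyword} here.
SERP_URLS = {
    "live": "http://search.live.com/results.aspx?q={keyword}",
    "ask": "http://www.ask.com/web?q={keyword}",
    "cuil": "http://www.cuil.com/search?q={keyword}",
}

# The post's patterns, with the PCRE "U" flag expressed as lazy
# quantifiers (.*?). Group 1 is the URL, group 2 is the anchor text.
SERP_PATTERNS = {
    "live": r'<h3><a href="(.*?)".*?>(.*?)</a></h3>',
    "ask": r'<tr>.*?<td>.*?<a id="r\d+_t" href="(.*?)".*?>(.*?)</a>.*?</td>.*?</tr>',
    "cuil": r'<h2 class="t"><a.*?href="(.*?)".*?>(.*?)</a></h2>',
}

# Equivalent of the "i" (case-insensitive) and "s" (dot matches newline) flags.
FLAGS = re.IGNORECASE | re.DOTALL


def serp_url(engine, keyword):
    """Build the search URL for an engine, URL-encoding the keyword."""
    return SERP_URLS[engine].format(keyword=quote_plus(keyword))


def extract_results(engine, html):
    """Return two parallel lists: result URLs and their anchor texts."""
    matches = re.findall(SERP_PATTERNS[engine], html, FLAGS)
    urls = [m[0] for m in matches]
    anchors = [m[1] for m in matches]
    return urls, anchors


# Usage with a hand-written snippet (not real SERP output):
sample = '<h3><a href="http://example.com/" class="x">Example</a></h3>'
urls, anchors = extract_results("live", sample)
```

Keeping URLs and anchors in two parallel lists mirrors the "URLs array" and "anchors array" the post talks about, so an index in one list always refers to the same result in the other.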
Sometimes Cuil puts a Timeline feature in the SERP. The regular expression above matches it, but you don't need it. The Timeline is easy to find in the URLs array: search for href="http://#". Don't forget to delete the corresponding element from the anchors array.
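The Timeline cleanup described above could look roughly like this in Python. This is a sketch under assumptions: the function name drop_timeline and the constant TIMELINE_HREF are hypothetical, the URLs and anchors are assumed to be two parallel lists, and the Timeline entry is assumed to have exactly "http://#" as its captured URL.

```python
# Placeholder href that, per the post, marks Cuil's Timeline entry.
# Assumption: the captured URL equals this string exactly.
TIMELINE_HREF = "http://#"


def drop_timeline(urls, anchors):
    """Remove Timeline entries from both lists, keeping them aligned."""
    kept = [(u, a) for u, a in zip(urls, anchors) if u != TIMELINE_HREF]
    return [u for u, _ in kept], [a for _, a in kept]


# Usage with made-up data:
clean_urls, clean_anchors = drop_timeline(
    ["http://a.example/", "http://#", "http://b.example/"],
    ["A", "Timeline", "B"],
)
```

Filtering both lists in one pass avoids the mistake the post warns about, where the URL is removed but its anchor is left behind and every later pair is shifted by one.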
Happy scraping!
November 18th, 2009 - 15:47
Hello, “Live Search” now links to “Bing”, and
“Bing” does not allow scraping any information…
Test: fopen url <- fail (file_get_contents)
Test: curl <- fail
November 19th, 2009 - 09:29
This http://www.fromzerotoseo.com/scraping-bing-serp/