
Scrape Google Resultstats With Python

I would like to get the estimated number of results from Google for a keyword. I'm using Python 3.3 and am trying to accomplish this with BeautifulSoup and urllib.request. This is my sim

Solution 1:

If you haven't solved this problem yet: BeautifulSoup fails to find anything because the resultStats element never appears in the soup. Your Request(page_google) returns only JavaScript, not the search results that the JavaScript loads dynamically. You can verify this by adding a

print(soup)

command to your code and you will see that the resultStats div doesn't appear.
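Reading the whole dump by eye can be tedious, so the same check can be scripted. The sketch below uses only the standard library to test whether any tag in a response carries a given id. IdCollector and has_element_id are names made up for this illustration, and the two sample HTML strings are invented stand-ins for the two kinds of response, not real Google output.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects the id attribute of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def has_element_id(html, element_id):
    """Return True if any tag in `html` carries the given id."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return element_id in collector.ids


# Invented sample documents (assumptions, not captured responses).
script_only = "<html><body><script>var x = 1;</script></body></html>"
with_stats = '<html><body><div class="sd" id="resultStats">x</div></body></html>'

print(has_element_id(script_only, "resultStats"))  # False
print(has_element_id(with_stats, "resultStats"))   # True
```

Passing the decoded response body to has_element_id gives a quick yes/no answer before any further parsing is attempted.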

The following code:

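from urllib.parse import quote_plus
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

query = 'pokerbonus'
url = "http://www.google.de/search?q=%s" % quote_plus(query)
req_google = Request(url)
# Send a browser-like User-Agent; without it the response may lack the results markup.
req_google.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
html_google = urlopen(req_google).read()
# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(html_google, 'html.parser')
# Look up the element that holds the estimated result count.
scounttext = soup.find('div', id='resultStats')
print(scounttext)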

will print:

<div class="sd" id="resultStats">Ungefähr 1.060.000 Ergebnisse</div>
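Since the goal is the number itself, the element text still has to be converted. A minimal sketch, assuming the count is the first run of digits in the string and that thousands separators are dots or commas; parse_result_count is a hypothetical helper name, not part of any library:

```python
import re


def parse_result_count(stats_text):
    """Extract the first grouped number from a result-stats string.

    Returns an int, or None if the text contains no digits.
    """
    # A digit followed by any mix of digits, dots, and commas.
    match = re.search(r"\d[\d.,]*", stats_text)
    if match is None:
        return None
    # Drop the separators and keep only the digits.
    digits = re.sub(r"\D", "", match.group(0))
    return int(digits)


print(parse_result_count("Ungefähr 1.060.000 Ergebnisse"))  # 1060000
```

With the soup from the answer, the input would be scounttext.get_text(), after checking that scounttext is not None.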

Lastly, a tool like Selenium WebDriver might be a better way to solve this, as Google does not allow bots to scrape search results.
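A rough sketch of the Selenium route might look like the following. It assumes the selenium package (version 4 or later) and Firefox with its driver are installed, and that the results page still exposes an element with the id resultStats, which may no longer hold because Google changes its markup. build_search_url and fetch_result_stats are illustrative names introduced here.

```python
from urllib.parse import quote_plus


def build_search_url(query, host="www.google.de"):
    """Build the same search URL as the urllib example for a keyword."""
    return "http://%s/search?q=%s" % (host, quote_plus(query))


def fetch_result_stats(query, element_id="resultStats"):
    """Load the results page in a real browser and return the stats text.

    Assumes Selenium 4+ and a working Firefox/geckodriver setup; the
    default element id comes from the answer and may be outdated.
    """
    # Imported here so the URL helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get(build_search_url(query))
        # find_element raises NoSuchElementException if the id is absent.
        return driver.find_element(By.ID, element_id).text
    finally:
        # Always close the browser, even if the lookup fails.
        driver.quit()


print(build_search_url("poker bonus"))  # http://www.google.de/search?q=poker+bonus
```

Calling fetch_result_stats('pokerbonus') opens a browser window, so the page's JavaScript runs before the element is looked up. The browser is started only inside that function, which keeps the URL helper usable on machines without Selenium.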
