Scrape Google Resultstats With Python
I would like to get the estimated number of results from Google for a keyword. I'm using Python 3.3 and am trying to accomplish this with BeautifulSoup and urllib.request. This is my sim
Solution 1:
If you haven't solved this yet: BeautifulSoup fails to find anything because the resultStats div never appears in the soup. Your Request(page_google) is returning only JavaScript, not the search results that the JavaScript loads dynamically. You can verify this by adding
print(soup)
to your code; the resultStats div will be missing from the output.
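Rather than reading the printed soup by eye, the same check can be done programmatically. The following is only a minimal sketch using the standard library's html.parser; the names IdFinder and has_element_with_id are made up for this illustration and are not part of the question's code.

```python
from html.parser import HTMLParser


class IdFinder(HTMLParser):
    """Records whether any start tag carries the wanted id attribute."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        if dict(attrs).get("id") == self.wanted_id:
            self.found = True


def has_element_with_id(html, wanted_id):
    """Return True if the HTML string contains a tag with id == wanted_id."""
    finder = IdFinder(wanted_id)
    finder.feed(html)
    return finder.found
```

The function expects a str, so the bytes returned by urlopen(...).read() would need to be decoded first, for example with .decode('utf-8', errors='replace'). If it returns False for 'resultStats', the div is not in the response at all and no BeautifulSoup query will find it.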
The following code:
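from urllib.parse import quote_plus
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

query = 'pokerbonus'
# URL-encode the query before inserting it into the search URL
url = "http://www.google.de/search?q=%s" % quote_plus(query)
req_google = Request(url)
# Send a browser-like User-Agent header with the request
req_google.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
html_google = urlopen(req_google).read()
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(html_google, 'html.parser')
# Look up the div that holds the estimated result count
scounttext = soup.find('div', id='resultStats')
print(scounttext)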
will print
<div class="sd" id="resultStats">Ungefähr 1.060.000 Ergebnisse</div>
(the text is German for "About 1,060,000 results").
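To turn that text into an actual number, the grouping separators have to be removed. Here is a rough sketch; parse_result_count is a hypothetical helper name, and it assumes the first number in the string is the result count and that dots, commas, and spaces are used only as thousands separators.

```python
import re


def parse_result_count(text):
    """Return the first number in text as an int, or None if there is none."""
    # Match a digit followed by any run of digits and grouping separators
    match = re.search(r"\d[\d.,\s]*", text)
    if match is None:
        return None
    # Drop the separators (dots, commas, spaces) and keep only the digits
    return int(re.sub(r"\D", "", match.group()))


print(parse_result_count("Ungefähr 1.060.000 Ergebnisse"))  # 1060000
```

With the code above, this could be called as parse_result_count(scounttext.get_text()), after checking that scounttext is not None.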
Lastly, a tool like Selenium WebDriver might be a better way to solve this, since Google does not allow bots to scrape search results.