
Can't BeautifulSoup Show Me The Content Of The Website?

I want to scrape the contents of a website, using the library called BeautifulSoup. Code: from bs4 import BeautifulSoup from urllib.request import urlopen html_http_response = url

Solution 1:

This website uses cookies to validate requests. When you visit the site for the first time, you have to tick the "I'm not a robot" option. After that, the browser sends the incap_ses_415_965359, PHPSESSID, visid_incap_965359, _ga and _gid cookies in the headers of each request.

So I copied the cookies from Chrome DevTools and saved them in a dictionary:

from bs4 import BeautifulSoup
import requests

# Cookie values copied from Chrome DevTools; they belong to my browser session.
cookies = {
    'incap_ses_415_965359': 'djRha9OqhshstDcXvPV8cmHCBQGBKloAAAAAN3/D9dvoqwEc7GPEwefkhQ==',
    'PHPSESSID': 'fjmr7plc0dmocm8roq7togcp92',
    'visid_incap_965359': 'akteT8lDT1iyST7XJO7wdQGBKloAAAns;aAAQkIPAAAAAACAWbWAAQ6Ozzrln35KG6DhLXMRYnMjxOmY',
    '_ga': 'GA1.2.894579844.151uus2734989',
    '_gid': 'GA1.2.1055878562.1598994989',
}

# Send the cookies with the request so the site sees an already validated session.
html_http_response = requests.get("http://www.airlinequality.com/airport-reviews/jeddah-airport", cookies=cookies)
data = html_http_response.text
soup = BeautifulSoup(data, "html.parser")
print(soup.prettify())

The values above come from my own session, so get the cookie values from your browser and update the dictionary with them.
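If you would rather not retype each cookie by hand, one option is to copy the whole Cookie request header from the browser's network tab and split it into a dictionary. The helper below is only a sketch: the name cookie_header_to_dict and the example header string are made up for illustration and are not part of requests or of the original answer.

```python
def cookie_header_to_dict(header: str) -> dict:
    """Split a raw 'Cookie:' header value into a {name: value} dict."""
    cookies = {}
    for pair in header.split(";"):
        # Split on the first '=' only, because cookie values may contain '='.
        name, sep, value = pair.strip().partition("=")
        if sep and name:  # skip empty or malformed fragments
            cookies[name] = value
    return cookies


# Hypothetical header string, shaped like what DevTools shows.
example_header = "PHPSESSID=abc123; _ga=GA1.2.111.222; token=xyz=="
print(cookie_header_to_dict(example_header))
```

The resulting dictionary can be passed as the cookies argument of requests.get, in the same way as the hand-written dictionary above.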

Solution 2:

The data you are looking for doesn't exist yet in the HTML you download, because this page has JavaScript-generated data. That means the data you want is only created when you actually load the page in a browser and, for example, click a search button. Look into the Selenium library for this (it's rather easy). Keep in mind that if the content sits inside an iframe, you must switch to that iframe first.
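A minimal sketch of the Selenium approach, assuming Selenium 4 and a local Chrome installation are available. The function name fetch_rendered_html and its driver parameter are hypothetical, chosen here for illustration. Selenium is imported only when no driver is passed in.

```python
def fetch_rendered_html(url, driver=None):
    """Load url in a real browser and return the HTML after the page has loaded."""
    own_driver = driver is None
    if own_driver:
        # Imported lazily so the function can also be used with an existing driver.
        from selenium import webdriver
        driver = webdriver.Chrome()
    try:
        # get() returns once the page's load event has fired; content that
        # scripts inject later may additionally need an explicit wait.
        driver.get(url)
        return driver.page_source
    finally:
        # Only close a browser that this function opened itself.
        if own_driver:
            driver.quit()
```

The returned string can be handed to BeautifulSoup(html, "html.parser") as in Solution 1. For content inside an iframe, the usual route is to call driver.switch_to.frame(...) on that frame before reading page_source.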
