Can't Beautifulsoup Show Me The Content Of The Website?
I want to scrape the contents of a website, using the library called BeautifulSoup. Code: from bs4 import BeautifulSoup from urllib.request import urlopen html_http_response = url
Solution 1:
This website uses cookies to validate requests. When you visit the site for the first time, you have to tick the "I'm not a robot" option.
After that, the browser sends the incap_ses_415_965359, PHPSESSID, visid_incap_965359, _ga and _gid cookies in the headers of every request.
So I copied the cookies from Chrome DevTools and saved them in a dictionary.
from bs4 import BeautifulSoup
import requests
cookies = {
    'incap_ses_415_965359': 'djRha9OqhshstDcXvPV8cmHCBQGBKloAAAAAN3/D9dvoqwEc7GPEwefkhQ==',
    'PHPSESSID': 'fjmr7plc0dmocm8roq7togcp92',
    'visid_incap_965359': 'akteT8lDT1iyST7XJO7wdQGBKloAAAns;aAAQkIPAAAAAACAWbWAAQ6Ozzrln35KG6DhLXMRYnMjxOmY',
    '_ga': 'GA1.2.894579844.151uus2734989',
    '_gid': 'GA1.2.1055878562.1598994989',
}
html_http_response = requests.get("http://www.airlinequality.com/airport-reviews/jeddah-airport", cookies=cookies)
data = html_http_response.text
soup = BeautifulSoup(data, "html.parser")
print(soup.prettify())
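The values above came from my own browser session, so copy the current cookie values from your browser and update the dictionary before running this.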
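If typing each cookie into the dictionary by hand is tedious, you could copy the whole Cookie request header from the browser's network tab and split it into a dictionary. The helper below is only a sketch: parse_cookie_header is a name made up for this example, the header string is a placeholder rather than real session data, and it assumes the usual "name=value; name=value" layout.

```python
def parse_cookie_header(raw_header):
    """Turn 'name1=value1; name2=value2' into a dict usable as requests' cookies= argument."""
    cookies = {}
    for pair in raw_header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first "=" only: values such as base64 strings may end in "=".
        name, _, value = pair.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


# Placeholder values for illustration only; paste your own Cookie header here.
raw_header = "PHPSESSID=abc123; incap_ses_415_965359=Zm9vYmFy=="
cookies = parse_cookie_header(raw_header)
print(cookies)

# The resulting dict can then be passed exactly as in the answer above:
# html_http_response = requests.get(url, cookies=cookies)
```

A cookie value that itself contains a semicolon would be split incorrectly by this helper, so check the printed dictionary against what the browser shows.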
Solution 2:
The data you are looking for does not exist in the initial HTML, because this page has JavaScript-generated data. Look into the Selenium library, which can handle this and is fairly easy to use. The data you want is only created when the page is actually loaded in a browser and you interact with it, e.g. by clicking a search button. Keep in mind that if the content sits inside an iframe, you must switch to that iframe first.
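A rough sketch of the Selenium approach might look like the following. It assumes Chrome and a matching driver are available on the machine. The function name fetch_rendered_html and the CSS selectors passed to it are hypothetical, since the answer does not say which element holds the data.

```python
def fetch_rendered_html(url, wait_css, iframe_css=None, timeout=15):
    """Load url in headless Chrome, wait for wait_css to appear, and return the page source."""
    # Third-party imports are kept local so this sketch can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        wait = WebDriverWait(driver, timeout)
        if iframe_css is not None:
            # Content inside an iframe is only reachable after switching into it.
            frame = wait.until(
                EC.presence_of_element_located((By.CSS_SELECTOR, iframe_css))
            )
            driver.switch_to.frame(frame)
        # Block until the script-generated element exists (raises TimeoutException otherwise).
        wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, wait_css)))
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


# Example call (the selector is an assumption, not taken from the site):
# html = fetch_rendered_html(
#     "http://www.airlinequality.com/airport-reviews/jeddah-airport", "article"
# )
# soup = BeautifulSoup(html, "html.parser")
```

The returned string can be handed to BeautifulSoup in the same way as the response text in the question, so the parsing code does not need to change.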