Blank List Returned When Using Xpath With Morningstar Key Ratios
I am trying to pull a piece of data from the Morningstar key ratios page for any given stock using XPath. I have the full path that returns a result in the XPath Helper toolbar add
Solution 1:
This is one of those pages that downloads much of its content in stages. If you look for the item you want after fetching the page with just requests, you will find that it is not yet available, as shown here.
>>> import requests
>>> url = 'http://financials.morningstar.com/ratios/r.html?t=AMD&region=USA&culture=en_US'
>>> page = requests.get(url).text
>>> '5,858' in page
False
One strategy for processing these pages involves the use of the selenium library. Here, selenium launches a copy of the Chrome browser, loads that URL, then uses an XPath expression to locate the td element of interest. Finally, the number you want becomes available as the text property of that element.
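>>> from selenium import webdriver
>>> from selenium.webdriver.common.by import By
>>> driver = webdriver.Chrome()
>>> driver.get(url)
>>> xpath = './/th[@id="i0"]/../td[1]'  # td is a sibling of th, so step up to the row first
>>> td = driver.find_element(By.XPATH, xpath)
>>> td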
<selenium.webdriver.remote.webelement.WebElement (session="f436b07c27742abb36b262639245801f", element="0.12745670001529863-2")>
>>> td.text
'5,858'
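A detail worth checking in any XPath for this table: in a parsed HTML table, td cells are siblings of the row's th, not children of it, so the expression has to step up to the row before selecting the cell. The sketch below illustrates this offline with Python's standard-library ElementTree on a made-up table fragment. The markup is an assumption modeled on the id="i0" and headers attributes mentioned in the answers, not a copy of the real page, and the second cell's value is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: assumed layout, NOT the actual Morningstar markup.
# The second td value is made up purely to have more than one cell.
SNIPPET = """
<table>
  <tr>
    <th id="i0">Revenue USD Mil</th>
    <td headers="Y0 i0">5,858</td>
    <td headers="Y1 i0">1,234</td>
  </tr>
</table>
"""

root = ET.fromstring(SNIPPET)

# td is not a child of th, so this path matches nothing (an empty list).
as_child = root.findall('.//th[@id="i0"]/td')

# Step up to the enclosing row with "..", then take that row's first td.
via_parent = root.findall('.//th[@id="i0"]/../td[1]')

print(as_child)
print([td.text for td in via_parent])
```

ElementTree supports only a subset of XPath, which is why the sketch uses ".." rather than an axis such as following-sibling; a full XPath engine (the browser, or lxml) accepts either form.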
Solution 2:
Because the content of that page is generated dynamically, you can either go through the process Bill Bell has already shown, or grab the page source and apply a CSS selector to it to get the desired value. Here is an alternative to XPath:
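from lxml import html  # tree.cssselect() also needs the cssselect package installed
from selenium import webdriver

driver = webdriver.Chrome()
driver.get('http://financials.morningstar.com/ratios/r.html?t=AMD&region=USA&culture=en_US')
# Grab the rendered DOM (after scripts have run), then close the browser
tree = html.fromstring(driver.page_source)
driver.quit()

# First td whose headers attribute starts with "Y0"
rev = tree.cssselect('td[headers^=Y0]')[0].text
print(rev)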
Result:
5,858
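Both solutions yield the figure as the string '5,858'. If the value is needed for arithmetic, it has to be converted first. A minimal sketch, assuming US-style thousands separators; parse_amount is a hypothetical helper name, and returning None for empty or non-numeric cells is a design choice of this sketch, not something the page specifies.

```python
def parse_amount(text):
    """Convert a string such as '5,858' to a number, or None if it is not numeric."""
    if text is None:  # an element with no text content has .text == None
        return None
    cleaned = text.strip().replace(',', '')
    try:
        # Keep whole numbers as int; fall back to float when a decimal point is present.
        return float(cleaned) if '.' in cleaned else int(cleaned)
    except ValueError:
        return None


print(parse_amount('5,858'))
```

With the code from Solution 2, this could be called as parse_amount(rev).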