Itertools Within Web_crawler Giving Wrong Triples
I have written some code to parse the name, link, and price from Craigslist. When I print the result, each of these is scraped as a list. I tried the code pasted below to get a work
Solution 1:
My inclination is to avoid the difficulties that can arise when trying to match up the collections returned by two separate XPath queries using zip. Instead, I would go depth-first: select each result row, then examine each entry within that row, as here.
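import requests
from lxml import html

# Fetch the search results page and parse it into an element tree.
page = requests.get('http://bangalore.craigslist.co.in/search/rea?s=120').text
tree = html.fromstring(page)

# Select each result row once, then query inside it, so that the
# title, price and URL always come from the same listing.
rows = tree.xpath('.//li[@class="result-row"]')
for row in rows:
    # [1:] strips the first character (presumably the currency symbol).
    price = row.xpath('.//a/span/text()')[0][1:]
    link = row.xpath('.//p/a')[0]
    title = link.text
    url = link.attrib['href']
    print('--->', title)
    print(price, ':', url)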
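
To see why zipping page-wide queries can pair the wrong values, here is a small offline sketch. It is only an illustration: the markup in SAMPLE is made up (it is not real Craigslist HTML), parse_row is a hypothetical helper, and it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without a network connection or extra packages. The assumption is that some listings have no price, which makes the page-wide lists different lengths.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration: the second listing has no price.
SAMPLE = """
<ul>
  <li class="result-row">
    <a><span>$100</span></a>
    <p><a href="/a.html">Flat A</a></p>
  </li>
  <li class="result-row">
    <p><a href="/b.html">Flat B</a></p>
  </li>
  <li class="result-row">
    <a><span>$300</span></a>
    <p><a href="/c.html">Flat C</a></p>
  </li>
</ul>
"""

tree = ET.fromstring(SAMPLE)

# Page-wide queries give two independent lists of different lengths,
# so zip pairs "Flat B" with the price of "Flat C" and drops "Flat C".
all_titles = [a.text for a in tree.findall(".//li/p/a")]
all_prices = [s.text for s in tree.findall(".//li/a/span")]
zipped = list(zip(all_titles, all_prices))


def parse_row(row):
    """Extract (title, price, url) from one row; price is None if absent."""
    link = row.find("./p/a")
    price = row.find("./a/span")
    price_text = price.text[1:] if price is not None else None
    return (link.text, price_text, link.get("href"))


# Per-row queries keep every field tied to its own listing.
per_row = [parse_row(row) for row in tree.findall(".//li[@class='result-row']")]

print(zipped)
print(per_row)
```

Handling the missing price inside the row also avoids the IndexError that an unconditional [0] would raise on a listing without a price.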