
Itertools Within Web_crawler Giving Wrong Triples

I have written some code to parse the name, link and price from craigslist. When I print the result, these are scraped as lists. I tried the pasted code below to get a work

Solution 1:

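My inclination is to avoid the difficulties that can arise when you try to match up the collections returned by two separate XPath queries with zip. If any listing is missing one of the fields, the lists end up with different lengths and the pairs drift out of step. Instead, select each result row first and then examine each entry within that row, as here.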

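import requests
from lxml import html

# Fetch the listing page and parse it into an element tree.
page = requests.get('http://bangalore.craigslist.co.in/search/rea?s=120').text
tree = html.fromstring(page)

# Select each result row once, then query inside that row so that
# title, link and price always come from the same listing.
rows = tree.xpath('.//li[@class="result-row"]')
for row in rows:
    # A row without a price would otherwise raise IndexError on [0].
    prices = row.xpath('.//a/span/text()')
    price = prices[0][1:] if prices else 'n/a'  # [1:] drops the currency symbol
    link = row.xpath('.//p/a')[0]
    title = link.text
    url = link.attrib['href']
    print('--->', title)
    print(price, ':', url)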
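
To see why pairing two separate queries with zip can go wrong, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree rather than lxml so that it runs without a network connection, and the markup is a hand-written, hypothetical fragment (not real craigslist HTML) in which the second listing has no price.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup for illustration only; the second row has no price.
SAMPLE = """
<ul>
  <li class="result-row"><p><a href="/a.html">Flat A</a></p><a><span>$100</span></a></li>
  <li class="result-row"><p><a href="/b.html">Flat B</a></p></li>
  <li class="result-row"><p><a href="/c.html">Flat C</a></p><a><span>$300</span></a></li>
</ul>
"""

root = ET.fromstring(SAMPLE)

# Approach 1: two independent queries over the whole document, paired with zip.
titles = [a.text for a in root.findall(".//li/p/a")]
prices = [s.text for s in root.findall(".//li/a/span")]
zipped = list(zip(titles, prices))

# Approach 2: iterate over rows and query inside each row.
per_row = []
for row in root.findall(".//li[@class='result-row']"):
    link = row.find("./p/a")
    span = row.find("./a/span")
    per_row.append((link.text, span.text if span is not None else None))

print(zipped)   # "Flat B" is paired with the price of "Flat C", and "Flat C" is dropped
print(per_row)  # every title stays with its own price, or None when there is none
```

With zip, the shorter price list silently shifts every later price up by one row. The per-row loop keeps each value attached to the listing it came from.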
