
How To Scrape Data Using Next Button With Ellipsis Using Scrapy

I need to keep collecting data by following the next button of a pager like <1 2 3 ... 5>, but the page source provides no href link for it, and the pager also contains an ellipsis. Any ideas?

Solution 1:

It seems this pagination uses an additional request to an API. So there are two ways:

  1. Use Splash/Selenium to render the pages, following QHarr's pattern;
  2. Make the same calls to the API yourself. Check the developer tools and you will find a POST request with all the required params (they are too long, so I will not post the full list here).
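
A minimal sketch of the second option, using only the standard library so the request can be inspected without sending it. The endpoint URL, the `page` parameter name, and the contents of `BASE_PARAMS` are all hypothetical placeholders; the real values have to be copied from the POST request shown in the browser's developer tools.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and parameters: copy the real ones from the
# POST request visible in the browser's developer tools.
API_URL = "https://example.com/api/products"
BASE_PARAMS = {"pageSize": "120"}


def build_page_request(page, api_url=API_URL, base_params=BASE_PARAMS):
    """Build (but do not send) the POST request for one result page."""
    # "page" is an assumed parameter name for the page number.
    payload = dict(base_params, page=str(page))
    body = urlencode(payload).encode("utf-8")
    return Request(
        api_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# One request object per page; nothing is sent over the network here.
requests_to_send = [build_page_request(n) for n in range(1, 4)]
```

Inside a Scrapy spider, the same payload dict could instead be passed as `formdata` to `scrapy.FormRequest`, which sends it as a POST request.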

Solution 2:

The URL changes, so you can specify the page number and the results per page in the URL, e.g. ...,250
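
As a rough illustration of setting the page number and page size through the query string, here is a small helper built on `urllib.parse`. The listing URL and the `page` parameter name are assumptions made for the example; `pageSize` and the `price:0,250` filter are the values that appear in this answer.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query(url, **overrides):
    """Return url with the given query parameters added or replaced."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update({key: str(value) for key, value in overrides.items()})
    # safe=":," keeps values such as "price:0,250" unescaped
    return urlunsplit(parts._replace(query=urlencode(query, safe=":,")))


# Hypothetical listing URL; "page" is an assumed parameter name.
base = "https://example.com/products?filter=price:0,250"
urls = [with_query(base, page=n, pageSize=120) for n in range(1, 4)]
```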

As mentioned by @vezunchik and in the OP's feedback, this approach requires Selenium/Splash so that the JavaScript on the page can run. If you were going down that route, you could just click the next button (.p_next) until you reach the last page, since it is easy to grab the last page number (.dot + .pageno) from the document.
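
The ellipsis does not need special handling once the pager labels are available as text. A small sketch, assuming the labels have already been extracted into a list of strings (the helper names are made up for this example): take the largest numeric label as the last page, then loop over every page in between, including those hidden behind the "...".

```python
def last_page_from_pager(labels):
    """Return the highest page number in a pager, ignoring '...' and arrows."""
    numbers = [int(label) for label in labels if label.strip().isdigit()]
    return max(numbers) if numbers else 1


def remaining_pages(labels):
    """Page numbers still to visit after the first page."""
    return list(range(2, last_page_from_pager(labels) + 1))


# Labels as they might be read from a pager like <1 2 3 ... 5>
pager = ["<", "1", "2", "3", "...", "5", ">"]
pages = remaining_pages(pager)
```

Taking the maximum rather than the element right after the ellipsis also covers short result sets where no ellipsis is rendered.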

I appreciate that you are trying to do this with Scrapy.

Here is a demo of the idea with Selenium, in case it helps.

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# The URL values are missing here; {} takes the page number
url_loop = '{}&pageSize=120&filter=price:0,250'
url = ''
d = webdriver.Chrome()
d.get(url)  # load the first results page

d.find_element(By.CSS_SELECTOR, '[onclick="fnAcceptCookieUse()"]').click()  # get rid of cookies
items = WebDriverWait(d, 10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "#products .p_item")))
d.find_elements(By.CSS_SELECTOR, '.pagesize')[-1].click()  # set page result count to 120
last_page = int(d.find_element(By.CSS_SELECTOR, '.dot + .pageno').text)  # get last page

if last_page > 1:
    for page in range(2, last_page + 1):
        url = url_loop.format(page)
        d.get(url)  # load the next results page
        try:
            d.find_element(By.CSS_SELECTOR, '[type=reset]').click()  # reject offer
        except NoSuchElementException:
            pass  # no offer popup on this page
        # do something with page
        break  # delete later
