Webscraping With A Loop Returns Only A Single Element

Question

When I run a for loop to collect elements within a tag, it only returns the first from a list of all with the same class. For example: r = requests.get('https://one-vers

Solution 1:

You can try this:

import requests
from bs4 import BeautifulSoup

r = requests.get("https://one-versus-one.com/en/rankings/all/statistics")
soup = BeautifulSoup(r.content, 'lxml')

data = {'players': [],'club': [],'rank': []}

def getstuff(soup):
    products = soup.find('div', {'class':'rankings-table'}).find_all("a")
    for name in products:
        players = name.find('div', {'class':'player-name rankings-table__player-name'}).text
        club = name.find('span', {'class':'rankings-table__club-name'}).text
        rank = name.find('div', {'class':'rankings-table-cell value rankings-table__value'}).text.strip()
        data['players'].append(players)
        data['club'].append(club)
        data['rank'].append(rank)
    print(data)

getstuff(soup)
"""
{'players': ['Lionel Messi', 'Junior Neymar', 'Robert Lewandowski', 'Joao Cancelo', 'Kevin de Bruyne', 'Rodri', 'Jesse Lingard', 'Riyad Mahrez', 'Ilkay Gundogan', 'John Stones'], 'club': ['Barcelona', 'Paris Saint-Germain', 'Bayern Munich', 'Manchester City', 'Manchester City', 'Manchester City', 'West Ham United', 'Manchester City', 'Manchester City', 'Manchester City'], 'rank': ['100', '95', '93', '92', '91', '90', '90', '89', '88', '88']}
"""

You have to use .find_all("a") to get the info for all players. In addition, your loop assigns each new player to data['players'] instead of appending it to the list, and the same goes for club and rank.
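To see the difference between find and find_all without requesting the site, here is a minimal sketch against a made-up HTML fragment. The markup is an assumption: it only borrows the class names used in the answer above, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: it reuses the class names from the answer above,
# but it is not copied from the real page.
html = """
<div class="rankings-table">
  <a href="#"><div class="player-name rankings-table__player-name">Lionel Messi</div></a>
  <a href="#"><div class="player-name rankings-table__player-name">Junior Neymar</div></a>
  <a href="#"><div class="player-name rankings-table__player-name">Robert Lewandowski</div></a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("div", {"class": "rankings-table"})

# find() returns only the first matching tag (or None if nothing matches).
first_row = table.find("a")

# find_all() returns every matching tag, so the loop sees all players.
all_rows = table.find_all("a")

names = [row.find("div", {"class": "player-name"}).text for row in all_rows]
print(first_row.text.strip())
print(len(all_rows))
print(names)
```

Looping over the result of find("a") alone would only ever visit the first player's tag, which is one way to end up with a single result.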

Solution 2:

You are overwriting the variables on each iteration of the loop rather than appending to a data set. Also, your products search matched only one player.

Try

import pandas as pd

# soup is the BeautifulSoup object built from the page, as in the question
data = []

products = soup.select('a .rankings-table-row')
for name in products:
    players = name.find('div', {'class':'player-name rankings-table__player-name'}).text
    club = name.find('span', {'class':'rankings-table__club-name'}).text
    rank = name.find('div', {'class':'rankings-table-cell value rankings-table__value'}).text.strip()

    # append one dict per player instead of overwriting a single variable
    data.append(
        {
            'Players': players,
            'Club': club,
            'Rank': rank
        }
    )

# one row per player, one column per key
data = pd.DataFrame(data)
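As a rough illustration of the list-of-dicts pattern on its own, the sketch below builds the same kind of records from two hard-coded rows (values taken from the output shown in Solution 1) rather than from the live page. The names rows and df are chosen here for illustration only.

```python
import pandas as pd

# Sample values copied from the output in Solution 1; in the real script
# each dict would be built inside the scraping loop.
sample = [
    ("Lionel Messi", "Barcelona", "100"),
    ("Junior Neymar", "Paris Saint-Germain", "95"),
]

rows = []
for players, club, rank in sample:
    # One dict per player; the keys become the DataFrame columns.
    rows.append({"Players": players, "Club": club, "Rank": rank})

df = pd.DataFrame(rows)
print(df)
```

Keeping one dict per player means each row's fields stay together, which avoids the three parallel lists drifting out of step if one lookup fails.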

Solution 3:

You should try

data['players'].append(players)

data['players'] is a list, so appending works. Appending adds an item to the existing list, whereas if you do

data['players'] = players

the 'players' key is reassigned to a single value on every iteration. The same applies to the other keys.

Another answer also mentions that you should use find_all.
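The difference between assigning and appending can be checked in isolation with plain Python. This is a minimal sketch; the scraped list stands in for the values the loop would extract.

```python
# Stand-in for the names the scraping loop would produce.
scraped = ["Lionel Messi", "Junior Neymar", "Robert Lewandowski"]

# Assignment: the key is rebound on every pass, so only one value survives.
overwritten = {"players": []}
for name in scraped:
    overwritten["players"] = name

# Append: the list under the key grows by one item per pass.
appended = {"players": []}
for name in scraped:
    appended["players"].append(name)

print(overwritten)
print(appended)
```

With assignment the dictionary ends up holding a single string instead of a list, so the earlier values are lost no matter how many tags the loop visits.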

Solution 4:

I tried solving this too, but with Selenium. I even used an explicit wait (WebDriverWait) to make sure the element loads, and still only Messi is returned, none of the other players. The elements exist as entries, but accessing their .text returns a blank string. Have the people above tried their suggested solutions?
