Python: Creating Row Breaks In A List For Openpyxl To Recognise In .xlsx
Solution 1:
I would recommend writing to your Excel file as you go along. For each table row, create a list of lists containing any sub-rows present. You can then use Python's zip_longest() function (from itertools) to return a sub-entry for each row, with blanks where one list is shorter than another, for example:
from itertools import zip_longest
from bs4 import BeautifulSoup
import openpyxl

html = """
<table>
<tr>
<td><p>a</p><p>b</p></td>
<td><p>1</p><p>2</p><p>3</p></td>
<td><p>d</p></td>
</tr>
<tr>
<td><p>a</p><p>b</p></td>
<td><p>1</p><p>2</p><p>3</p></td>
<td><p>d</p></td>
</tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.table
wb = openpyxl.Workbook()
ws = wb.active
output_row = 1  # next free row in the worksheet

for table_row in table.find_all('tr'):
    cells = table_row.find_all('td')
    # One list of <p> texts per cell, e.g. [['a', 'b'], ['1', '2', '3'], ['d']]
    row = [[p.text for p in cell.find_all('p')] for cell in cells]

    # zip_longest() yields one tuple per sub-row, padding short cells with ""
    for row_number, values in enumerate(zip_longest(*row, fillvalue=""), start=output_row):
        for col_number, value in enumerate(values, start=1):
            ws.cell(column=col_number, row=row_number, value=value)

    # Advance by the number of sub-rows written (the longest cell list),
    # not by the number of columns
    output_row += max((len(sub_rows) for sub_rows in row), default=0)

wb.save('output.xlsx')
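This would give you the following output in columns A to C, with each table row expanded into three spreadsheet rows:

a    1    d
b    2
     3
a    1    d
b    2
     3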
The enumerate() function gives you an incrementing number for each entry in a list. Passing start= lets it produce the 1-based row and column numbers that openpyxl cells expect.
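To see the two functions in isolation, here is a minimal sketch that uses only the standard library, with the same cell values as the example table. The name `placements` is made up for illustration; it simply collects the (row, column, value) triples that would be passed to ws.cell().

```python
from itertools import zip_longest

# Sub-entries for the three cells of one table row (same values as the example table).
row = [["a", "b"], ["1", "2", "3"], ["d"]]

# zip_longest() transposes the lists, padding the shorter ones with fillvalue.
sub_rows = list(zip_longest(*row, fillvalue=""))
print(sub_rows)
# [('a', '1', 'd'), ('b', '2', ''), ('', '3', '')]

# enumerate() with start=1 yields 1-based row and column numbers.
placements = [
    (row_number, col_number, value)
    for row_number, cells in enumerate(sub_rows, start=1)
    for col_number, value in enumerate(cells, start=1)
]
print(placements[:3])
# [(1, 1, 'a'), (1, 2, '1'), (1, 3, 'd')]
```

Note that the number of sub-rows is len(sub_rows), which is the length of the longest cell list, while the length of each tuple is the number of columns.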
Solution 2:
Currently you're still putting all values into the same cell even though you're adding line breaks.
What you need to do is add a new row for each word. Each row will need to be of the form [None, 'Pear'] if you want the values in the second column.
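A minimal sketch of that idea, assuming the words arrive as one string separated by line breaks. The helper names `rows_in_column` and `write_rows`, the sample words other than 'Pear', and the file name are all hypothetical; ws.append() is the openpyxl call that writes a list as the next row, leaving a cell empty wherever the list holds None.

```python
def rows_in_column(values, column=2):
    """Return one row per value, padded with None so each value lands in `column` (1-based)."""
    padding = [None] * (column - 1)
    return [padding + [value] for value in values]


def write_rows(rows, filename):
    # openpyxl is imported here so the helper above can be tried without it.
    import openpyxl

    wb = openpyxl.Workbook()
    ws = wb.active
    for row in rows:
        ws.append(row)  # each append() call writes the list as the next row
    wb.save(filename)


# Text that previously went into a single cell, split into one entry per line.
words = "Apple\nPear\nPlum".splitlines()
rows = rows_in_column(words, column=2)
print(rows)
# [[None, 'Apple'], [None, 'Pear'], [None, 'Plum']]
```

Calling write_rows(rows, 'output.xlsx') would then put each word in its own row of column B.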