Expand And Flatten A Ragged Nested List
Solution 1:
I've got a simple solution for the "same structure" case, using a recursive generator and the izip_longest function from itertools. This code is for Python 2, but with a few tweaks (noted in comments) it can be made to work on Python 3:
    from itertools import izip_longest  # in py3, this is renamed zip_longest

    def flatten(nested_list):
        return zip(*_flattengen(nested_list))  # in py3, wrap this in list()

    def _flattengen(iterable):
        for element in izip_longest(*iterable, fillvalue=""):
            if isinstance(element[0], list):
                for e in _flattengen(element):
                    yield e
            else:
                yield element
In Python 3.3 and later it becomes even simpler, thanks to PEP 380, which allows the recursive step, for e in _flattengen(element): yield e, to become yield from _flattengen(element).
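The Python 3 tweaks noted in the comments can be put together as a short sketch. This is an adaptation, assuming Python 3.3 or later: zip_longest replaces izip_longest, the zip result is wrapped in list(), and the recursive step uses yield from. The sample rows are made up for illustration, and the "same structure" assumption still applies.

```python
from itertools import zip_longest


def flatten(nested_list):
    # zip(*columns) turns the padded columns back into rows.
    return list(zip(*_flattengen(nested_list)))


def _flattengen(iterable):
    # Walk the rows in parallel, one "column" at a time,
    # padding shorter rows with "" as we go.
    for element in zip_longest(*iterable, fillvalue=""):
        if isinstance(element[0], list):
            # Nested column: recurse into it (PEP 380 delegation).
            yield from _flattengen(element)
        else:
            yield element


# Illustrative data only: two rows with the same nesting structure.
rows = [["a", [1, 2]], ["b", [3]]]
print(flatten(rows))  # [('a', 1, 2), ('b', 3, '')]
```

Each row comes back as a tuple because zip produces tuples; wrap each one in list() if lists are needed.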
Solution 2:
Actually, there is no solution for the generic case where the structure is not the same.
For example, a normal algorithm would match ["bla"] with ["a", "b", "c"], and the result would be:
[ [ "id1", x, y, z, 1, 2, "", "a", "b", "c", "", "", ""],
[ "id2", x, y, z, 1, 2, 3, "bla", "", "", "", "a", "b"],
[ "id3", x, y, "", 1, 2, 3, "a", "b", "c", "", "", ""]]
But if you know you will have a number of rows, each starting with an ID and followed by a nested list structure, the algorithm below should work:
    import itertools

    def normalize(l):
        # just hack the first item to have only lists of lists or lists of items
        # (note: this modifies the rows of the input in place)
        for sublist in l:
            sublist[0] = [sublist[0]]

        # break the nesting
        def flatten(l):
            for item in l:
                if not isinstance(item, list) or 0 == len([x for x in item if isinstance(x, list)]):
                    yield item
                else:
                    for subitem in flatten(item):
                        yield subitem

        l = [list(flatten(i)) for i in l]

        # extend all lists to greatest length
        list_lengths = {}
        for i in range(0, len(l[0])):
            for item in l:
                list_lengths[i] = max(len(item[i]), list_lengths.get(i, 0))

        for i in range(0, len(l[0])):
            for item in l:
                item[i] += [''] * (list_lengths[i] - len(item[i]))

        # flatten each row
        return [list(itertools.chain(*sublist)) for sublist in l]

    l = [ [ "id1", [["x", "y", "z"], [1, 2]], ["a", "b", "c"]],
          [ "id2", [["x", "y", "z"], [1, 2, 3]], ["a", "b"]],
          [ "id3", [["x", "y"], [1, 2, 3]], ["a", "b", "c", ""]] ]
    l = normalize(l)
    print(l)
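The "extend all lists to greatest length" and "flatten each row" steps can also be looked at in isolation. The following is a minimal sketch written for Python 3, assuming every row has already been broken into the same number of flat groups (which is what the nesting-breaking step produces for this input). The name pad_columns is hypothetical and not part of the answer.

```python
from itertools import chain


def pad_columns(rows, spacer=""):
    # Widest group at each column position across all rows.
    widths = [max(len(group) for group in column) for column in zip(*rows)]
    # Pad every group to its column width, then flatten each row.
    return [
        list(chain.from_iterable(
            group + [spacer] * (width - len(group))
            for group, width in zip(row, widths)
        ))
        for row in rows
    ]


# The rows from the example, already split into flat groups.
rows = [
    [["id1"], ["x", "y", "z"], [1, 2], ["a", "b", "c"]],
    [["id2"], ["x", "y", "z"], [1, 2, 3], ["a", "b"]],
    [["id3"], ["x", "y"], [1, 2, 3], ["a", "b", "c", ""]],
]
for row in pad_columns(rows):
    print(row)
```

Unlike the in-place += in the answer, this variant builds new lists and leaves its input untouched. Every output row has 11 items, because the last column is padded to the four-item group in the third row.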
Solution 3:
    def recursive_pad(l, spacer=""):
        # Shallow copy: the outer list is not modified,
        # but the inner lists are still padded in place.
        l = list(l)

        is_list = lambda x: isinstance(x, list)
        # list() keeps this reusable on Python 3, where map() is lazy
        are_subelements_lists = list(map(is_list, l))
        if not any(are_subelements_lists):
            return l

        # Would catch [[], [], "42"]
        if not all(are_subelements_lists) and any(are_subelements_lists):
            raise Exception("Cannot mix lists and non-lists!")

        lengths = list(map(len, l))
        if max(lengths) == min(lengths):
            # We're already done
            return l

        # Pad it out
        for sublist in l:
            list_pad(sublist, spacer, max(lengths))
        return l

    def list_pad(l, spacer, pad_to):
        for i in range(len(l), pad_to):
            l.append(spacer)

    if __name__ == "__main__":
        print(recursive_pad([[[["x", "y", "z"], [1, 2]], ["a", "b", "c"]], [[["x", "y", "z"], [1, 2, 3]], ["a", "b"]], [[["x", "y"], [1, 2, 3]], ["a", "b", "c", ""]]]))
Edit: Actually, I misread your question. This code solves a slightly different problem.