
How To Convert NumPy Array Elements To Spark RDD Column Values

I'm getting ready to use the built-in CSV writing facility of the Spark DataFrame (not pandas). I already have an IndexedRowMatrix built. As such there are sparse array columns in

Solution 1:

A simple conversion to plain Python types and unpacking should do the trick:

Xirm.rows.map(lambda x: (lu[x.index], *x.vector.toArray().tolist()))

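which is the same as the following, except that each row comes out as a list rather than a tuple (this form also works on Python versions older than 3.5, which reject starred unpacking inside a tuple):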

Xirm.rows.map(lambda x: [lu[x.index]] + x.vector.toArray().tolist())
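
To see what the mapping produces without starting Spark, here is a minimal sketch that applies the same lambda body to stand-in objects. Everything named Fake* is hypothetical and exists only for illustration: FakeVector mimics just the toArray() call of a Spark vector, array("d") mimics just the tolist() method of the NumPy array that toArray() returns, and lu is assumed to be a dict from row index to label (the original does not show how lu is defined).

```python
from array import array
from collections import namedtuple


class FakeVector:
    """Hypothetical stand-in for a Spark vector; supports only toArray()."""

    def __init__(self, values):
        # array('d') plays the role of the NumPy array: it also has tolist().
        self._values = array("d", values)

    def toArray(self):
        return self._values


# Hypothetical stand-in for IndexedRow: an index plus a vector.
FakeIndexedRow = namedtuple("FakeIndexedRow", ["index", "vector"])

# Assumption: lu maps a row index to the value wanted in the first column.
lu = {0: "row_a", 1: "row_b"}


def flatten_row(x):
    # Same expression as in the answer: label first, then one plain
    # Python float per vector element.
    return (lu[x.index], *x.vector.toArray().tolist())


rows = [
    FakeIndexedRow(0, FakeVector([1.0, 0.0, 2.5])),
    FakeIndexedRow(1, FakeVector([0.0, 3.0, 0.0])),
]

# Locally a list comprehension stands in for Xirm.rows.map(...).
flat = [flatten_row(r) for r in rows]
print(flat)
```

Each resulting row is a flat tuple of plain Python values, which is the shape Spark needs to infer one column per element. On the real RDD, the mapped result could then be turned into a DataFrame with toDF() and written out with df.write.csv(path).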
