How To Convert Numpy Array Elements To Spark Rdd Column Values
I'm getting ready to use the built-in CSV writing facility of the Spark DataFrame (not pandas). I have an IndexedRowMatrix already built. As such there are sparse array columns in
Solution 1:
A simple conversion to plain Python types and unpacking should do the trick:
Xirm.rows.map(lambda x: (lu[x.index], *x.vector.toArray().tolist()))
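This is equivalent to the following, which builds a list instead of a tuple and also works on Python versions before 3.5: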
Xirm.rows.map(lambda x: [lu[x.index]] + x.vector.toArray().tolist())
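To see what the per-row conversion does without a running Spark cluster, here is a minimal sketch. `row_to_record` and `FakeVector` are hypothetical names introduced for illustration, `FakeVector` is only a stand-in for a Spark vector, and `lu` is assumed to be a dict mapping row indices to labels (the question does not say what it is).

```python
from array import array
from types import SimpleNamespace


def row_to_record(row, lu):
    """Flatten one IndexedRow-like object into a tuple of plain Python values."""
    # toArray() returns a dense array; tolist() turns its elements into
    # plain Python floats, which are then unpacked after the looked-up label.
    return (lu[row.index], *row.vector.toArray().tolist())


class FakeVector:
    """Stand-in for a Spark vector so this sketch runs without Spark."""

    def __init__(self, values):
        self._values = array("d", values)

    def toArray(self):
        return self._values


lu = {0: "a", 1: "b"}  # assumed lookup: row index -> label
rows = [
    SimpleNamespace(index=0, vector=FakeVector([0.0, 1.5, 0.0])),
    SimpleNamespace(index=1, vector=FakeVector([2.0, 0.0, 3.0])),
]

records = [row_to_record(r, lu) for r in rows]
print(records)
```

With the real matrix, the same function could presumably be passed as `Xirm.rows.map(lambda r: row_to_record(r, lu))`, and the resulting RDD of tuples converted with `toDF()` before writing it out with the DataFrame CSV writer.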