
Check If Two Pyspark Rows Are Equal

I am writing unit tests for a Spark job, and some of the outputs are named tuples of type pyspark.sql.Row. How can I assert their equality? actual = get_data(df) expected = Row(total=4, un

Solution 1:

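Your code should work as written because, according to the docs:

the fields will be sorted by names.

Note that this applies to Spark versions before 3.0. From Spark 3.0 on (with Python 3.6+), fields passed as keyword arguments keep the order in which they were entered, so two Rows built with different argument orders may not compare equal.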

Nevertheless, another way is to use the asDict() method of pyspark.sql.Row and compare the rows as dictionaries:

actual = get_data(df)
expected = Row(total=4, unique_ids=2)
self.assertEqual(actual.asDict(), expected.asDict())
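
If the same comparison is needed in several tests, it can be wrapped in a small helper. The following is only a sketch: the names rows_equal and demo are hypothetical and not part of PySpark, and the helper assumes both arguments expose asDict(recursive=...) the way pyspark.sql.Row does. Passing recursive=True also turns nested Rows into dictionaries, so nested fields are compared by name as well.

```python
def rows_equal(actual, expected):
    """Return True when two Row-like objects have the same field names and values.

    Hypothetical helper: it relies only on asDict(), so field order is ignored.
    """
    # recursive=True converts nested Rows into nested dicts before comparing.
    return actual.asDict(recursive=True) == expected.asDict(recursive=True)


def demo():
    """Illustrative usage; requires pyspark to be installed."""
    # Imported here so that rows_equal itself has no Spark dependency.
    from pyspark.sql import Row

    actual = Row(unique_ids=2, total=4)    # same fields, different argument order
    expected = Row(total=4, unique_ids=2)
    return rows_equal(actual, expected)
```

Inside a unittest.TestCase this could be used as self.assertTrue(rows_equal(actual, expected)). Calling self.assertEqual on the two dictionaries directly, as above, gives a more readable diff when the assertion fails.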
