Joining DataFrames With the Same Column Name in PySpark
I have two DataFrames that were read from two CSV files.
+---+----------+-----------------+
| ID| NUMBER | RECHARGE_AMOUNT|
+---+----------+-----------------+
| 1|9090909
Solution 1:
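You can select the columns from each DataFrame after the join and give each one a distinct alias. Because the join is on the column name NUMBER, that column appears only once in the result, so only ID and RECHARGE_AMOUNT need aliases: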
dfFinal = dfFinal.join(df2, on=['NUMBER'], how='inner') \
    .select('NUMBER',
            dfFinal.ID.alias('ID_1'),
            dfFinal.RECHARGE_AMOUNT.alias('RECHARGE_AMOUNT_1'),
            df2.ID.alias('ID_2'),
            df2.RECHARGE_AMOUNT.alias('RECHARGE_AMOUNT_2'))
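If the DataFrames have many overlapping columns, writing one alias per column gets tedious. A possible alternative, sketched here under the assumption that both DataFrames share the join key NUMBER, is to rename every non-key column before joining. The helper names suffixed_names and join_with_suffixes and the default suffixes are hypothetical, not part of PySpark; the only DataFrame members used are columns, toDF and join.

```python
def suffixed_names(columns, key, suffix):
    """Return the column names with `suffix` appended to all except `key`."""
    return [c if c == key else f"{c}{suffix}" for c in columns]


def join_with_suffixes(left, right, key="NUMBER", suffixes=("_1", "_2"), how="inner"):
    """Rename the non-key columns on both sides, then join on the shared key.

    `left` and `right` are expected to be PySpark DataFrames (hypothetical
    helper; the key name and suffixes are illustrative defaults).
    """
    # DataFrame.toDF(*names) returns a new DataFrame whose columns are
    # renamed positionally, so the list must match the column order.
    left_renamed = left.toDF(*suffixed_names(left.columns, key, suffixes[0]))
    right_renamed = right.toDF(*suffixed_names(right.columns, key, suffixes[1]))
    # Joining on a column name (not an expression) keeps a single key column.
    return left_renamed.join(right_renamed, on=key, how=how)
```

With the DataFrames from the question this would be called as dfFinal = join_with_suffixes(dfFinal, df2). Renaming before the join also means no ambiguous column references remain afterwards.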