How To Create New Column Based On First Column Taking Into Account Size Of Letter And List In Python Pandas?
I have a DataFrame in Python Pandas like below:
col1
--------
John Simon prd
agc Ann White
BeN and Ann
bad_list = ['Ben', 'Wayne']
And I need to make something like: create new colu
Solution 1:
You can use the str.title(), str.contains(), and astype() methods:
df['col2']=df['col1'].str.title().str.contains('|'.join(bad_list)).astype(int)
Output of df:
col1 col2
0 John Simon prd 0
1 agc Ann White 0
2 BeN and Ann 1
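For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the three rows shown in the question; the variable name pattern is introduced here only for readability and is not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({"col1": ["John Simon prd", "agc Ann White", "BeN and Ann"]})
bad_list = ["Ben", "Wayne"]

# Regex alternation built from the list: 'Ben|Wayne'
pattern = "|".join(bad_list)

# Title-case each row, test it against the pattern, then cast True/False to 1/0
df["col2"] = df["col1"].str.title().str.contains(pattern).astype(int)

print(df)
```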
Step by step breakdown of code:
Every entry in your bad_list is in title case (first letter capitalized, the rest lowercase), so we convert the whole Series ('col1') to the same format with Series.str.title().
The Series ('col1') now looks like:
0 John Simon Prd
1 Agc Ann White
2 Ben And Ann
Name: col1, dtype: object
Then we use str.contains(), which returns a boolean Series showing whether any element of bad_list is present in each row of the Series ('col1'):
0 False
1 False
2 True
Name: col1, dtype: bool
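Note: the pattern passed to the contains() method is built with:
'|'.join(bad_list)
# which evaluates to the string:
'Ben|Wayne'
By default str.contains() treats its argument as a regular expression, so the | means "or".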
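Because the joined string is interpreted as a regular expression, an entry containing a character such as . or + would change the meaning of the pattern. A possible safeguard, sketched below, is to pass each entry through re.escape() before joining. The entry 'A.B' and the sample Series are hypothetical and only serve to show the difference.

```python
import re
import pandas as pd

# 'A.B' is a hypothetical entry added to show a regex metacharacter
bad_list = ["Ben", "Wayne", "A.B"]
s = pd.Series(["Ben And Ann", "AxB Corp", "A.B Corp"])

# Unescaped: '.' matches any character, so 'AxB' also counts as a hit
unescaped = s.str.contains("|".join(bad_list))

# Escaped: every entry is matched literally
safe_pattern = "|".join(re.escape(name) for name in bad_list)
escaped = s.str.contains(safe_pattern)

print(unescaped.tolist())
print(escaped.tolist())
```

For the original bad_list of 'Ben' and 'Wayne' escaping changes nothing, so this only matters if the list may contain punctuation.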
Finally, we cast the boolean Series to int via the astype() method:
0 0
1 0
2 1
Name: col1, dtype: int32
Alternatively, you can use the IGNORECASE flag from the re module, as suggested by @seanbean in the comments:
from re import IGNORECASE
df['col2']=df['col1'].str.contains('|'.join(bad_list), flags=IGNORECASE).astype(int)
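A variation on the same idea, not part of the original answer: str.contains() also accepts case=False, which avoids the re import for the flag. Both approaches match substrings, so 'Ben' would also flag a row containing 'Benjamin'. If whole-word matching is wanted, one option is to wrap the alternation in \b word boundaries, as in this sketch. The row 'Benjamin Lee' is hypothetical and added only to show the difference.

```python
import re
import pandas as pd

# 'Benjamin Lee' is a hypothetical extra row, not from the question
df = pd.DataFrame(
    {"col1": ["John Simon prd", "agc Ann White", "BeN and Ann", "Benjamin Lee"]}
)
bad_list = ["Ben", "Wayne"]

# Substring match, ignoring case: 'Benjamin' is flagged too
df["substring"] = df["col1"].str.contains("|".join(bad_list), case=False).astype(int)

# Whole-word match: a non-capturing group wrapped in word boundaries
word_pattern = r"\b(?:" + "|".join(re.escape(w) for w in bad_list) + r")\b"
df["whole_word"] = df["col1"].str.contains(word_pattern, case=False).astype(int)

print(df)
```

The non-capturing group (?:...) is used so that pandas does not warn about match groups in the pattern.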