
How To Create New Column Based On First Column Taking Into Account Size Of Letter And List In Python Pandas?

I have a DataFrame in Python Pandas like below:

col1
--------
John Simon prd
agc Ann White
BeN and Ann

bad_list = ['Ben', 'Wayne']

And I need to make something like: create new colu

Solution 1:

You can do this with the str.title(), str.contains() and astype() methods:

df['col2']=df['col1'].str.title().str.contains('|'.join(bad_list)).astype(int)

Output of df:

    col1            col2
0   John Simon prd  0
1   agc Ann White   0
2   BeN and Ann     1  
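For reference, here is a minimal self-contained sketch of the one-liner above. The DataFrame construction is an assumption, reconstructed from the sample rows shown in the question; the variable name pattern is only for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about how df was built)
df = pd.DataFrame({"col1": ["John Simon prd", "agc Ann White", "BeN and Ann"]})
bad_list = ["Ben", "Wayne"]

# Join the names into one regex alternation: 'Ben|Wayne'
pattern = "|".join(bad_list)

# Title-case each row, test it against the pattern, and cast True/False to 1/0
df["col2"] = df["col1"].str.title().str.contains(pattern).astype(int)

print(df)  # col2 is 0, 0, 1 for the three sample rows
```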

Step-by-step breakdown of the code:

The entries in bad_list are in title case (first letter uppercase, the rest lowercase), so we first convert the whole Series (df['col1']) to the same format with Series.str.title(). The Series then looks like:

0    John Simon Prd
1     Agc Ann White
2       Ben And Ann
Name: col1, dtype: object

Then we use str.contains(), which returns a boolean Series indicating whether any element of bad_list is present in each row of the Series:

0    False
1    False
2     True
Name: col1, dtype: bool

Note:

The expression passed to the contains() method joins the list into a single regex alternation:

'|'.join(bad_list)
# gives the string:
'Ben|Wayne'
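Because str.contains() treats this string as a regular expression by default, the plain join matches substrings (so 'Ben' would also match 'Benjamin') and would misbehave if a name contained regex metacharacters. If that matters for your data, one possible variation is to escape each name and add word boundaries. This is a sketch and not part of the original answer; the sample Series s is hypothetical.

```python
import re
import pandas as pd

bad_list = ["Ben", "Wayne"]

# Plain join, as in the answer: matches the names anywhere, even inside longer words
plain_pattern = "|".join(bad_list)

# Variation: escape each name and require word boundaries on both sides
strict_pattern = r"\b(?:" + "|".join(re.escape(name) for name in bad_list) + r")\b"

# Hypothetical, already title-cased sample rows
s = pd.Series(["Ben And Ann", "Benjamin Lee"])

plain_match = s.str.contains(plain_pattern)    # True, True
strict_match = s.str.contains(strict_pattern)  # True, False
```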

Finally, we cast the boolean Series to int via the astype() method:

0    0
1    0
2    1
Name: col1, dtype: int32

OR

Another way is to use the IGNORECASE flag from the re module, as suggested by @seanbean in the comments:

from re import IGNORECASE

df['col2']=df['col1'].str.contains('|'.join(bad_list), flags=IGNORECASE).astype(int)
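A self-contained sketch of this case-insensitive variant follows, again assuming the DataFrame is built from the question's sample rows. Unlike the str.title() approach, it does not depend on the entries of bad_list being in title case. The case=False argument of str.contains() should give the same result; the name via_case is only illustrative.

```python
import re
import pandas as pd

# Sample data reconstructed from the question (an assumption about how df was built)
df = pd.DataFrame({"col1": ["John Simon prd", "agc Ann White", "BeN and Ann"]})
bad_list = ["Ben", "Wayne"]
pattern = "|".join(bad_list)

# Case-insensitive match via the regex flag, no title-casing needed
df["col2"] = df["col1"].str.contains(pattern, flags=re.IGNORECASE).astype(int)

# Equivalent check using the case parameter of str.contains()
via_case = df["col1"].str.contains(pattern, case=False).astype(int)
```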
