Multi-Label Image Classification
Solution 1:
If your goal is to predict, for each image, whether 'L', 'M', and 'H' each apply (several can be true at once), you are using an incorrect loss function: you should use binary_crossentropy. The shape of your targets will be batch × 3 in this case.
categorical_crossentropy assumes the output is a categorical distribution: a vector of values that sum up to one. In other words, there are multiple possibilities, but only one of them can be the correct one. binary_crossentropy assumes that every number in the output vector is a (conditionally) independent binary distribution, so each number is between 0 and 1, but they do not necessarily sum up to one, because it can very well happen that all of them are true.
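The difference is easy to see numerically. This is a plain-NumPy sketch (the logit values are illustrative, not from the model above): softmax couples the outputs into one distribution, while sigmoid treats each output independently.

```python
import numpy as np

# Illustrative raw logits for one example with 3 outputs
logits = np.array([2.0, -1.0, 0.5])

# softmax: a single categorical distribution -- the values sum to 1
softmax = np.exp(logits) / np.exp(logits).sum()

# sigmoid: three independent binary probabilities -- each is in (0, 1),
# but the vector need not sum to 1
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print(softmax.sum())  # 1.0
print(sigmoid)        # all three can be close to 1 at the same time
```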
If your goal is to predict for each label1, ..., label6 the value, then you should model a categorical distribution for each of the labels. You have six labels, each of them has 3 values, you thus need 18 numbers (logits). The shape of your targets will be batch × 6 × 3 in this case.
model.add(Dense(18, activation=None))  # raw logits, no activation yet
Because you don't want a single distribution over 18 values, but over 6 × 3 values, you need to reshape the logits first:
model.add(Reshape((6, 3)))
model.add(Softmax())
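As a sanity check of what Reshape followed by Softmax produces, here is the same computation in plain NumPy (random logits, illustrative only): 18 raw values become 6 separate 3-way distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

# 18 raw logits for one example, viewed as 6 labels x 3 values
logits = rng.standard_normal(18).reshape(6, 3)

# softmax over the last axis: one independent distribution per label
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

print(probs.sum(axis=-1))  # six values, each equal to 1.0
```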
Solution 2:
Based on the above discussion, here is a solution to the problem. As mentioned, we have a total of 5 labels, and each label has three possible tags (L, M, H). We can perform the encoding this way:
# create a one-hot encoding for one list of tags
# (the mapping argument is unused here; each tag is matched explicitly)
def custom_encode(tags, mapping):
    # build the encoded vector tag by tag
    encoding = []
    for tag in tags:
        if tag == 'L':
            encoding.append([1, 0, 0])
        elif tag == 'M':
            encoding.append([0, 1, 0])
        else:
            encoding.append([0, 0, 1])
    return encoding
So the encoded y-vector will look like:
**Labels Tags Encoded Tags**
Label1 ----> [L,L,L,M,H] ---> [ [1,0,0], [1,0,0], [1,0,0], [0,1,0], [0,0,1] ]
Label2 ----> [L,H,L,M,H] ---> [ [1,0,0], [0,0,1], [1,0,0], [0,1,0], [0,0,1] ]
Label3 ----> [L,M,L,M,H] ---> [ [1,0,0], [0,1,0], [1,0,0], [0,1,0], [0,0,1] ]
Label4 ----> [M,M,L,M,H] ---> [ [0,1,0], [0,1,0], [1,0,0], [0,1,0], [0,0,1] ]
Label5 ----> [M,L,L,M,H] ---> [ [0,1,0], [1,0,0], [1,0,0], [0,1,0], [0,0,1] ]
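A quick self-contained check that the helper reproduces the Label1 row above (the function is repeated here so the snippet runs on its own; the unused mapping argument is given a default so the call needs no second argument):

```python
# custom_encode repeated from above so this snippet is self-contained
def custom_encode(tags, mapping=None):
    encoding = []
    for tag in tags:
        if tag == 'L':
            encoding.append([1, 0, 0])
        elif tag == 'M':
            encoding.append([0, 1, 0])
        else:
            encoding.append([0, 0, 1])
    return encoding

print(custom_encode(['L', 'L', 'L', 'M', 'H']))
# [[1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
```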
The final layers will look like this:
model.add(Dense(15))              # 5 labels x 3 tags = 15 neurons in the final Dense layer
model.add(Reshape((5, 3)))        # reshape so each of the 5 labels gets its own group of 3 tags
model.add(Activation('softmax'))  # softmax is applied over the last axis (the 3 tags per label)
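For intuition about how the loss then behaves on this reshaped output, here is a hand-computed per-label categorical cross-entropy in NumPy (illustrative targets, and uniform predictions such as a freshly initialized softmax head would give), matching how categorical_crossentropy reduces over the last axis:

```python
import numpy as np

# one-hot targets for one example: 5 labels x 3 tags (illustrative values)
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]], dtype=float)

# uniform predictions: every tag gets probability 1/3
y_pred = np.full((5, 3), 1.0 / 3.0)

# categorical cross-entropy per label (over the last axis), then averaged
per_label = -(y_true * np.log(y_pred)).sum(axis=-1)
print(per_label.mean())  # ln(3) ~ 1.0986
```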