
Keras Attention Layer Over LSTM

I'm using Keras 1.0.1 and I'm trying to add an attention layer on top of an LSTM. This is what I have so far, but it doesn't work: input_ = Input(shape=(input_length, input_dim)) lstm

Solution 1:

The first piece of code you shared is incorrect. The second piece looks correct except for one thing: do not use TimeDistributed, as the weights will be the same across timesteps. Use a regular Dense layer with a non-linear activation.

from keras.layers import Input, GRU, Dense, Flatten, Activation, RepeatVector, Permute, merge

input_ = Input(shape=(input_length, input_dim))
# Recurrent encoder that returns the full sequence of hidden states
lstm = GRU(self.HID_DIM, input_dim=input_dim, input_length=input_length, return_sequences=True)(input_)
# One attention score per timestep
att = Dense(1, activation='tanh')(lstm)
att = Flatten()(att)
# Normalize the scores into attention weights over the timesteps
att = Activation('softmax')(att)
# Broadcast the weights back to the hidden dimension and restore (time, features) order
att = RepeatVector(self.HID_DIM)(att)
att = Permute((2, 1))(att)
# Element-wise product: attention-weighted hidden states
mer = merge([att, lstm], mode='mul')

Now you have the weight-adjusted states. How you use them is up to you. Most versions of attention I have seen simply sum these over the time axis and then use the result as the context vector.
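As a minimal sketch of that last step (assuming the mer and input_ tensors from the snippet above, the Keras 1.x API, and a hypothetical num_classes for the output layer), the summation over the time axis can be done with a Lambda layer:

from keras import backend as K
from keras.layers import Dense, Lambda
from keras.models import Model

# Sum the attention-weighted states over the time axis to get a single
# context vector of shape (batch, HID_DIM). output_shape is given explicitly
# because Lambda cannot always infer it in Keras 1.x.
context = Lambda(lambda x: K.sum(x, axis=1),
                 output_shape=lambda s: (s[0], s[2]))(mer)

# num_classes is a placeholder for whatever downstream head you need.
out = Dense(num_classes, activation='softmax')(context)
model = Model(input=input_, output=out)
model.compile(optimizer='adam', loss='categorical_crossentropy')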
