Attention-based Models
A beginner’s guide to attention-based models can be found here.
For learning on sequential inputs, attention-based methods allow us to handle inputs of variable length.
In short, the idea is that when generating the context vector \(c_i\), the decoder RNN pays special attention to some, but not all, of the encoder hidden states \(h_j\).
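Concretely, one common formulation (following Bahdanau et al., 2015) scores each hidden state against the previous decoder state, normalizes the scores with a softmax, and takes the weighted sum:

\[
e_{ij} = a(s_{i-1}, h_j), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k} \exp(e_{ik})}, \qquad
c_i = \sum_{j} \alpha_{ij}\, h_j
\]

where \(s_{i-1}\) is the previous decoder state and \(a\) is a learned scoring function.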
There are many variants of the attention mechanism. For example, different probabilities (attention weights) can be assigned to the hidden states \(\textbf{h}\).
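As a minimal sketch of this weighted-sum idea in NumPy (using dot-product scoring as one illustrative choice; the function and variable names here are hypothetical, not from any particular library):

```python
import numpy as np

def attention_context(decoder_state, encoder_states):
    """Compute a context vector as an attention-weighted sum of encoder
    hidden states, using dot-product scoring (one choice among many)."""
    # Alignment scores: one scalar per encoder hidden state h_j.
    scores = encoder_states @ decoder_state   # shape: (T,)
    # Softmax turns scores into probabilities alpha_j that sum to 1.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # shape: (T,)
    # Context c_i: weighted sum of the hidden states.
    return weights @ encoder_states           # shape: (d,)

# Example: 5 encoder hidden states of dimension 8, one decoder state.
h = np.random.randn(5, 8)
s = np.random.randn(8)
c = attention_context(s, h)
print(c.shape)  # (8,)
```

Other variants differ mainly in how the scores are computed, e.g. with a small feed-forward network (additive attention) instead of a dot product.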