# Attention-Based Models

When learning on sequential inputs, attention-based methods let a model handle inputs of variable length without compressing the whole sequence into a single fixed-length vector.

In short, the idea is that when generating the context $$c_i$$, the decoder RNN pays particular attention to some, but not all, of the hidden states $$h_j$$.
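
One common way to write this idea down, given here as a sketch rather than the only formulation, is as a weighted sum over the hidden states. The weights $$\alpha_{ij}$$ and the scores $$e_{ij}$$ are notation introduced for illustration; the softmax normalization is an assumption:

$$
c_i = \sum_j \alpha_{ij} h_j, \qquad \alpha_{ij} = \frac{\exp(e_{ij})}{\sum_k \exp(e_{ik})}
$$

Here $$e_{ij}$$ scores how relevant $$h_j$$ is to decoding step $$i$$. Hidden states with a larger $$\alpha_{ij}$$ contribute more to $$c_i$$, and those with a weight near zero are effectively ignored.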

Attention mechanisms come in several variants. For example, different probabilities (attention weights) can be assigned to the hidden states $$\textbf{h}$$.
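
As a minimal sketch of assigning probabilities to the hidden states, the snippet below scores each hidden state against a query vector, normalizes the scores with a softmax, and takes the weighted sum as the context. The dot-product scoring, the `query` vector, and the function names `softmax` and `attention_context` are assumptions made for illustration; other variants use a different scoring function.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(query, hidden_states):
    """Return (weights, context) for a single decoder step.

    Assumption: relevance is scored with a plain dot product
    between the query and each hidden state h_j.
    """
    scores = [sum(q * h for q, h in zip(query, h_j)) for h_j in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context is the probability-weighted sum of the hidden states.
    context = [
        sum(w * h_j[d] for w, h_j in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three hidden states of dimension 2 (made-up values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
weights, context = attention_context(query, hidden_states)
```

In this toy example the first and third hidden states align with the query, so they receive equal weights that are larger than the weight of the second state, and the context leans toward them.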