# Embedding

A simple blog on embedding can be found here and Google’s tutorial is here.

An embedding is a mapping of a discrete (categorical) variable to a vector of continuous numbers. Embeddings are useful because they can *reduce the dimensionality* of categorical variables and *meaningfully represent* categories in the transformed space.

Embeddings have three primary purposes:

- Finding nearest neighbors in the embedding space. These can be used to make recommendations based on user interests or to cluster categories.
- As input to a machine learning model for a supervised task.
- For visualization of concepts and relations between categories.
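
As an illustration of the first purpose, here is a minimal sketch of a nearest-neighbor lookup using cosine similarity. The genre names and vector values are made up for the example (real embeddings would be learned), and `nearest_neighbors` is a hypothetical helper, not part of any library.

```python
import math

# Hypothetical 3-dimensional embeddings for a few book genres.
# The numbers are invented for illustration; real embeddings are learned.
embeddings = {
    "fantasy":   [0.9, 0.1, 0.0],
    "sci-fi":    [0.8, 0.2, 0.1],
    "history":   [0.1, 0.9, 0.2],
    "biography": [0.0, 0.8, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, k=1):
    """Return the k categories most similar to `query`, excluding itself."""
    scores = [
        (other, cosine_similarity(embeddings[query], vec))
        for other, vec in embeddings.items()
        if other != query
    ]
    # Highest similarity first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:k]]


print(nearest_neighbors("fantasy"))
print(nearest_neighbors("history"))
```

A recommender built this way would suggest items whose embeddings lie closest to those the user already likes. For large vocabularies, an approximate nearest-neighbor index would usually replace the exhaustive scan used here.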

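One-hot encoding can be considered a naive embedding method, and it has two obvious drawbacks. *Dimensionality*: each vector has one dimension per category, so the representation grows with the number of categories and is almost entirely zeros. *Uninformed*: every pair of distinct categories is equally far apart, so the vectors carry no notion of similarity.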
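
Both drawbacks of one-hot encoding can be checked directly. The sketch below uses a made-up list of four categories: the vector length equals the number of categories, and every pair of distinct categories ends up at the same Euclidean distance, √2.

```python
import math

# Hypothetical category list, chosen only for illustration.
categories = ["fantasy", "sci-fi", "history", "biography"]


def one_hot(category):
    """One-hot vector: 1.0 at the category's position, 0.0 elsewhere."""
    return [1.0 if c == category else 0.0 for c in categories]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Drawback 1: the vector has as many dimensions as there are categories.
print(len(one_hot("fantasy")))

# Drawback 2: all distinct pairs are equally far apart, so the encoding
# says nothing about which categories are similar.
distances = {
    (a, b): euclidean(one_hot(a), one_hot(b))
    for i, a in enumerate(categories)
    for b in categories[i + 1:]
}
for pair, distance in distances.items():
    print(pair, round(distance, 4))
```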

## Learning Embedding

Embeddings can be learned with a supervised neural network: we define a loss function that pulls the representations of similar entities close together in the embedding space.
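
To make the idea concrete, here is a toy sketch in plain Python rather than a real neural network framework. It assumes a hypothetical dataset of labeled pairs (1 for similar, 0 for dissimilar), scores each pair by the sigmoid of the dot product of the two embeddings, and minimizes a logistic loss with stochastic gradient descent. The entities, labels, learning rate, and epoch count are all illustrative choices.

```python
import math
import random

random.seed(0)

NUM_ENTITIES = 4
DIM = 2
LEARNING_RATE = 0.1
EPOCHS = 500

# Hypothetical supervision: entities 0 and 1 are similar, 2 and 3 are
# similar, and every cross-group pair is dissimilar.
pairs = [
    (0, 1, 1.0), (2, 3, 1.0),
    (0, 2, 0.0), (0, 3, 0.0), (1, 2, 0.0), (1, 3, 0.0),
]

# Embedding table: one small random vector per entity.
table = [[random.uniform(-0.5, 0.5) for _ in range(DIM)]
         for _ in range(NUM_ENTITIES)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(a, b):
    """Predicted probability that entities a and b are similar."""
    return sigmoid(sum(x * y for x, y in zip(table[a], table[b])))


for _ in range(EPOCHS):
    random.shuffle(pairs)
    for a, b, label in pairs:
        # Gradient of the logistic loss with respect to the dot product.
        error = predict(a, b) - label
        grad_a = [error * y for y in table[b]]
        grad_b = [error * x for x in table[a]]
        table[a] = [x - LEARNING_RATE * g for x, g in zip(table[a], grad_a)]
        table[b] = [y - LEARNING_RATE * g for y, g in zip(table[b], grad_b)]

print("similar pair (0, 1):   ", round(predict(0, 1), 3))
print("dissimilar pair (0, 2):", round(predict(0, 2), 3))
```

After training, pairs labeled similar should score above 0.5 and dissimilar pairs below it. In practice the embedding table would be a layer inside a larger network and trained with the loss of the downstream supervised task.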

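t-Distributed Stochastic Neighbor Embedding (t-SNE) is an embedding method used mainly for visualization: it projects high-dimensional data into two or three dimensions while trying to keep neighboring points close together.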

One hyperparameter is the dimensionality of the embedding. An empirical rule of thumb is

\[\text{dimensions} \approx \sqrt[4]{\text{possible values}}\]
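
The rule of thumb is easy to evaluate for a given vocabulary size. The helper below is a hypothetical convenience function; rounding to the nearest integer and enforcing a minimum of one dimension are assumptions added here, not part of the rule itself.

```python
def rule_of_thumb_dimensions(num_possible_values):
    """Fourth root of the number of categories, rounded, at least 1."""
    return max(1, round(num_possible_values ** 0.25))


for n in (16, 10_000, 1_000_000):
    print(n, "categories ->", rule_of_thumb_dimensions(n), "dimensions")
```

Treat the result as a starting point and tune it like any other hyperparameter.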