Embedding in PyTorch

#python #pytorch #embedding #embeddinglayer

*Memos:

My post explains Embedding Layer.
My post explains manual_seed().
My post explains requires_grad.

Embedding() looks up rows of a 2D weight tensor by index. From the 0D or more D tensor of one or more elements(indices), it can get the 1D or more D tensor of zero or more elements, using either randomly generated weights or the 2D tensor of zero or more elements(weights) which you pass, as shown below:

*Memos:

The 1st argument for initialization is num_embeddings(Required-Type:int). *It must be 1 <= x.
The 2nd argument for initialization is embedding_dim(Required-Type:int). *It must be 0 <= x.
The 3rd argument for initialization is padding_idx(Optional-Default:None-Type:int).
The 4th argument for initialization is max_norm(Optional-Default:None-Type:float).
The 5th argument for initialization is norm_type(Optional-Default:2.0-Type:float). *It must be 0 <= x when returning an empty tensor.
The 6th argument for initialization is scale_grad_by_freq(Optional-Default:False-Type:bool).
The 7th argument for initialization is sparse(Optional-Default:False-Type:bool).
The 8th argument for initialization is _weight(Optional-Default:None-Type:tensor of float): *Memos:
- If None, weight is randomly generated.
- It must be the 2D tensor of zero or more elements.
- Its size must be same as num_embeddings and embedding_dim.
The 9th argument for initialization is _freeze(Optional-Default:False-Type:bool). *If it's False, weight's requires_grad is True while if it's True, weight's requires_grad is False.
The 10th argument for initialization is device(Optional-Default:None-Type:str, int or device()): *Memos:
- If it's None, get_default_device() is used. *My post explains get_default_device() and set_default_device().
- device= can be omitted.
- My post explains device argument.
The 11th argument for initialization is dtype(Optional-Default:None-Type:dtype): *Memos:
- If it's None, get_default_dtype() is used. *My post explains get_default_dtype() and set_default_dtype().
- dtype= can be omitted.
- My post explains dtype argument.
The 1st argument is input(Required-Type:tensor of int): *Memos:
- It's indices.
- Indices must be less than num_embeddings.
- It must be the 0D or more D tensor of one or more elements.
- Its device must be same as Embedding()'s.
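embedding.device and embedding.dtype don't work because Embedding() has no such attributes. *embedding.weight.device and embedding.weight.dtype can be used instead.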
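
The lookup described above can be sketched as plain row indexing into weight. This is a minimal illustration rather than the library's internal implementation, and the seed and the names indices, output, scalar_output, same_as_indexing and output_shape are arbitrary choices made for this example:

```python
import torch
from torch import nn

torch.manual_seed(0)  # Arbitrary seed, used only to make the run repeatable.

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3)
indices = torch.tensor([[6, 0],
                        [2, 5]])

output = embedding(indices)

# Each index selects one row of the weight, so indexing the weight
# directly with the same indices gives the same values.
same_as_indexing = torch.equal(output, embedding.weight[indices])

# The output shape is the input shape followed by embedding_dim.
output_shape = tuple(output.shape)

# A 0D tensor of one index gives the 1D tensor of embedding_dim elements.
scalar_output = embedding(torch.tensor(6))
```

Here same_as_indexing should be True, and output_shape should be the shape of indices with embedding_dim appended.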

import torch
from torch import nn

tensor1 = torch.tensor([6, 0, 2, 5]) # Indices

tensor1.requires_grad
# False

torch.manual_seed(42)

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3)
tensor2 = embedding(input=tensor1)
tensor2
# tensor([[0.8034, -0.6216, -0.5920],
#         [1.9269, 1.4873, 0.9007],
#         [0.8008, 1.6806, 0.3559],
#         [0.8599, -0.3097, -0.3957]], grad_fn=<EmbeddingBackward0>)

tensor2.requires_grad
# True

embedding
# Embedding(7, 3)

embedding.num_embeddings
# 7

embedding.embedding_dim
# 3

embedding.padding_idx
# None

embedding.max_norm
# None

embedding.norm_type
# 2.0

embedding.scale_grad_by_freq
# False

embedding.sparse
# False

embedding.weight
# Parameter containing:
# tensor([[1.9269, 1.4873, 0.9007],
#         [-2.1055, 0.6784, 1.0783],
#         [0.8008, 1.6806, 0.3559],
#         [-0.6866, -0.4934, 0.2415],
#         [-1.1109, 0.0418, -0.2516],
#         [0.8599, -0.3097, -0.3957],
#         [0.8034, -0.6216, -0.5920]], requires_grad=True)

torch.manual_seed(42)

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3,
                         padding_idx=None, max_norm=None, norm_type=2.0,
                         scale_grad_by_freq=False, sparse=False,
                         _weight=None, _freeze=False,
                         device=None, dtype=None)
embedding(input=tensor1)
# tensor([[0.8034, -0.6216, -0.5920],
#         [1.9269, 1.4873, 0.9007],
#         [0.8008, 1.6806, 0.3559],
#         [0.8599, -0.3097, -0.3957]], grad_fn=<EmbeddingBackward0>)

weight = torch.tensor([[4., 9., 1.],
                       [-2., 0., 3.],
                       [0., 5., 7.],
                       [8., -6., 0.],
                       [1., 3., -9.],
                       [-3., 1., 2.],
                       [-5., 7., -4.]])
embedding = nn.Embedding(num_embeddings=7, embedding_dim=3,
                         _weight=weight)
embedding(input=tensor1)
# tensor([[-5., 7., -4.],
#         [4., 9., 1.],
#         [0., 5., 7.],
#         [-3., 1., 2.]], grad_fn=<EmbeddingBackward0>)

my_tensor = torch.tensor([[6, 0], # Indices
                          [2, 5]])
torch.manual_seed(42)

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3)
embedding(input=my_tensor)
# tensor([[[0.8034, -0.6216, -0.5920],
#          [1.9269, 1.4873, 0.9007]],
#         [[0.8008, 1.6806, 0.3559],
#          [0.8599, -0.3097, -0.3957]]], grad_fn=<EmbeddingBackward0>)

my_tensor = torch.tensor([[[6], [0]], # Indices
                          [[2], [5]]])
torch.manual_seed(42)

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3)
embedding(input=my_tensor)
# tensor([[[[0.8034, -0.6216, -0.5920]],
#          [[1.9269, 1.4873, 0.9007]]],
#         [[[0.8008, 1.6806, 0.3559]],
#          [[0.8599, -0.3097, -0.3957]]]], grad_fn=<EmbeddingBackward0>)
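
The examples above leave padding_idx at None. As a minimal sketch of what setting it does, assuming padding_idx=0 and an arbitrary seed (the names indices, padding_row_is_zero, padding_grad and other_grad are chosen only for this example), the row at padding_idx starts as zeros and gets no gradient:

```python
import torch
from torch import nn

torch.manual_seed(0)  # Arbitrary seed, used only to make the run repeatable.

embedding = nn.Embedding(num_embeddings=7, embedding_dim=3, padding_idx=0)
indices = torch.tensor([6, 0, 2, 0])  # Index 0 is treated as padding.

# The row at padding_idx is initialized to zeros.
padding_row_is_zero = torch.equal(embedding.weight[0].detach(),
                                  torch.zeros(3))

output = embedding(indices)
output.sum().backward()

# The padding row gets a zero gradient, so it isn't updated in training.
padding_grad = embedding.weight.grad[0]

# A row looked up once gets a gradient of ones from sum().
other_grad = embedding.weight.grad[6]
```

This is why padding_idx is commonly used for the padding token of variable-length sequences: its vector stays fixed while the other rows are trained.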

Embedding.from_pretrained() creates Embedding from the 2D tensor of zero or more elements(weights) which you pass. From the 0D or more D tensor of one or more elements(indices), it can get the 1D or more D tensor of zero or more elements as shown below:

*Memos:

The 1st argument for initialization is embeddings(Required-Type:tensor of int, float, complex or bool): *Memos:
- It's weight.
- It must be the 2D tensor of zero or more elements.
The 2nd argument for initialization is freeze(Optional-Default:True-Type:bool). *If it's False, weight's requires_grad is True while if it's True, weight's requires_grad is False.
The 3rd argument for initialization is padding_idx(Optional-Default:None-Type:int).
The 4th argument for initialization is max_norm(Optional-Default:None-Type:float). *It must be None if embeddings is an empty tensor and norm_type is negative.
The 5th argument for initialization is norm_type(Optional-Default:2.0-Type:float).
The 6th argument for initialization is scale_grad_by_freq(Optional-Default:False-Type:bool).
The 7th argument for initialization is sparse(Optional-Default:False-Type:bool).
The 1st argument is input(Required-Type:tensor of int): *Memos:
- It's indices.
- Indices must be less than the size of the 1st dimension of embeddings.
- It must be the 0D or more D tensor of one or more elements.
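
The effect of freeze can be sketched by building the same weights twice. This is a small illustration, assuming a made-up 3x3 tensor of weights; the names pretrained, frozen and trainable are chosen only for this example:

```python
import torch
from torch import nn

# Made-up pretrained weights for illustration.
pretrained = torch.tensor([[4., 9., 1.],
                           [-2., 0., 3.],
                           [0., 5., 7.]])
indices = torch.tensor([2, 0])

# freeze=True is the default, so the weight isn't trained.
frozen = nn.Embedding.from_pretrained(embeddings=pretrained)

# freeze=False keeps the pretrained values as a trainable starting point.
trainable = nn.Embedding.from_pretrained(embeddings=pretrained, freeze=False)

frozen_weight_grad = frozen.weight.requires_grad
trainable_weight_grad = trainable.weight.requires_grad
frozen_output_grad = frozen(indices).requires_grad
trainable_output_grad = trainable(indices).requires_grad
```

With freeze=False, the output carries a grad_fn just like the output of Embedding() itself, so the pretrained weights can be fine-tuned.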

import torch
from torch import nn

weight = torch.tensor([[4., 9., 1.],
                       [-2., 0., 3.],
                       [0., 5., 7.],
                       [8., -6., 0.],
                       [1., 3., -9.],
                       [-3., 1., 2.],
                       [-5., 7., -4.]])

tensor1 = torch.tensor([6, 0, 2, 5]) # Indices

tensor1.requires_grad
# False

embedding = nn.Embedding.from_pretrained(embeddings=weight)
tensor2 = embedding(input=tensor1)
tensor2
# tensor([[-5., 7., -4.],
#         [4., 9., 1.],
#         [0., 5., 7.],
#         [-3., 1., 2.]])

tensor2.requires_grad
# False

embedding
# Embedding(7, 3)

embedding.num_embeddings
# 7

embedding.embedding_dim
# 3

embedding.padding_idx
# None

embedding.max_norm
# None

embedding.norm_type
# 2.0

embedding.scale_grad_by_freq
# False

embedding.sparse
# False

embedding.weight
# Parameter containing:
# tensor([[4., 9., 1.],
#         [-2., 0., 3.],
#         [0., 5., 7.],
#         [8., -6., 0.],
#         [1., 3., -9.],
#         [-3., 1., 2.],
#         [-5., 7., -4.]])

embedding = nn.Embedding.from_pretrained(embeddings=weight, freeze=True,
                                         padding_idx=None, max_norm=None,
                                         norm_type=2.0, scale_grad_by_freq=False,
                                         sparse=False)
embedding(input=tensor1)
# tensor([[-5., 7., -4.],
#         [4., 9., 1.],
#         [0., 5., 7.],
#         [-3., 1., 2.]])

my_tensor = torch.tensor([[6, 0], # Indices
                          [2, 5]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[-5., 7., -4.],
#          [4., 9., 1.]],
#         [[0., 5., 7.],
#          [-3., 1., 2.]]])

my_tensor = torch.tensor([[[6], [0]], # Indices
                          [[2], [5]]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[[-5., 7., -4.]],
#          [[4., 9., 1.]]],
#         [[[0., 5., 7.]],
#          [[-3., 1., 2.]]]])

weight = torch.tensor([[4, 9, 1],
                       [-2, 0, 3],
                       [0, 5, 7],
                       [8, -6, 0],
                       [1, 3, -9],
                       [-3, 1, 2],
                       [-5, 7, -4]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[[-5, 7, -4]],
#          [[4, 9, 1]]],
#         [[[0, 5, 7]],
#          [[-3, 1, 2]]]])

weight = torch.tensor([[4.+0.j, 9.+0.j, 1.+0.j],
                       [-2.+0.j, 0.+0.j, 3.+0.j],
                       [0.+0.j, 5.+0.j, 7.+0.j],
                       [8.+0.j, -6.+0.j, 0.+0.j],
                       [1.+0.j, 3.+0.j, -9.+0.j],
                       [-3.+0.j, 1.+0.j, 2.+0.j],
                       [-5.+0.j, 7.+0.j, -4.+0.j]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[[-5.+0.j, 7.+0.j, -4.+0.j]],
#          [[4.+0.j, 9.+0.j, 1.+0.j]]],
#         [[[0.+0.j, 5.+0.j, 7.+0.j]],
#          [[-3.+0.j, 1.+0.j, 2.+0.j]]]])

weight = torch.tensor([[True, False, True],
                       [False, True, False],
                       [True, False, True],
                       [False, True, False],
                       [True, False, True],
                       [False, True, False],
                       [True, False, True]])
embedding = nn.Embedding.from_pretrained(embeddings=weight)
embedding(input=my_tensor)
# tensor([[[[True, False, True]],
#          [[True, False, True]]],
#         [[[True, False, True]],
#          [[False, True, False]]]])
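
The examples above keep max_norm at None. As a rough sketch of what max_norm does, assuming made-up 2D weights whose L2 norms are easy to check by hand (the names pretrained, output and norms are chosen only for this example), rows which are looked up and whose norm is more than max_norm are rescaled while the other rows are unchanged:

```python
import torch
from torch import nn

# Made-up weights for illustration.
pretrained = torch.tensor([[3., 4.],    # L2 norm is 5.0
                           [0.3, 0.4]]) # L2 norm is 0.5

# clone() is used because the renormalization modifies the weight in-place.
embedding = nn.Embedding.from_pretrained(embeddings=pretrained.clone(),
                                         max_norm=1.0)

output = embedding(torch.tensor([0, 1]))

# The L2 norm of each row which was looked up.
norms = output.norm(dim=-1)
```

Here norms should be approximately 1.0 for the first row and 0.5 for the second one. Because the rescaling happens in-place on embedding.weight during the lookup, passing a clone keeps the original pretrained tensor intact.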
