*Memos:
- My post explains GRU layer.
- My post explains RNN().
- My post explains LSTM().
- My post explains Transformer().
- My post explains manual_seed().
- My post explains requires_grad.
GRU() can get the two 2D or 3D tensors of one or more elements computed by GRU from a 2D or 3D tensor of zero or more elements, as shown below:
*Memos:
- The 1st argument for initialization is input_size (Required-Type: int). *It must be 0 <= x.
- The 2nd argument for initialization is hidden_size (Required-Type: int). *It must be 1 <= x.
- The 3rd argument for initialization is num_layers (Optional-Default: 1-Type: int):
  *Memos:
  - It must be 1 <= x.
  - It must be 1 < x, if dropout is 0 < x.
  - Its number is the same as the number of bias_ih_lx, bias_ih_lx_reverse, bias_hh_lx, bias_hh_lx_reverse, weight_ih_lx, weight_ih_lx_reverse, weight_hh_lx and weight_hh_lx_reverse, so if it's 3, there are bias_ih_l0, bias_ih_l1, bias_ih_l2, bias_ih_l0_reverse, bias_ih_l1_reverse, bias_ih_l2_reverse, bias_hh_l0, bias_hh_l1, etc. (see the sketch right after this list).
- The 4th argument for initialization is bias (Optional-Default: True-Type: bool). *My post explains bias argument.
- The 5th argument for initialization is batch_first (Optional-Default: False-Type: bool).
- The 6th argument for initialization is dropout (Optional-Default: 0.0-Type: int or float). *It must be 0 <= x <= 1.
- The 7th argument for initialization is bidirectional (Optional-Default: False-Type: bool). *If it's True, bias_ih_lx_reverse, bias_hh_lx_reverse, weight_ih_lx_reverse and weight_hh_lx_reverse are added.
- The 8th argument for initialization is device (Optional-Default: None-Type: str, int or device()):
  *Memos:
  - If it's None, get_default_device() is used. *My post explains get_default_device() and set_default_device().
  - device= must be used.
  - My post explains device argument.
- The 9th argument for initialization is dtype (Optional-Default: None-Type: dtype):
  *Memos:
  - If it's None, get_default_dtype() is used. *My post explains get_default_dtype() and set_default_dtype().
  - dtype= must be used.
  - My post explains dtype argument.
- The 1st argument is input (Required-Type: tensor of float or complex):
  *Memos:
  - It must be the 2D or 3D tensor of zero or more elements.
  - The number of the elements of the deepest dimension must be the same as input_size.
  - Its device and dtype must be the same as GRU()'s.
  - complex must be set to dtype of GRU() to use a complex tensor.
  - The tensor's requires_grad, which is False by default, is set to True by GRU().
- The 2nd argument is hx (Optional-Default: None-Type: tensor of float or complex). *Its D, device and dtype must be the same as input's.
- gru1.device and gru1.dtype don't work.
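For example, a minimal sketch (gru_check is just an illustrative name) of the parameters which num_layers=3 and bidirectional=True create:

import torch
from torch import nn

torch.manual_seed(42)
# 3 layers x 2 directions x 4 tensors (weight_ih, weight_hh, bias_ih, bias_hh) = 24 parameters
# named weight_ih_l0, ..., bias_hh_l2 and weight_ih_l0_reverse, ..., bias_hh_l2_reverse.
gru_check = nn.GRU(input_size=6, hidden_size=3, num_layers=3, bidirectional=True)
len(list(gru_check.named_parameters()))
# 24
gru_check.bias_ih_l2_reverse.shape
# torch.Size([9])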
import torch
from torch import nn
tensor1 = torch.tensor([[8., -3., 0., 1., 5., -2.]])
tensor1.requires_grad
# False
torch.manual_seed(42)
gru1 = nn.GRU(input_size=6, hidden_size=3)
tensor2 = gru1(input=tensor1)
tensor2
# (tensor([[2.0241e-01, 7.7852e-01, -1.7867e-04]], grad_fn=<SqueezeBackward1>),
# tensor([[2.0241e-01, 7.7852e-01, -1.7867e-04]], grad_fn=<SqueezeBackward1>))
tensor2[0].requires_grad
tensor2[1].requires_grad
# True
gru1
# GRU(6, 3)
gru1.input_size
# 6
gru1.hidden_size
# 3
gru1.num_layers
# 1
gru1.bias
# True
gru1.batch_first
# False
gru1.dropout
# 0.0
gru1.bidirectional
# False
gru1.bias_ih_l0
# Parameter containing:
# tensor([-0.3936, 0.3063, -0.2334, 0.3504, -0.1370, 0.3303,
# -0.4486, -0.2914, 0.1760], requires_grad=True)
gru1.bias_hh_l0
# Parameter containing:
# tensor([0.1221, -0.1472, 0.3441, 0.3925, -0.4187, -0.3082,
# 0.5287, -0.1948, -0.2047], requires_grad=True)
gru1.weight_ih_l0
# Parameter containing:
# tensor([[0.4414, 0.4792, -0.1353, 0.5304, -0.1265, 0.1165],
# [-0.2811, 0.3391, 0.5090, -0.4236, 0.5018, 0.1081],
# [ 0.4266, 0.0782, 0.2784, -0.0815, 0.4451, 0.0853],
# [-0.2695, 0.1472, -0.2660, -0.0677, -0.2345, 0.3830],
# [-0.4557, -0.2662, -0.1630, -0.3471, 0.0545, -0.5702],
# [0.5214, -0.4904, 0.4457, 0.0961, -0.1875, 0.3568],
# [0.0900, 0.4665, 0.0631, -0.1821, 0.1551, -0.1566],
# [0.2430, 0.5155, 0.3337, -0.2524, 0.3333, 0.1033],
# [0.2932, -0.3519, -0.5715, -0.2231, -0.4428, 0.4737]],
# requires_grad=True)
gru1.weight_hh_l0
# Parameter containing:
# tensor([[0.1663, 0.2391, 0.1826],
# [-0.0100, 0.4518, -0.4102],
# [0.0364, -0.3941, 0.1780],
# [-0.1988, 0.1769, -0.1203],
# [0.4788, -0.3422, -0.3443],
# [-0.3444, 0.5193, 0.1924],
# [0.5556, -0.4765, -0.5727],
# [-0.4517, -0.3884, 0.2339],
# [0.2067, 0.4797, -0.2982]], requires_grad=True)
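The shapes of these parameters follow from input_size and hidden_size: each weight and bias stacks the reset, update and new gates along the first dimension, so it is 3 * hidden_size (9 here). A quick check:

gru1.weight_ih_l0.shape
# torch.Size([9, 6])
gru1.weight_hh_l0.shape
# torch.Size([9, 3])
gru1.bias_ih_l0.shape
# torch.Size([9])
gru1.bias_hh_l0.shape
# torch.Size([9])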
torch.manual_seed(42)
gru2 = nn.GRU(input_size=3, hidden_size=3)
gru2(input=tensor2[0])
gru2(input=tensor2[1])
# (tensor([[-0.0353, -0.1176, -0.0075]], grad_fn=<SqueezeBackward1>),
# tensor([[-0.0353, -0.1176, -0.0075]], grad_fn=<SqueezeBackward1>))
torch.manual_seed(42)
gru = nn.GRU(input_size=6, hidden_size=3, num_layers=1, bias=True,
batch_first=False, dropout=0.0, bidirectional=False,
device=None, dtype=None)
gru(input=tensor1)
gru(input=tensor1, hx=None)
gru(input=tensor1, hx=torch.tensor([[0., 0., 0.]]))
# (tensor([[2.0241e-01, 7.7852e-01, -1.7867e-04]], grad_fn=<SqueezeBackward1>),
# tensor([[2.0241e-01, 7.7852e-01, -1.7867e-04]], grad_fn=<SqueezeBackward1>))
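dropout is 0.0 above and, as the memos say, dropout of 0 < x needs num_layers of 1 < x. A minimal sketch with a stacked GRU (gru_stacked is just an illustrative name):

torch.manual_seed(42)
# dropout is applied to the outputs of each layer except the last one,
# so it only has an effect when num_layers is greater than 1.
gru_stacked = nn.GRU(input_size=6, hidden_size=3, num_layers=2, dropout=0.5)
output, hn = gru_stacked(input=tensor1)
output.shape
# torch.Size([1, 3])
hn.shape
# torch.Size([2, 3])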
my_tensor = torch.tensor([[8., -3., 0.],
[1., 5., -2.]])
torch.manual_seed(42)
gru = nn.GRU(input_size=3, hidden_size=3)
gru(input=my_tensor)
# (tensor([[-0.9818, 0.0099, -0.9303],
# [-0.8144, -0.3175, -0.9072]], grad_fn=<SqueezeBackward1>),
# tensor([[-0.8144, -0.3175, -0.9072]], grad_fn=<SqueezeBackward1>))
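For this single-layer, unidirectional GRU, the 2nd returned tensor is just the hidden state of the last step of the 1st returned tensor, which can be checked like this:

output, hn = gru(input=my_tensor)
torch.equal(output[-1:], hn)
# True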
my_tensor = torch.tensor([[8.], [-3.], [0.],
[1.], [5.], [-2.]])
torch.manual_seed(42)
gru = nn.GRU(input_size=1, hidden_size=3)
gru(input=my_tensor)
# (tensor([[-0.0124, 0.7643, 0.4491],
# [0.6094, 0.1279, -0.4519],
# [0.4007, 0.1906, 0.2209],
# [0.1384, 0.4168, 0.6392],
# [0.0652, 0.7779, 0.8551],
# [0.3951, 0.2618, -0.0059]], grad_fn=<SqueezeBackward1>),
# tensor([[0.3951, 0.2618, -0.0059]], grad_fn=<SqueezeBackward1>))
my_tensor = torch.tensor([[[8.], [-3.], [0.]],
[[1.], [5.], [-2.]]])
torch.manual_seed(42)
gru = nn.GRU(input_size=1, hidden_size=3)
gru(input=my_tensor)
# (tensor([[[-0.0124, 0.7643, 0.4491],
# [0.6353, -0.2613, -0.5308],
# [0.0659, 0.1136, 0.3575]],
# [[-0.1180, 0.5145, 0.7068],
# [0.5407, 0.6379, 0.3379],
# [0.4567, -0.0781, -0.1387]]], grad_fn=<StackBackward0>),
# tensor([[[-0.1180, 0.5145, 0.7068],
# [0.5407, 0.6379, 0.3379],
# [0.4567, -0.0781, -0.1387]]], grad_fn=<StackBackward0>))
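With batch_first=True, the same 3D tensor is read as (batch_size, sequence_length, input_size) instead of (sequence_length, batch_size, input_size). A minimal sketch (gru_bf is just an illustrative name):

torch.manual_seed(42)
# my_tensor is now read as (batch_size=2, sequence_length=3, input_size=1) and the 1st returned
# tensor becomes (batch_size, sequence_length, hidden_size), while the 2nd returned tensor keeps
# the shape (num_layers*num_directions, batch_size, hidden_size).
gru_bf = nn.GRU(input_size=1, hidden_size=3, batch_first=True)
output, hn = gru_bf(input=my_tensor)
output.shape
# torch.Size([2, 3, 3])
hn.shape
# torch.Size([1, 2, 3])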
my_tensor = torch.tensor([[[8.+0.j], [-3.+0.j], [0.+0.j]],
[[1.+0.j], [5.+0.j], [-2.+0.j]]])
torch.manual_seed(42)
gru = nn.GRU(input_size=1, hidden_size=3, dtype=torch.complex64)
gru(input=my_tensor)
# (tensor([[[1.2749-0.0514j, -0.0101-0.0269j, 0.0132-0.0126j],
# [-0.1140-0.1026j, -0.6211-0.2240j, -0.7910-0.0693j],
# [0.0631-0.1479j, 0.4467-0.2914j, -0.2153+0.0264j]],
# [[1.1012+0.1127j, 0.4283-0.0588j, 0.0385-0.0261j],
# [1.0742-0.5264j, -1.0538-0.2916j, -0.7181-0.0920j],
# [-0.0774-0.2241j, -0.2958-0.4666j, -0.7759-0.0523j]]],
# grad_fn=<StackBackward0>),
# tensor([[[1.1012+0.1127j, 0.4283-0.0588j, 0.0385-0.0261j],
# [1.0742-0.5264j, -1.0538-0.2916j, -0.7181-0.0920j],
# [-0.0774-0.2241j, -0.2958-0.4666j, -0.7759-0.0523j]]],
# grad_fn=<StackBackward0>))
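hx can also be given for a 3D input; its expected shape is (num_layers*num_directions, batch_size, hidden_size), and a zero hx gives the same result as hx=None. A minimal sketch (gru_hx is just an illustrative name):

my_tensor = torch.tensor([[[8.], [-3.], [0.]],
                          [[1.], [5.], [-2.]]])
torch.manual_seed(42)
gru_hx = nn.GRU(input_size=1, hidden_size=3)
# hx has shape (num_layers*num_directions=1, batch_size=3, hidden_size=3).
output, hn = gru_hx(input=my_tensor, hx=torch.zeros(1, 3, 3))
output.shape
# torch.Size([2, 3, 3])
hn.shape
# torch.Size([1, 3, 3])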