Super Kai (Kazuya Ito)

Adam in PyTorch

Adam() can do gradient descent with both Momentum and RMSProp as shown below:

*Memos:

  • The 1st argument for initialization is params(Required-Type:generator or other iterable of Parameter or dict).
  • The 2nd argument for initialization is lr(Optional-Default:0.001-Type:int or float). *It must be 0 <= x.
  • The 3rd argument for initialization is betas(Optional-Default:(0.9, 0.999)-Type:tuple or list of 2 ints or floats). *Each element must be 0 <= x < 1.
  • The 4th argument for initialization is eps(Optional-Default:1e-08-Type:int or float). *It must be 0 <= x.
  • The 5th argument for initialization is weight_decay(Optional-Default:0-Type:int or float). *It must be 0 <= x.
  • The 6th argument for initialization is amsgrad(Optional-Default:False-Type:bool). *If it's True, AMSGrad is used.
  • There is a foreach argument for initialization(Optional-Default:None-Type:bool). *foreach= must be used.
  • There is a maximize argument for initialization(Optional-Default:False-Type:bool). *maximize= must be used.
  • There is a capturable argument for initialization(Optional-Default:False-Type:bool). *capturable= must be used.
  • There is a differentiable argument for initialization(Optional-Default:False-Type:bool). *differentiable= must be used.
  • There is a fused argument for initialization(Optional-Default:None-Type:bool). *Memos:
    • If it's True, all the parameters must be float tensors on cuda, xpu or privateuseone.
    • fused= must be used.
  • foreach and fused cannot both be True.
  • differentiable and fused cannot both be True.
  • step() can update parameters. *A training-loop sketch using it is shown after the code below.
  • zero_grad() can reset gradients.
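
Conceptually, Adam keeps a Momentum-style running average of the gradients (controlled by betas[0]) and an RMSProp-style running average of the squared gradients (controlled by betas[1]). The sketch below is only a rough illustration of that math in plain Python; adam_update, m and v are made-up names for illustration, not part of PyTorch's API:

import math

def adam_update(param, grad, m, v, step, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-08):
    m = beta1 * m + (1 - beta1) * grad       # Momentum: running average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2  # RMSProp: running average of squared gradients
    m_hat = m / (1 - beta1 ** step)          # Bias correction for the 1st moment
    v_hat = v / (1 - beta2 ** step)          # Bias correction for the 2nd moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# One update of a scalar parameter, just to show the mechanics:
p, m, v = adam_update(param=1.0, grad=0.5, m=0.0, v=0.0, step=1)

The code below shows the actual usage in PyTorch:
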
from torch import nn
from torch import optim

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_layer = nn.Linear(in_features=4, out_features=5)

    def forward(self, x):
        return self.linear_layer(x)

mymodel = MyModel()

adam = optim.Adam(params=mymodel.parameters())
adam
# Adam (
# Parameter Group 0
#     amsgrad: False
#     betas: (0.9, 0.999)
#     capturable: False
#     differentiable: False
#     eps: 1e-08
#     foreach: None
#     fused: None
#     lr: 0.001
#     maximize: False
#     weight_decay: 0
# )

adam.state_dict()
# {'state': {},
#  'param_groups': [{'lr': 0.001,
#    'betas': (0.9, 0.999),
#    'eps': 1e-08,
#    'weight_decay': 0,
#    'amsgrad': False,
#    'maximize': False,
#    'foreach': None,
#    'capturable': False,
#    'differentiable': False,
#    'fused': None,
#    'params': [0, 1]}]}

adam.step()
adam.zero_grad()
# None

adam = optim.Adam(params=mymodel.parameters(), lr=0.001,
                  betas=(0.9, 0.999), eps=1e-08, weight_decay=0,
                  amsgrad=False, foreach=None, maximize=False,
                  capturable=False, differentiable=False, fused=None)
adam
# Adam (
# Parameter Group 0
#     amsgrad: False
#     betas: (0.9, 0.999)
#     capturable: False
#     differentiable: False
#     eps: 1e-08
#     foreach: None
#     fused: None
#     lr: 0.001
#     maximize: False
#     weight_decay: 0
# )
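
As a minimal sketch of how step() and zero_grad() fit into training (reusing MyModel from above; the input x, target y and the choice of nn.MSELoss are just assumptions for illustration):

import torch
from torch import nn
from torch import optim

mymodel = MyModel()
adam = optim.Adam(params=mymodel.parameters(), lr=0.001)
loss_fn = nn.MSELoss()

x = torch.randn(10, 4) # 10 samples with 4 features(in_features=4)
y = torch.randn(10, 5) # 10 targets with 5 values(out_features=5)

for epoch in range(3):
    adam.zero_grad()              # Reset gradients
    loss = loss_fn(mymodel(x), y) # Forward pass and loss
    loss.backward()               # Compute gradients
    adam.step()                   # Update parameters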