DEV Community

Super Kai (Kazuya Ito)
Super Kai (Kazuya Ito)

Posted on

BCEWithLogitsLoss in PyTorch

Buy Me a Coffee

*Memos:

BCEWithLogitsLoss() can get the 0D or more D tensor of the zero or more values(float) computed by BCE Loss and Sigmoid from the 0D or more D tensor of zero or more elements as shown below:

*Memos:

  • The 1st argument for initialization is weight(Optional-Default:None-Type:tensor of int, float or bool): *Memos:
    • If it's not given, it's 1.
    • It must be the 0D or more D tensor of zero or more elements.
  • There is reduction argument for initialization(Optional-Default:'mean'-Type:str). *'none', 'mean' or 'sum' can be selected.
  • There is pos_weight argument for initialization(Optional-Default:None-Type:tensor of int or float): *Memos:
    • If it's not given, it's 1.
    • It must be the 0D or more D tensor of zero or more elements.
  • There are size_average and reduce argument for initialization but they are deprecated.
  • The 1st argument is input(Required-Type:tensor of float). *It must be the 0D or more D tensor of zero or more elements.
  • The 2nd argument is target(Required-Type:tensor of float). *It must be the 0D or more D tensor of zero or more elements.
  • input and target must be the same size otherwise there is error.
  • The empty 1D or more D input and target tensor with reduction='mean' return nan.
  • The empty 1D or more D input and target tensor with reduction='sum' return 0..
  • BCEWithLogitsLoss() is the combination of Sigmoid and BCE Loss. Image description
import torch
from torch import nn

tensor1 = torch.tensor([ 8., -3., 0.,  1.,  5., -2.])
tensor2 = torch.tensor([-3.,  7., 4., -2., -9.,  6.])
       # -w*(p*y*log(1/(1+exp(-x))+(1-y)*log(1-(1/1+exp(-x))))
       # -1*(1*(-3)*log(1/(1+exp(-8)))+(1-(-3))*log(1-(1/(1+exp(-8)))))
       # ↓↓↓↓↓↓↓
       # 32.0003 + 21.0486 + 0.6931 + 3.3133 + 50.0067 + 50.0067 = 82.8423
       # 119.1890 / 6 = 19.8648
bcelogits = nn.BCEWithLogitsLoss()
bcelogits(input=tensor1, target=tensor2)
# tensor(19.8648)

bcelogits
# BCEWithLogitsLoss()

print(bcelogits.weight)
# None

bcelogits.reduction
# 'mean'

bcelogits = nn.BCEWithLogitsLoss(weight=None,
                                 reduction='mean',
                                 pos_weight=None)
bcelogits(input=tensor1, target=tensor2)
# tensor(19.8648)

bcelogits = nn.BCEWithLogitsLoss(reduction='sum')
bcelogits(input=tensor1, target=tensor2)
# tensor(119.1890)

bcelogits = nn.BCEWithLogitsLoss(reduction='none')
bcelogits(input=tensor1, target=tensor2)
# tensor([32.0003, 21.0486, 0.6931, 3.3133, 50.0067, 12.1269])

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor([0., 1., 2., 3., 4., 5.]))
bcelogits(input=tensor1, target=tensor2)
# tensor(48.8394)

bcelogits = nn.BCEWithLogitsLoss(
                pos_weight=torch.tensor([0., 1., 2., 3., 4., 5.])
            )
bcelogits(input=tensor1, target=tensor2)
# tensor(28.5957)

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor(0.))
bcelogits(input=tensor1, target=tensor2)
# tensor(0.)

bcelogits = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(0.))
bcelogits(input=tensor1, target=tensor2)
# tensor(13.8338)

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor([0, 1, 2, 3, 4, 5]))
bcelogits(input=tensor1, target=tensor2)
# tensor(48.8394)

bcelogits = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([0, 1, 2, 3, 4, 5]))
bcelogits(input=tensor1, target=tensor2)
# tensor(28.5957)

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor(0))
bcelogits(input=tensor1, target=tensor2)
# tensor(0.)

bcelogits = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(0))
bcelogits(input=tensor1, target=tensor2)
# tensor(13.8338)

bcelogits = nn.BCEWithLogitsLoss(
              weight=torch.tensor([True, False, True, False, True, False])
          )
bcelogits(input=tensor1, target=tensor2)
# tensor(13.7834)

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor([False]))
bcelogits(input=tensor1, target=tensor2)
# tensor(0.)

tensor1 = torch.tensor([[8., -3., 0.], [1., 5., -2.]])
tensor2 = torch.tensor([[-3., 7., 4.], [-2., -9., 6.]])

bcelogits = nn.BCEWithLogitsLoss()
bcelogits(input=tensor1, target=tensor2)
# tensor(19.8648)

tensor1 = torch.tensor([[[8.], [-3.], [0.]], [[1.], [5.], [-2.]]])
tensor2 = torch.tensor([[[-3.], [7.], [4.]], [[-2.], [-9.], [6.]]])

bcelogits = nn.BCEWithLogitsLoss()
bcelogits(input=tensor1, target=tensor2)
# tensor(19.8648)

tensor1 = torch.tensor([])
tensor2 = torch.tensor([])

bcelogits = nn.BCEWithLogitsLoss(reduction='mean')
bcelogits(input=tensor1, target=tensor2)
# tensor(nan)

bcelogits = nn.BCEWithLogitsLoss(reduction='sum')
bcelogits(input=tensor1, target=tensor2)
# tensor(0.)
Enter fullscreen mode Exit fullscreen mode

BCEWithLogitsLoss() and BCELoss() with Sigmoid() or torch.sigmoid() for input argument can get the same results as shown below. *BCEWithLogitsLoss() can accept the values out of 0<=y<=1 for input and target argument while BCELoss() cannot:

import torch
from torch import nn

tensor1 = torch.tensor([0.4, 0.8, 0.6, 0.3, 0.0, 0.5])
tensor2 = torch.tensor([0.2, 0.9, 0.4, 0.1, 0.8, 0.5])

bcelogits = nn.BCEWithLogitsLoss()
bcelogits(input=tensor1, target=tensor2)
# tensor(0.7205)

sigmoid = nn.Sigmoid()

bceloss = nn.BCELoss()
bceloss(input=sigmoid(input=tensor1), target=tensor2)
# tensor(0.7205)

bceloss = nn.BCELoss()
bceloss(input=torch.sigmoid(input=tensor1), target=tensor2)
# tensor(0.7205)
Enter fullscreen mode Exit fullscreen mode

Image of Datadog

How to Diagram Your Cloud Architecture

Cloud architecture diagrams provide critical visibility into the resources in your environment and how they’re connected. In our latest eBook, AWS Solution Architects Jason Mimick and James Wenzel walk through best practices on how to build effective and professional diagrams.

Download the Free eBook

Top comments (0)

Billboard image

Create up to 10 Postgres Databases on Neon's free plan.

If you're starting a new project, Neon has got your databases covered. No credit cards. No trials. No getting in your way.

Try Neon for Free →