4rldur0

댑덥딥 Week 4 Summary

A study group working through the 'Deep Learning for Everyone Season 2' lectures. https://deeplearningzerotoall.github.io/season2/lec_tensorflow.html


Remote session, 3 May, 2023

08-1 Perceptron

Perceptron

- A type of artificial neural network

What is an artificial neural network? A model that imitates the behavior of biological neurons

- Takes the sum of x*weight plus a bias as input, and produces the output by passing it through an activation function (e.g. sigmoid)

- Originally created to solve the AND and OR problems → the XOR problem needs multiple layers / cannot be solved with a linear classifier

[Image]

- The XOR problem with a single layer (see the sketch below)

[Image]
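
The lecture's lab code demonstrates this by training a single-layer perceptron on the XOR data; a minimal sketch along those lines (the exact hyperparameters here are assumptions):

import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

X = torch.FloatTensor([[0, 0], [0, 1], [1, 0], [1, 1]]).to(device)
Y = torch.FloatTensor([[0], [1], [1], [0]]).to(device)

# a single-layer perceptron: one linear layer followed by sigmoid
linear = torch.nn.Linear(2, 1, bias=True)
sigmoid = torch.nn.Sigmoid()
model = torch.nn.Sequential(linear, sigmoid).to(device)

criterion = torch.nn.BCELoss().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1)

for step in range(1001):
    optimizer.zero_grad()
    cost = criterion(model(X), Y)
    cost.backward()
    optimizer.step()

# the loss plateaus around 0.693 (= ln 2): a single linear boundary cannot separate XOR
print(cost.item())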

08-2 Multi Layer Perceptron

Multilayer Perceptron

- Solving the XOR problem (multilayer)

[Image]

- At first there was no way to train it → solved by the development of the backpropagation algorithm

# 2 layers
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

X = torch.FloatTensor([[0, 0], [0, 1], [1, 0], [1, 1]]).to(device)
Y = torch.FloatTensor([[0], [1], [1], [0]]).to(device)
# nn layers
linear1 = torch.nn.Linear(2, 2, bias=True)
linear2 = torch.nn.Linear(2, 1, bias=True)
sigmoid = torch.nn.Sigmoid()
model = torch.nn.Sequential(linear1, sigmoid, linear2, sigmoid).to(device)
# define cost/loss & optimizer
criterion = torch.nn.BCELoss().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1)

for step in range(10001):
    optimizer.zero_grad()
    hypothesis = model(X)
    # cost/loss function
    cost = criterion(hypothesis, Y)
    cost.backward()
    optimizer.step()
    if step % 100 == 0:
        print(step, cost.item())
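
To check what the 2-layer model above actually learned, an evaluation snippet in the same style (reusing model, X, Y from the block above) could be:

# evaluate the trained 2-layer model on the four XOR inputs
with torch.no_grad():
    hypothesis = model(X)
    predicted = (hypothesis > 0.5).float()
    accuracy = (predicted == Y).float().mean()
    print('Hypothesis:', hypothesis.cpu().numpy())
    print('Predicted: ', predicted.cpu().numpy())
    print('Accuracy:  ', accuracy.item())   # 1.0 once XOR has been learned
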
# 4 layers
X = torch.FloatTensor([[0, 0], [0, 1], [1, 0], [1, 1]]).to(device)
Y = torch.FloatTensor([[0], [1], [1], [0]]).to(device)
# nn layers
linear1 = torch.nn.Linear(2, 10, bias=True)
linear2 = torch.nn.Linear(10, 10, bias=True)
linear3 = torch.nn.Linear(10, 10, bias=True)
linear4 = torch.nn.Linear(10, 1, bias=True)
sigmoid = torch.nn.Sigmoid()
model = torch.nn.Sequential(linear1, sigmoid, linear2, sigmoid,
                            linear3, sigmoid, linear4, sigmoid).to(device)
# ... (loss, optimizer, and training loop are the same as in the 2-layer version)

backpropagation

- Compare the target output with the prediction → update w

[Image]

X = torch.FloatTensor([[0, 0], [0, 1], [1, 0], [1, 1]]).to(device)
Y = torch.FloatTensor([[0], [1], [1], [0]]).to(device)

# nn layers
# note: torch.Tensor(2, 2) allocates uninitialized memory; random initialization
# (e.g. torch.randn) is safer and helps avoid nan losses
w1 = torch.Tensor(2, 2).to(device)
b1 = torch.Tensor(2).to(device)
w2 = torch.Tensor(2, 1).to(device)
b2 = torch.Tensor(1).to(device)

learning_rate = 1   # step size for the manual gradient-descent update

def sigmoid(x):
    # sigmoid function
    return 1.0 / (1.0 + torch.exp(-x))
    # return torch.div(torch.tensor(1), torch.add(torch.tensor(1.0), torch.exp(-x)))

def sigmoid_prime(x):
    # derivative of the sigmoid function
    return sigmoid(x) * (1 - sigmoid(x))

for step in range(10001):
    # forward
    l1 = torch.add(torch.matmul(X, w1), b1)
    a1 = sigmoid(l1)
    l2 = torch.add(torch.matmul(a1, w2), b2)
    Y_pred = sigmoid(l2)
    cost = -torch.mean(Y * torch.log(Y_pred) + (1 - Y) * torch.log(1 - Y_pred))

    # Back prop (chain rule)
    # Loss derivative
    d_Y_pred = (Y_pred - Y) / (Y_pred * (1.0 - Y_pred) + 1e-7)
    # Layer 2
    d_l2 = d_Y_pred * sigmoid_prime(l2)
    d_b2 = d_l2
    d_w2 = torch.matmul(torch.transpose(a1, 0, 1), d_b2)
    # Layer 1
    d_a1 = torch.matmul(d_b2, torch.transpose(w2, 0, 1))
    d_l1 = d_a1 * sigmoid_prime(l1)
    d_b1 = d_l1
    d_w1 = torch.matmul(torch.transpose(X, 0, 1), d_b1)

    # Weight update (gradient descent)
    w1 = w1 - learning_rate * d_w1
    b1 = b1 - learning_rate * torch.mean(d_b1, 0)
    w2 = w2 - learning_rate * d_w2
    b2 = b2 - learning_rate * torch.mean(d_b2, 0)

    if step % 100 == 0:
        print(step, cost.item())

09-1 ReLU

Problem of sigmoid: trouble arises when computing gradients (vanishing gradient). Near both ends of the curve the gradient is close to 0.

→ ReLU: f(x) = max(0, x)

: outputs the input itself if positive, 0 if negative

torch.sigmoid(x)
torch.tanh(x)
torch.relu(x)
torch.nn.functional.leaky_relu(x, 0.01)    # addresses the problem that f(x)=0 for all negative inputs
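
A quick numerical illustration of the vanishing gradient (a small sketch, not from the lecture): compare the gradients of sigmoid and ReLU at inputs far from 0.

import torch

x = torch.tensor([-10.0, 0.0, 10.0], requires_grad=True)
torch.sigmoid(x).sum().backward()
print(x.grad)   # ~4.5e-05 at both ends, 0.25 at 0 -> the gradient vanishes far from 0

x = torch.tensor([-10.0, 0.0, 10.0], requires_grad=True)
torch.relu(x).sum().backward()
print(x.grad)   # 0 for non-positive inputs, 1 for positive inputs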

Optimizer in PyTorch

torch.optim.SGD
torch.optim.Adadelta
torch.optim.Adagrad
torch.optim.Adam
torch.optim.SparseAdam
torch.optim.Adamax
torch.optim.ASGD
torch.optim.LBFGS
torch.optim.RMSprop
torch.optim.Rprop
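
Any of these can replace SGD in the earlier training loops by changing one line; for example, switching to Adam (the learning rate here is an assumption):

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)   # often converges faster than plain SGD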

[Image]

09-2 Weight initialization

  1. Initialize with a constant
  2. RBM/DBM
  3. Xavier/He

RBM/DBM

RBM (Restricted Boltzmann Machine): no connections within a layer; every node is fully connected to the adjacent layer

- Implemented via pre-training: X→Y, then reconstruct Y→X'

(a) First layer: X1→Y2, Y2→X1'

(b) After fixing the first layer's w, repeat for the second layer: X2→Y3, Y3→X2'

(c) Fine-tuning (= backpropagation on the RBM-pretrained weights)

- Rarely used these days

Xavier/He

- The initialization varies with the layer's characteristics; a simple method of plugging values into a formula

- Nin: number of inputs to the layer, Nout: number of outputs of the layer

Xavier Normal initialization

[Image: Xavier Normal initialization formula]

Xavier Uniform initialization

[Image: Xavier Uniform initialization formula]

He initialization: a version of Xavier initialization with the Nout term removed

He Normal initialization

[Image: He Normal initialization formula]

He Uniform initialization

[Image: He Uniform initialization formula]
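
Since the formula images are not reproduced here, a small reference sketch of the standard Xavier/He formulas (the usual definitions; the lecture slides may write them slightly differently):

import math

def xavier_normal_std(n_in, n_out):
    # Xavier Normal: W ~ N(0, std^2) with std = sqrt(2 / (Nin + Nout))
    return math.sqrt(2.0 / (n_in + n_out))

def xavier_uniform_bound(n_in, n_out):
    # Xavier Uniform: W ~ U(-bound, +bound) with bound = sqrt(6 / (Nin + Nout))
    return math.sqrt(6.0 / (n_in + n_out))

def he_normal_std(n_in):
    # He Normal: same idea with Nout dropped -> std = sqrt(2 / Nin)
    return math.sqrt(2.0 / n_in)

def he_uniform_bound(n_in):
    # He Uniform: bound = sqrt(6 / Nin)
    return math.sqrt(6.0 / n_in)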

# nn layers
linear1 = torch.nn.Linear(784, 256, bias=True)
linear2 = torch.nn.Linear(256, 256, bias=True)
linear3 = torch.nn.Linear(256, 10, bias=True)
relu = torch.nn.ReLU()
# xavier initialization
torch.nn.init.xavier_uniform_(linear1.weight)
torch.nn.init.xavier_uniform_(linear2.weight)
torch.nn.init.xavier_uniform_(linear3.weight)
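
Continuing from the Xavier-initialized layers above, a minimal sketch of how they might be assembled into a model (the 784/256/10 sizes come from the block above; the loss and optimizer choices here are assumptions):

model = torch.nn.Sequential(linear1, relu, linear2, relu, linear3).to(device)
criterion = torch.nn.CrossEntropyLoss().to(device)   # takes raw logits, applies softmax internally
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)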

09-3 Dropout

- A method for dealing with overfitting

[Image]

- Within a layer, only a randomly selected subset of nodes is used, according to a probability (≈ ratio) set in advance

- This produces a differently shaped network every time

dropout = torch.nn.Dropout(p=drop_prob)    # drop_prob: probability of dropping each node

* Dropout is not used at test time.

model.eval()    # evaluation mode: dropout is disabled
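
A small self-contained sketch of the train/eval switch (the drop_prob value and layer sizes are assumptions):

import torch

drop_prob = 0.3

model = torch.nn.Sequential(
    torch.nn.Linear(784, 256), torch.nn.ReLU(), torch.nn.Dropout(p=drop_prob),
    torch.nn.Linear(256, 10),
)

x = torch.randn(4, 784)   # dummy batch

model.train()             # training mode: dropout randomly zeroes activations
print(model(x)[0, :3])

model.eval()              # evaluation mode: dropout is turned off, output is deterministic
with torch.no_grad():
    print(model(x)[0, :3])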

Q. backpropagation = chain rule ?


In-person session, 6 May, 2023
Q. Why does the loss come out as nan?

[Image]

perceptron
