4rldur0

댑덥딥 Week 3 Summary

This is a study group working through the lecture series 'Deep Learning for Everyone, Season 2' (모두를 위한 딥러닝 시즌 2). https://deeplearningzerotoall.github.io/season2/lec_tensorflow.html


Online session, 19 April 2023

07-1 Tips

maximum likelihood estimation

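likelihood (가능도): how well a given parameter value θ explains the observed data

MLE: the process of finding the θ that maximizes f(θ), i.e. the θ that best explains the observations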

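ex) When the data follow a Bernoulli distribution, the likelihood of k successes in n trials is

f(θ) = C(n, k) · θ^k · (1 − θ)^(n − k)

(n and k are obtained from the observations)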
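As a quick numerical check of the Bernoulli example, the sketch below evaluates the likelihood on a grid of θ values and keeps the largest. The counts n = 100 and k = 27 are made-up values for illustration, and the binomial form of f(θ) is the standard one, assumed here rather than copied from the lecture.

```python
from math import comb

def likelihood(theta, n, k):
    """Binomial likelihood of k successes in n Bernoulli trials."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)

n, k = 100, 27  # hypothetical observation
grid = [i / 1000 for i in range(1, 1000)]  # candidate θ values in (0, 1)
theta_hat = max(grid, key=lambda t: likelihood(t, n, k))
print(theta_hat)  # 0.27, which equals k / n
```

The grid search is only for intuition; for this model the maximizer is known in closed form as k / n.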

optimization via gradient descent

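Used to find the θ that maximizes f(θ). Gradient descent minimizes, so L(x; θ) is chosen as a loss whose minimum coincides with the maximum of f(θ), such as the negative log-likelihood.

θ ← θ − α∇L(x; θ)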
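The update rule can be tried on the Bernoulli example. This is a minimal sketch, assuming L is the mean negative log-likelihood; the counts n and k, the starting point, and the step size α = 0.1 are all illustrative choices, not values from the lecture.

```python
def nll_grad(theta, n, k):
    """Gradient of the mean negative log-likelihood
    L(theta) = -(k*log(theta) + (n-k)*log(1-theta)) / n."""
    return -(k / theta - (n - k) / (1 - theta)) / n

n, k = 100, 27   # hypothetical observation
theta = 0.5      # arbitrary starting point
alpha = 0.1      # learning rate (assumed)
for _ in range(200):
    theta = theta - alpha * nll_grad(theta, n, k)  # θ ← θ − α∇L

print(round(theta, 4))
```

With these settings θ settles near k / n, the same value the likelihood maximum gives.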

overfitting and regularization

MLE is inherently prone to overfitting

overfitting: the model is fitted too closely to the given data

[Figure omitted]

- Desired fit: the blue line in the figure

  • ways to reduce overfitting

1) more data

2) fewer features

3) regularization

  • types of regularization

1) early stopping: stop training once the validation loss no longer decreases

2)reducing network size

3)weight decay

4)dropout⭐

5)batch normalization⭐

- 2) to 5) are used in deep learning
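Early stopping from the list above can be sketched in plain Python. The `patience` parameter (how many non-improving epochs to tolerate) and the validation losses are assumptions made for illustration; the notes only state the stopping condition.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch index at which training would stop."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss          # new best validation loss
            bad_epochs = 0
        else:
            bad_epochs += 1      # no improvement this epoch
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1   # never triggered: ran to the end

val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.60]  # made-up numbers
stop_epoch = early_stopping_epoch(val_losses, patience=2)
print(stop_epoch)  # 5
```

In practice one would also keep the weights from the best epoch (epoch 3 here) rather than the last one.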

training and test dataset

: one of the ways to keep overfitting in check

Use the dev set (validation set) to check whether the model has overfitted the training set (optional) → confirm the final performance on the test set
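A three-way split can be sketched as follows. The 80/10/10 ratios, the seed, and the helper name `split_indices` are assumptions for illustration; the notes do not prescribe specific proportions.

```python
import random

def split_indices(n_samples, dev_ratio=0.1, test_ratio=0.1, seed=0):
    """Shuffle sample indices and cut them into train / dev / test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_ratio)
    n_dev = int(n_samples * dev_ratio)
    test = indices[:n_test]
    dev = indices[n_test:n_test + n_dev]
    train = indices[n_test + n_dev:]
    return train, dev, test

train_idx, dev_idx, test_idx = split_indices(1000)
print(len(train_idx), len(dev_idx), len(test_idx))  # 800 100 100
```

The three index sets are disjoint, so the test set stays unseen during both training and model selection.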

Basic Approach to Train DNN

① make a neural network architecture

② train and check whether the model is overfitted

  1. if it is not, increase the model size (deeper and wider)
  2. if it is, add regularization, such as dropout or batch normalization

③ repeat from step ②

learning rate

If the learning rate is too large, the cost grows without bound (diverges)

If the learning rate is too small, the cost barely decreases
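Both failure modes show up even on a toy problem. The sketch below runs gradient descent on cost(x) = x², chosen here only for illustration, with three assumed learning rates.

```python
def final_cost(lr, steps=20, x0=1.0):
    """Run gradient descent on cost(x) = x**2 and return the final cost."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # gradient of x**2 is 2x
    return x * x

too_large = final_cost(1.1)    # each step overshoots: cost grows
too_small = final_cost(0.001)  # steps are tiny: cost barely moves from 1.0
reasonable = final_cost(0.1)   # cost shrinks quickly toward 0

print(too_large, too_small, reasonable)
```

Each update multiplies x by (1 − 2·lr), so the cost diverges whenever that factor has magnitude above 1.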

data preprocessing

1) standardization: rescale each feature (column) to zero mean and unit standard deviation

# per-column mean and standard deviation of the training inputs
mu = x_train.mean(dim=0)
sigma = x_train.std(dim=0)
norm_x_train = (x_train - mu) / sigma

What if we skip preprocessing? When the columns of y_train differ greatly in scale, the column with the smaller values is almost ignored during training
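The effect of standardization on columns with very different scales can be checked in plain Python. The two columns below are made-up values; `statistics.stdev` is the sample standard deviation, which matches the default behaviour of `torch.std`.

```python
from statistics import mean, stdev

def standardize(column):
    """Rescale a column to zero mean and unit (sample) standard deviation."""
    mu, sigma = mean(column), stdev(column)
    return [(v - mu) / sigma for v in column]

col_large = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]  # hypothetical large-scale column
col_small = [0.1, 0.5, 0.3, 0.4, 0.2]                 # hypothetical small-scale column

z_large = standardize(col_large)
z_small = standardize(col_small)
print(max(z_large), max(z_small))
```

After rescaling, both columns span a comparable range, so a squared-error loss no longer sees one column as thousands of times more important than the other.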

07-2 MNIST

MNIST: handwritten digits dataset (training set of 60,000 images + test set of 10,000 images)

- size: 28x28

- 1-channel grayscale image

- digits 0 to 9

torchvision

: a package made up of popular datasets, model architectures, and transforms


import torchvision.datasets as dsets
import torchvision.transforms as transforms  # needed for transforms.ToTensor()

mnist_train = dsets.MNIST(root="MNIST_data/", train=True, transform=transforms.ToTensor(), download=True)
mnist_test = dsets.MNIST(root="MNIST_data/", train=False, transform=transforms.ToTensor(), download=True)

- PyTorch image: (channel, height, width) order vs. typical image: (height, width, channel) order → ToTensor() converts between them and also scales pixel values from 0~255 to 0~1
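The layout change can be mimicked with nested lists. This is only an illustration of the reordering and scaling; `hwc_to_chw` is a hypothetical helper, not torchvision's implementation.

```python
def hwc_to_chw(image):
    """Convert an H x W x C image with values 0-255 into C x H x W with values 0-1."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[h][w][c] / 255.0 for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]

# a tiny made-up RGB image: H=1, W=2, C=3
image = [[[0, 128, 255], [255, 0, 0]]]
chw = hwc_to_chw(image)
print(len(chw), len(chw[0]), len(chw[0][0]))  # 3 1 2
```

For MNIST the same idea turns a 28 x 28 x 1 grayscale image into a 1 x 28 x 28 tensor.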

② Load the data with torch.utils.data.DataLoader

(see also: 댑덥딥 Week 2 Summary)

③ size: 28x28 → flatten to 784 with view()

for epoch in range(training_epochs):
    for X, Y in data_loader:
        ...
        X = X.view(-1, 28 * 28).to(device)
        ...
  • full code
## Train
import torch

# assumes mnist_train / mnist_test were loaded with torchvision as shown above
device = "cuda" if torch.cuda.is_available() else "cpu"

# parameters
training_epochs = 15
batch_size = 100

# data_loader was not defined in these notes; a typical definition is assumed here
data_loader = torch.utils.data.DataLoader(dataset=mnist_train, batch_size=batch_size,
                                          shuffle=True, drop_last=True)

# MNIST data image of shape 28 * 28 = 784, single linear layer (softmax classifier)
linear = torch.nn.Linear(784, 10, bias=True).to(device)
# initialization
torch.nn.init.normal_(linear.weight)

# define cost/loss & optimizer
criterion = torch.nn.CrossEntropyLoss().to(device)  # Softmax is internally computed.
optimizer = torch.optim.SGD(linear.parameters(), lr=0.1)

for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = len(data_loader)
    for X, Y in data_loader:
        # reshape input image into [batch_size by 784]
        # label is not one-hot encoded
        X = X.view(-1, 28 * 28).to(device)
        Y = Y.to(device)
        optimizer.zero_grad()
        hypothesis = linear(X)
        cost = criterion(hypothesis, Y)
        cost.backward()
        optimizer.step()  # apply the gradient update (missing in the original notes)
        avg_cost += cost.item() / total_batch
    print("Epoch: ", "%04d" % (epoch + 1), "cost =", "{:.9f}".format(avg_cost))

## Test
# Test the model using test sets
# (.data / .targets replace the deprecated .test_data / .test_labels)
with torch.no_grad():
    X_test = mnist_test.data.view(-1, 28 * 28).float().to(device)
    Y_test = mnist_test.targets.to(device)
    prediction = linear(X_test)
    correct_prediction = torch.argmax(prediction, 1) == Y_test
    accuracy = correct_prediction.float().mean()
    print("Accuracy: ", accuracy.item())

    ## Visualization
    import matplotlib.pyplot as plt
    import random
    ...
    # pick one random test image and compare its label with the prediction
    r = random.randint(0, len(mnist_test) - 1)
    X_single_data = mnist_test.data[r:r + 1].view(-1, 28 * 28).float().to(device)
    Y_single_data = mnist_test.targets[r:r + 1].to(device)
    print("Label: ", Y_single_data.item())
    single_prediction = linear(X_single_data)
    print("Prediction: ", torch.argmax(single_prediction, 1).item())
    plt.imshow(mnist_test.data[r:r + 1].view(28, 28), cmap="Greys", interpolation="nearest")
    plt.show()
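Two steps in the full code are easy to miss: the loss applies softmax internally to raw scores, and accuracy is the fraction of rows whose argmax equals the label. The plain-Python sketch below reproduces both on made-up scores; `cross_entropy` and `accuracy` are hypothetical helpers written for illustration, not the PyTorch functions.

```python
from math import exp, log

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed from raw scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + log(sum(exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def accuracy(batch_logits, labels):
    """Fraction of samples whose highest score is at the true label."""
    preds = [max(range(len(row)), key=row.__getitem__) for row in batch_logits]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# made-up scores for 3 samples and 3 classes
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.5, 0.0]]
labels = [0, 2, 0]

mean_loss = sum(cross_entropy(z, y) for z, y in zip(batch_logits, labels)) / len(labels)
acc = accuracy(batch_logits, labels)
print(mean_loss, acc)
```

Because the loss already includes softmax, the model output is passed to it as raw scores, which is why the full code has no explicit softmax layer.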

In-person session, 22 April 2023

tips and torchvision
