A study group working through the '모두를 위한 딥러닝 시즌 2' (Deep Learning for Everyone, Season 2) lectures. https://deeplearningzerotoall.github.io/season2/lec_tensorflow.html
Online session, 19 April, 2023
07-1 Tips
maximum likelihood estimation
likelihood: how plausible the observed data are under a given parameter θ
MLE: the process of finding the θ that maximizes f(θ), i.e., the θ that best explains the observations
ex) when the observations follow a Bernoulli (binomial) distribution, f(θ) = C(n, k) · θ^k · (1-θ)^(n-k)
(n and k are obtained from the observations)
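In this simple case the maximizer can also be found in closed form: log f(θ) = log C(n, k) + k·log θ + (n−k)·log(1−θ), and setting d/dθ log f(θ) = k/θ − (n−k)/(1−θ) to zero gives θ̂ = k/n.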
optimization via gradient descent
used to find the θ that maximizes f(θ); in practice, gradient descent is applied to a loss L (e.g., the negative log-likelihood)
θ ← θ - α·∇_θ L(x; θ)
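Below is a minimal sketch of this procedure for the Bernoulli example above, using PyTorch autograd; the values of n, k, the learning rate, and the step count are hypothetical choices for illustration.
import torch

# hypothetical observations: k successes out of n Bernoulli trials
n, k = 100, 27
theta = torch.tensor(0.5, requires_grad=True)  # initial guess for θ
lr = 0.001

for step in range(1000):
    # negative log-likelihood; the constant term log C(n, k) is dropped
    loss = -(k * torch.log(theta) + (n - k) * torch.log(1 - theta))
    loss.backward()
    with torch.no_grad():
        theta -= lr * theta.grad  # θ ← θ - α·∇L
        theta.grad.zero_()

print(theta.item())  # converges toward k/n = 0.27
The result matches the closed-form MLE θ̂ = k/n derived above.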
overfitting and regularization
MLE inherently comes with a risk of overfitting
overfitting: a model fitted too closely to the given (training) data
- desired fit: the blue curve in the lecture figure
- ways to reduce overfitting
1) more data
2) fewer features
3) regularization
- types of regularization
1) early stopping: stop training once the validation loss stops decreasing
2)reducing network size
3)weight decay
4)dropout⭐
5)batch normalization⭐
- 2)~5) are commonly used in deep learning (see the sketch below)
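A minimal sketch of how 3)–5) typically appear in PyTorch code; the 784→256→10 layer sizes and the hyperparameter values here are illustrative assumptions, not taken from the lecture.
import torch

# dropout and batch normalization are inserted as layers in the network
model = torch.nn.Sequential(
    torch.nn.Linear(784, 256),
    torch.nn.BatchNorm1d(256),  # 5) batch normalization
    torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),    # 4) dropout
    torch.nn.Linear(256, 10),
)
# 3) weight decay (an L2 penalty) is passed to the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)
Remember to call model.train() during training and model.eval() at test time so that dropout and batch normalization switch behavior correctly.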
training and test dataset
: one of the ways to keep overfitting in check
optionally use a dev set (validation set) to check whether the model has overfit the training set → confirm final performance on the test set
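A minimal sketch of splitting off a dev set with torch.utils.data.random_split; the full_train_set name and the 50,000/10,000 split are assumptions for illustration.
from torch.utils.data import random_split

# split an existing Dataset (e.g., a 60,000-image training set) into train/dev subsets
train_set, dev_set = random_split(full_train_set, [50000, 10000])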
Basic Approach to Train DNN
①make a neural network architecture
②train and check whether the model is overfitted
- if it is not, increase the model size (deeper and wider)
- if it is, add regularization, such as dropout or batch normalization
③repeat from step-2
learning rate
if the learning rate is too large, the cost diverges (blows up)
if the learning rate is too small, the cost barely decreases
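For example, the learning rate is passed to the optimizer (the model name and the extreme values below are illustrative assumptions):
optimizer = torch.optim.SGD(model.parameters(), lr=1e5)    # too large → cost diverges
optimizer = torch.optim.SGD(model.parameters(), lr=1e-10)  # too small → cost barely decreases
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)    # a reasonable starting point, used in the MNIST code below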
data preprocessing
1) standardization: transform each column to zero mean and unit variance (as in a standard normal distribution)
mu=x_train.mean(dim=0)
sigma = x_train.std(dim=0)
norm_x_train = (x_train - mu) / sigma
What if we skip preprocessing? If the columns of y_train differ greatly in scale, the smaller-scale column is almost ignored during training.
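The same training-set statistics should also be applied to any new data; a one-line sketch, assuming an x_test tensor exists:
norm_x_test = (x_test - mu) / sigma  # reuse mu and sigma computed from x_train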
07-2 MNIST
MNIST: handwritten digits dataset (training set of 60,000 images + test set of 10,000 images)
-size: 28x28
-1 channel gray image
-0~9 digits
torchvision
: a package of popular datasets, model architectures, and common image transforms
①
import torchvision.datasets as dsets
import torchvision.transforms as transforms
mnist_train = dsets.MNIST(root="MNIST_data/", train = True, transform=transforms.ToTensor(), download=True)
mnist_test = dsets.MNIST(root="MNIST_data/", train = False, transform=transforms.ToTensor(), download=True)
- PyTorch images are in (channel, height, width) order, whereas typical images are (height, width, channel) → use transforms.ToTensor() to convert
② load the data with torch.utils.data.DataLoader
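A minimal sketch of step ②, assuming the mnist_train dataset from step ① and a batch size of 100 to match the training code below:
import torch

data_loader = torch.utils.data.DataLoader(dataset=mnist_train,
                                          batch_size=100,
                                          shuffle=True,
                                          drop_last=True)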
③ each 28x28 image is flattened into a 784-dimensional vector with view()
for epoch in range(training_epochs):
    for X, Y in data_loader:
        ...
        X = X.view(-1, 28 * 28).to(device)
        ...
- full code
##Train
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"  # use GPU if available
# data_loader is the DataLoader built in step ② above

# MNIST data image of shape 28 * 28 = 784; softmax classifier over 10 digit classes
linear = torch.nn.Linear(784, 10, bias=True).to(device)
# initialization
torch.nn.init.normal_(linear.weight)
# parameters
training_epochs = 15
batch_size = 100
# define cost/loss & optimizer
criterion = torch.nn.CrossEntropyLoss().to(device)  # Softmax is internally computed.
optimizer = torch.optim.SGD(linear.parameters(), lr=0.1)
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = len(data_loader)
    for X, Y in data_loader:
        # reshape input image into [batch_size by 784]
        # label is not one-hot encoded
        X = X.view(-1, 28 * 28).to(device)
        Y = Y.to(device)
        optimizer.zero_grad()
        hypothesis = linear(X)
        cost = criterion(hypothesis, Y)
        cost.backward()
        optimizer.step()
        avg_cost += cost / total_batch
    print("Epoch: ", "%04d" % (epoch + 1), "cost =", "{:.9f}".format(avg_cost))
##Test
# Test the model using test sets
with torch.no_grad():
    X_test = mnist_test.test_data.view(-1, 28 * 28).float().to(device)
    Y_test = mnist_test.test_labels.to(device)
    prediction = linear(X_test)
    correct_prediction = torch.argmax(prediction, 1) == Y_test
    accuracy = correct_prediction.float().mean()
    print("Accuracy: ", accuracy.item())
##Visualization
import matplotlib.pyplot as plt
import random
...
r = random.randint(0, len(mnist_test) - 1)
X_single_data = mnist_test.test_data[r:r + 1].view(-1, 28 * 28).float().to(device)
Y_single_data = mnist_test.test_labels[r:r + 1].to(device)
print("Label: ", Y_single_data.item())
single_prediction = linear(X_single_data)
print("Prediction: ", torch.argmax(single_prediction, 1).item())
plt.imshow(mnist_test.test_data[r:r + 1].view(28, 28), cmap="Greys", interpolation="nearest")
plt.show()
In-person session, 22 April, 2023