Coding a model that learns automatically from any input sentence (via hyperparameters)
import tensorflow as tf
import random
import numpy as np
from tensorflow.contrib import rnn
import pprint
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
sample = " if you want you"
idx2char = list(set(sample)) # index -> char
char2idx = {c: i for i, c in enumerate(idx2char)} # char -> index
# hyper parameters
dic_size = len(char2idx) # RNN input size (one hot size)
hidden_size = len(char2idx) # RNN output size
num_classes = len(char2idx) # final output size (RNN or softmax, etc.)
batch_size = 1 # one sample data, one batch
sequence_length = len(sample) - 1 # number of LSTM unrolling steps (time steps)
learning_rate = 0.1
sample_idx = [char2idx[c] for c in sample] # char to index
x_data = [sample_idx[:-1]] # X data sample (0 ~ n-1), e.g. for "hello": "hell"
y_data = [sample_idx[1:]] # Y label sample (1 ~ n), e.g. for "hello": "ello"
X = tf.placeholder(tf.int32, [None, sequence_length]) # X data
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
x_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0
cell = tf.contrib.rnn.BasicLSTMCell(
    num_units=hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
    cell, x_one_hot, initial_state=initial_state, dtype=tf.float32)
# FC layer
X_for_fc = tf.reshape(outputs, [-1, hidden_size])
outputs = tf.contrib.layers.fully_connected(X_for_fc, num_classes, activation_fn=None)
# reshape out for sequence_loss
outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
prediction = tf.argmax(outputs, axis=2)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(50):
        l, _ = sess.run([loss, train], feed_dict={X: x_data, Y: y_data})
        result = sess.run(prediction, feed_dict={X: x_data})
        # map each predicted index back to its character
        result_str = [idx2char[c] for c in np.squeeze(result)]
        print(i, "loss:", l, "Prediction:", ''.join(result_str))

- However, training this same model on a much longer text runs into problems!
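To make the data preparation in the listing above easier to follow, here is a small framework-free sketch of the same steps: building the character vocabulary, shifting the sequence by one to get inputs and labels, and one-hot encoding. It is an illustration only. The `one_hot` helper is a hypothetical stand-in for `tf.one_hot`, and `sorted(...)` is swapped in for `list(set(...))` so the vocabulary order is reproducible.

```python
# Framework-free sketch of the data preparation step (names mirror the listing).
sample = " if you want you"

# Assumption: sorted() is used here for a reproducible vocabulary order;
# the listing uses list(set(sample)), whose order can differ between runs.
idx2char = sorted(set(sample))
char2idx = {c: i for i, c in enumerate(idx2char)}

sample_idx = [char2idx[c] for c in sample]
x_data = sample_idx[:-1]  # every character except the last
y_data = sample_idx[1:]   # the same sequence shifted left by one


def one_hot(index, depth):
    """Hypothetical helper: list of length `depth` with 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


x_one_hot = [one_hot(i, len(idx2char)) for i in x_data]

# Each input character is paired with the character that follows it.
for x, y in zip(x_data, y_data):
    print(repr(idx2char[x]), "->", repr(idx2char[y]))
```

Because every size (`dic_size`, `sequence_length`, and so on) is derived from `sample`, changing the sentence is enough to retarget the model.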
Stacked RNN + Softmax layer
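- Solution 1 -> This model uses a single RNN layer; stack several RNN layers instead.
- Solution 2 -> Add a softmax layer on top of the RNN outputs to get better results.
(As with a CNN, finishing with a softmax layer gives better results.)
- The data input part can be reused as is.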
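The two ideas above can be sketched as a forward pass in plain Python. This is a minimal illustration under stated assumptions, not the lab's implementation: it uses simple tanh cells instead of LSTM cells, the weights are random and untrained, and `make_layer`, `run_layer`, and `softmax` are hypothetical helper names. In the TF 1.x API used earlier, stacking would typically be expressed with `tf.contrib.rnn.MultiRNNCell` instead.

```python
import math
import random

random.seed(0)  # fixed seed so the random weights are reproducible


def make_layer(input_size, hidden_size):
    """Create random weights for one tanh RNN layer (hypothetical helper)."""
    def rand():
        return random.uniform(-0.5, 0.5)
    return {
        "w_x": [[rand() for _ in range(input_size)] for _ in range(hidden_size)],
        "w_h": [[rand() for _ in range(hidden_size)] for _ in range(hidden_size)],
    }


def run_layer(layer, inputs):
    """Run one RNN layer over a sequence; return the hidden state at every step."""
    hidden_size = len(layer["w_h"])
    h = [0.0] * hidden_size
    outputs = []
    for x in inputs:
        h = [
            math.tanh(
                sum(w * v for w, v in zip(layer["w_x"][j], x))
                + sum(w * v for w, v in zip(layer["w_h"][j], h))
            )
            for j in range(hidden_size)
        ]
        outputs.append(h)
    return outputs


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


num_classes, hidden_size, seq_len = 10, 8, 5
# One-hot input sequence: class index t at time step t.
inputs = [[1.0 if i == t else 0.0 for i in range(num_classes)] for t in range(seq_len)]

# Two stacked layers: the second consumes the first layer's output sequence.
layers = [make_layer(num_classes, hidden_size), make_layer(hidden_size, hidden_size)]
sequence = inputs
for layer in layers:
    sequence = run_layer(layer, sequence)

# Projection to class scores, followed by softmax at each time step.
w_out = [[random.uniform(-0.5, 0.5) for _ in range(hidden_size)] for _ in range(num_classes)]
probs = [softmax([sum(w * v for w, v in zip(row, h)) for row in w_out]) for h in sequence]
prediction = [p.index(max(p)) for p in probs]
print(prediction)
```

The point of the sketch is the data flow: each layer's full output sequence becomes the next layer's input sequence, and softmax is applied per time step to the projected scores.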