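Note: This post comes from a 2020-2021 WordPress backup. The original image attachments were not saved with the XML export, and the image URLs on the old server no longer work, so the text is kept here and the broken images have been removed.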

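Recurrent Neural Networks

Recurrent Cell

A recurrent cell shares its parameters across time steps, which lets a recurrent layer extract temporal information. Structure:

Forward pass: the state $h_t$ stored in the memory is refreshed at every time step, while the three parameter matrices $w_{xh}, w_{hh}, w_{hy}$ stay fixed throughout.
Backward pass: the three parameter matrices $w_{xh}, w_{hh}, w_{hy}$ are updated by gradient descent.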
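
The structure figure is no longer available. As a reconstruction rather than the original figure, the standard simple-RNN update that matches the three matrices named above can be written as follows. The bias terms $b_h$ and $b_y$ and the softmax on the output are assumptions taken from the usual formulation:

$$
h_t = \tanh\left(x_t w_{xh} + h_{t-1} w_{hh} + b_h\right), \qquad y_t = \mathrm{softmax}\left(h_t w_{hy} + b_y\right)
$$

Here $x_t$ is the input at time step $t$ and $h_{t-1}$ is the memory state left over from the previous step.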

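Unrolling the Recurrent Cell over Time Steps

Unrolling over time steps means laying the recurrent cell out along the time axis. At each time step the memory state $h_t$ is refreshed, while the surrounding parameter matrices $w_{xh}, w_{hh}, w_{hy}$ stay fixed. Training optimizes these matrices; once training finishes, the best-performing matrices are used to run the forward pass and output the prediction.
Recurrent neural network: the recurrent cell extracts temporal features, which are then fed into a fully connected network.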
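
The unrolling described above can be sketched in plain NumPy. This is an illustration, not the Keras implementation: the function name `unrolled_forward`, the bias vectors, the random weights, and the sizes are all assumptions chosen for the example. The point is that the same three matrices are reused at every step while only the state `h` changes.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def unrolled_forward(x_seq, w_xh, w_hh, w_hy, b_h, b_y):
    """Run one recurrent cell over every time step of a single sequence."""
    h = np.zeros(w_hh.shape[0])            # memory state starts at zero
    states = []
    for x_t in x_seq:                      # the same matrices are reused at each step
        h = np.tanh(x_t @ w_xh + h @ w_hh + b_h)
        states.append(h)
    y = softmax(h @ w_hy + b_y)            # prediction computed from the last state
    return np.array(states), y

rng = np.random.default_rng(0)
steps, features, units, classes = 4, 5, 3, 5   # illustrative sizes only
x_seq = rng.normal(size=(steps, features))
w_xh = rng.normal(size=(features, units))
w_hh = rng.normal(size=(units, units))
w_hy = rng.normal(size=(units, classes))
b_h, b_y = np.zeros(units), np.zeros(classes)

states, y = unrolled_forward(x_seq, w_xh, w_hh, w_hy, b_h, b_y)
print(states.shape)   # one memory state per time step
print(y)              # class probabilities from the final state
```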

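Recurrent Layers

Each recurrent cell forms one recurrent layer, and the number of recurrent layers grows in the direction of the output. The number of memory units in each recurrent cell can be set freely as needed.

Describing a Recurrent Layer in TF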
```
tf.keras.layers.SimpleRNN(number of memory units, activation='activation function',
                          return_sequences=whether to output h_t to the next layer at every time step)
```
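activation='activation function': defaults to tanh when omitted.
return_sequences=True: output $h_t$ at every time step.
return_sequences=False: output $h_t$ only at the last time step (the default).
Typically the last recurrent layer uses False, so only the final time step's $h_t$ is output, while intermediate layers use True so that every time step passes its $h_t$ to the next layer. Example: <code>SimpleRNN(3, return_sequences=True)</code>
The API requires the data fed into a recurrent layer to be three-dimensional. When feeding an RNN, x_train has shape [number of samples, number of unrolled time steps, number of input features per time step].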
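
To make the shape rules concrete, here is a hand-written NumPy stand-in for a recurrent layer. It is not `tf.keras.layers.SimpleRNN` itself; the names `simple_rnn` and `make_params` and all sizes are hypothetical. It takes a three-dimensional input and shows what the two `return_sequences` settings produce when two layers are stacked.

```python
import numpy as np

def simple_rnn(x, w_xh, w_hh, b_h, return_sequences=False):
    """Illustrative recurrent layer; x has shape (samples, time_steps, features)."""
    samples, time_steps, _ = x.shape
    h = np.zeros((samples, w_hh.shape[0]))
    outputs = []
    for t in range(time_steps):
        h = np.tanh(x[:, t, :] @ w_xh + h @ w_hh + b_h)
        outputs.append(h)
    if return_sequences:
        return np.stack(outputs, axis=1)    # (samples, time_steps, units)
    return h                                 # (samples, units)

def make_params(rng, features, units):
    """Random weights for one illustrative layer."""
    return (rng.normal(size=(features, units)),
            rng.normal(size=(units, units)),
            np.zeros(units))

rng = np.random.default_rng(1)
x = rng.normal(size=(2, 4, 5))    # 2 samples, 4 time steps, 5 features per step

layer1 = make_params(rng, features=5, units=3)
layer2 = make_params(rng, features=3, units=6)

seq = simple_rnn(x, *layer1, return_sequences=True)      # intermediate layer
last = simple_rnn(seq, *layer2, return_sequences=False)  # final recurrent layer
print(seq.shape, last.shape)   # (2, 4, 3) (2, 6)
```

The intermediate layer has to return the full sequence, because the next recurrent layer again expects a three-dimensional input.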

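Recurrent Computation Process 1

ABCDE Letter Prediction (one letter in, next letter out)

Letter prediction: input a predicts b, input b predicts c, input c predicts d, input d predicts e, and input e predicts a. The letters are one-hot encoded, the three parameter matrices $w_{xh}, w_{hh}, w_{hy}$ are randomly initialized, and the cell has 3 memory units. The memory starts at zero: $h_{t-1} = [0.0, 0.0, 0.0]$. $h_t$ and $y_t$ are then computed with the recurrent cell's formulas; the detailed calculation was shown in a figure.
In that worked example, the model assigns a 91% probability to the next letter being c. The code is as follows: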

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, SimpleRNN
import matplotlib.pyplot as plt
import os

input_word = "abcde"
w_to_id = {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}  # map each letter to a numeric id
id_to_onehot = {0: [1., 0., 0., 0., 0.], 1: [0., 1., 0., 0., 0.], 2: [0., 0., 1., 0., 0.], 3: [0., 0., 0., 1., 0.],
                4: [0., 0., 0., 0., 1.]}  # encode each id as a one-hot vector

x_train = [id_to_onehot[w_to_id['a']], id_to_onehot[w_to_id['b']], id_to_onehot[w_to_id['c']],
           id_to_onehot[w_to_id['d']], id_to_onehot[w_to_id['e']]]
y_train = [w_to_id['b'], w_to_id['c'], w_to_id['d'], w_to_id['e'], w_to_id['a']]

# Reseed before each shuffle so inputs and labels are permuted the same way
np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)

# Reshape x_train to match the SimpleRNN input requirement:
# [number of samples, number of unrolled time steps, number of input features per time step].
# The whole dataset is fed in at once, so the number of samples is len(x_train); one letter goes in per
# prediction, so there is 1 time step; a one-hot code has 5 entries, so there are 5 features per time step.
x_train = np.reshape(x_train, (len(x_train), 1, 5))
y_train = np.array(y_train)

model = tf.keras.Sequential([
    SimpleRNN(3),  # recurrent layer with 3 memory units; more units give more memory capacity but use more resources
    Dense(5, activation='softmax')  # fully connected layer that computes y_t
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/rnn_onehot_1pre1.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='loss')  # fit gets no validation data, so the best model is chosen by training loss

history = model.fit(x_train, y_train, batch_size=32, epochs=100, callbacks=[cp_callback])

model.summary()

print(model.trainable_variables)
with open('./weights.txt', 'w') as file:  # dump the trained parameters
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')

###############################################    show   ###############################################

# Plot the training accuracy and loss curves
acc = history.history['sparse_categorical_accuracy']
loss = history.history['loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.title('Training Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.title('Training Loss')
plt.legend()
plt.show()

############### predict #############

preNum = int(input("input the number of test alphabet:"))
for i in range(preNum):
    alphabet1 = input("input test alphabet:")
    alphabet = [id_to_onehot[w_to_id[alphabet1]]]
    # Reshape to the SimpleRNN input requirement: 1 sample, 1 time step, 5 one-hot features
    alphabet = np.reshape(alphabet, (1, 1, 5))
    result = model.predict(alphabet)
    pred = int(tf.argmax(result, axis=1)[0])  # index of the most probable letter
    tf.print(alphabet1 + '->' + input_word[pred])
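
The single-step calculation described before the listing (one-hot input, zero initial memory, 3 memory units) can also be traced by hand in NumPy. The weights below are random and untrained, and the bias vectors are an assumption, so the resulting probabilities will not match the 91% from the original figure and the predicted letter is not meaningful. The sketch only shows the order of operations.

```python
import numpy as np

letters = "abcde"
one_hot = np.eye(len(letters))       # row i is the one-hot code of letters[i]

rng = np.random.default_rng(7)       # arbitrary seed, for illustration only
w_xh = rng.normal(size=(5, 3))       # input -> memory
w_hh = rng.normal(size=(3, 3))       # memory -> memory
w_hy = rng.normal(size=(3, 5))       # memory -> output
b_h, b_y = np.zeros(3), np.zeros(5)  # assumed bias terms

x_t = one_hot[letters.index("b")]    # feed the letter b
h_prev = np.zeros(3)                 # memory starts at [0.0, 0.0, 0.0]

h_t = np.tanh(x_t @ w_xh + h_prev @ w_hh + b_h)
logits = h_t @ w_hy + b_y
y_t = np.exp(logits - logits.max())  # softmax, shifted for numerical stability
y_t /= y_t.sum()

predicted = letters[int(np.argmax(y_t))]
print("h_t =", h_t.round(3))
print("y_t =", y_t.round(3), "->", predicted)
```

Because the memory starts at zero and the input is one-hot, `h_t` here reduces to `tanh` of a single row of `w_xh`.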

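ABCDE Letter Prediction (several letters in, next letter out)

Unroll the recurrent cell over time steps and feed in several consecutive letters to predict the next one. Here 4 consecutive letters are fed in to predict the next letter; the detailed calculation was shown in a figure.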
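
The training pairs implied by this setup can be generated with a small sliding window over the cyclic string. The helper name `make_windows` is hypothetical and is not part of the original code, which writes the pairs out by hand.

```python
def make_windows(text, window):
    """Return (inputs, targets): each input is `window` consecutive letters of
    the cyclic string `text`, and the target is the letter that follows it."""
    n = len(text)
    inputs, targets = [], []
    for start in range(n):
        inputs.append("".join(text[(start + k) % n] for k in range(window)))
        targets.append(text[(start + window) % n])
    return inputs, targets

inputs, targets = make_windows("abcde", 4)
for seq, nxt in zip(inputs, targets):
    print(seq, "->", nxt)
# abcd -> e
# bcde -> a
# cdea -> b
# deab -> c
# eabc -> d
```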

Implementation:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, SimpleRNN
import matplotlib.pyplot as plt
import os

input_word = "abcde"
w_to_id = {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}  # map each letter to a numeric id
id_to_onehot = {0: [1., 0., 0., 0., 0.], 1: [0., 1., 0., 0., 0.], 2: [0., 0., 1., 0., 0.], 3: [0., 0., 0., 1., 0.],
                4: [0., 0., 0., 0., 1.]}  # encode each id as a one-hot vector

x_train = [
    [id_to_onehot[w_to_id['a']], id_to_onehot[w_to_id['b']], id_to_onehot[w_to_id['c']], id_to_onehot[w_to_id['d']]],
    [id_to_onehot[w_to_id['b']], id_to_onehot[w_to_id['c']], id_to_onehot[w_to_id['d']], id_to_onehot[w_to_id['e']]],
    [id_to_onehot[w_to_id['c']], id_to_onehot[w_to_id['d']], id_to_onehot[w_to_id['e']], id_to_onehot[w_to_id['a']]],
    [id_to_onehot[w_to_id['d']], id_to_onehot[w_to_id['e']], id_to_onehot[w_to_id['a']], id_to_onehot[w_to_id['b']]],
    [id_to_onehot[w_to_id['e']], id_to_onehot[w_to_id['a']], id_to_onehot[w_to_id['b']], id_to_onehot[w_to_id['c']]],
]
y_train = [w_to_id['e'], w_to_id['a'], w_to_id['b'], w_to_id['c'], w_to_id['d']]

np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)

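# Reshape x_train to match the SimpleRNN input requirement:
# [number of samples, number of unrolled time steps, number of input features per time step].
# The whole dataset is fed in at once, so the number of samples is len(x_train); 4 letters go in per
# prediction, so there are 4 time steps; a one-hot code has 5 entries, so there are 5 features per time step.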
x_train = np.reshape(x_train, (len(x_train), 4, 5))
y_train = np.array(y_train)

model = tf.keras.Sequential([
    SimpleRNN(3),
    Dense(5, activation='softmax')
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/rnn_onehot_4pre1.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='loss')  # fit gets no validation data, so the best model is chosen by training loss

history = model.fit(x_train, y_train, batch_size=32, epochs=100, callbacks=[cp_callback])

model.summary()

# print(model.trainable_variables)
with open('./weights.txt', 'w') as file:  # dump the trained parameters
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')

###############################################    show   ###############################################

# Plot the training accuracy and loss curves
acc = history.history['sparse_categorical_accuracy']
loss = history.history['loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.title('Training Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.title('Training Loss')
plt.legend()
plt.show()

############### predict #############

preNum = int(input("input the number of test alphabet:"))
for i in range(preNum):
    alphabet1 = input("input test alphabet:")
    alphabet = [id_to_onehot[w_to_id[a]] for a in alphabet1]
    # Reshape to the SimpleRNN input requirement: 1 sample, 4 time steps (4 letters in), 5 one-hot features
    alphabet = np.reshape(alphabet, (1, 4, 5))
    result = model.predict(alphabet)
    pred = int(tf.argmax(result, axis=1)[0])  # index of the most probable letter
    tf.print(alphabet1 + '->' + input_word[pred])
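
Both listings keep inputs and labels aligned by reseeding NumPy with the same seed before each `shuffle` call. A possible alternative, sketched here on toy data that is not from the original post, is to draw one permutation and index both arrays with it, which makes the pairing explicit.

```python
import numpy as np

x = np.arange(10).reshape(5, 2)   # 5 toy samples with 2 features each
y = x[:, 0] // 2                  # label i is tied to row i: [0, 1, 2, 3, 4]

rng = np.random.default_rng(7)
order = rng.permutation(len(x))   # one shared permutation of the sample indices
x_shuffled, y_shuffled = x[order], y[order]

# Every shuffled row still carries its original label
print(np.array_equal(x_shuffled[:, 0] // 2, y_shuffled))   # True
```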

Embedding

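One-hot encoding: with a large vocabulary the vectors become large and overly sparse, and the codes are mutually independent, so they express no relationship between words.
Embedding: a word-encoding method that uses low-dimensional vectors. Because the encoding is optimized through neural-network training, it can express relationships between words.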

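The TF function that implements Embedding encoding:

tf.keras.layers.Embedding(vocabulary size, encoding dimension)
The encoding dimension is how many numbers are used to represent one word. Example: for a vocabulary of 100 entries, tf.keras.layers.Embedding(100, 3) represents each entry with 3 numbers, so [4] might be encoded as [0.25, 0.1, 0.11]. With a vocabulary size of 100, the valid ids are 0 to 99.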
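
Conceptually, an embedding layer is a trainable lookup table with one row per vocabulary entry. The NumPy sketch below uses a random, untrained table, and the variable names are assumptions. It shows that looking up an id gives the same result as multiplying the id's one-hot vector by the table, without ever building the sparse vector.

```python
import numpy as np

vocab_size, embed_dim = 100, 3
rng = np.random.default_rng(0)
table = rng.normal(size=(vocab_size, embed_dim))   # trained by backpropagation in a real layer

ids = np.array([4])
looked_up = table[ids]                 # direct row lookup, shape (1, 3)

one_hot = np.eye(vocab_size)[ids]      # shape (1, 100), almost entirely zeros
via_matmul = one_hot @ table           # same rows, obtained the expensive way

print(looked_up.shape, np.allclose(looked_up, via_matmul))   # (1, 3) True
```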

Input Dimension Requirements of the Embedding Layer

When feeding an Embedding layer, x_train is two-dimensional: [number of samples, number of unrolled time steps]
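
As a sketch of what this two-dimensional input looks like for the four-letter task, the snippet below turns letter sequences into integer ids and then applies a stand-in lookup table. The table is random and its 5 x 2 size is chosen only for illustration. After the lookup the data is three-dimensional, which is the shape a recurrent layer expects.

```python
import numpy as np

w_to_id = {ch: i for i, ch in enumerate("abcde")}
sequences = ["abcd", "bcde", "cdea", "deab", "eabc"]

# Integer ids, shape [number of samples, number of time steps]
x_ids = np.array([[w_to_id[ch] for ch in seq] for seq in sequences])
print(x_ids.shape)        # (5, 4)

# Stand-in for an embedding matrix: 5 vocabulary entries, 2 numbers each
table = np.random.default_rng(0).normal(size=(5, 2))
x_embedded = table[x_ids]
print(x_embedded.shape)   # (5, 4, 2)
```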

Switching the letter-prediction code above from one-hot encoding to Embedding:

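from tensorflow.keras.layers import Dense, Embedding, SimpleRNN

# Reshape x_train to match the Embedding input requirement: [number of samples, number of unrolled time steps].
# The whole dataset is fed in at once, so the number of samples is len(x_train); 4 letters go in per
# prediction, so there are 4 time steps. x_train must hold integer letter ids here, not one-hot vectors.
x_train = np.reshape(x_train, (len(x_train), 4))
y_train = np.array(y_train)

model = tf.keras.Sequential([
    Embedding(26, 2),  # 26-entry vocabulary, 2 numbers per letter
    SimpleRNN(10),
    Dense(26, activation='softmax')
])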
