Note: this article comes from a 2020-2021 WordPress backup. The original image attachments were not saved with the XML export, and the old server's image URLs are dead, so the body text is kept here with the broken images removed.
Convolutional Neural Network Concepts
Convolution
• Convolution can be seen as an effective way to extract image features.
• Typically a square convolution kernel slides over the input feature map with a given stride, visiting every pixel of the input feature map. At each step, the kernel overlaps a region of the input; the overlapping elements are multiplied element-wise, summed, and a bias term is added, producing one pixel of the output feature map.
- The depth (number of channels) of the input feature map determines the depth of the current layer's convolution kernels.
- The number of kernels in the current layer determines the depth of the current layer's output feature map.
Receptive Field
The receptive field is the size of the region on the original input image that each pixel of an output feature map maps back to.
As in the (removed) figure, a 5×5-pixel image passed through two stacked layers of 3×3 kernels, or through a single layer with a 5×5 kernel, both yield an output feature map with a receptive field of 5. Trainable parameters and computation, for an input feature map of width and height x and stride 1:
- Two 3×3 layers: parameters 9 + 9 = 18; multiplications 9(x−2)² + 9(x−4)² = 18x² − 108x + 180
- One 5×5 layer: parameters 25; multiplications 25(x−4)² = 25x² − 200x + 400
When x > 10, two layers of 3×3 kernels beat a single 5×5 kernel.
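To sanity-check these formulas, a small sketch that counts multiplications only (stride 1, no padding, single channel, biases ignored):

def two_3x3(x):
    # two stacked 3x3 convs: 9 multiplies per output pixel per layer
    return 9 * (x - 2) ** 2 + 9 * (x - 4) ** 2   # = 18x^2 - 108x + 180

def one_5x5(x):
    # a single 5x5 conv: 25 multiplies per output pixel
    return 25 * (x - 4) ** 2                     # = 25x^2 - 200x + 400

for x in (8, 10, 12, 32):
    print(x, two_3x3(x), one_5x5(x))
# at x = 10 the two are equal (900 each); for x > 10 the stacked 3x3 kernels win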
Zero Padding
To keep the input feature map's size unchanged through a convolution, use zero padding: the input feature map is padded with zeros around its border, and the output side length equals the input side length divided by the stride, rounded up. Without zero padding, the output side length equals (input side length − kernel size + 1) divided by the stride, rounded up.
Zero padding in TF is selected with the parameter padding='same' or padding='valid'.
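A quick check of the two padding modes on a dummy 5×5 input (shapes only):

import tensorflow as tf

x = tf.random.normal((1, 5, 5, 1))  # batch of one 5x5 single-channel image
same = tf.keras.layers.Conv2D(1, 3, strides=1, padding='same')(x)
valid = tf.keras.layers.Conv2D(1, 3, strides=1, padding='valid')(x)
print(same.shape)   # (1, 5, 5, 1): ceil(5 / 1) = 5
print(valid.shape)  # (1, 3, 3, 1): ceil((5 - 3 + 1) / 1) = 3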
Convolutional Layers in TF
TF's convolution layer:
tf.keras.layers.Conv2D(
    filters=number_of_kernels,
    kernel_size=kernel_size,   # an integer for a square kernel, or (kernel height h, kernel width w)
    strides=stride,            # an integer if both directions match, or (vertical stride h, horizontal stride w); default 1
    padding="same" or "valid", # "same" for zero padding, "valid" for none (default)
    activation="relu" or "sigmoid" or "tanh" or "softmax" etc.,  # omit if a BN (Batch Normalization) layer follows
    input_shape=(height, width, channels)  # input feature map dimensions; may be omitted
)
For example (three equivalent styles of writing the arguments):
model = tf.keras.models.Sequential([
    Conv2D(6, 5, padding='valid', activation='sigmoid'),        # positional integers
    MaxPool2D(2, 2),
    Conv2D(6, (5, 5), padding='valid', activation='sigmoid'),   # tuple kernel size
    MaxPool2D(2, (2, 2)),
    Conv2D(filters=6, kernel_size=(5, 5), padding='valid', activation='sigmoid'),  # keyword arguments
    MaxPool2D(pool_size=(2, 2), strides=2),
    Flatten(),
    Dense(10, activation='softmax')
])
Batch Normalization (BN)
As the number of layers grows, feature data tends to drift away from a mean of 0; standardization brings the data back to a normal distribution with mean 0 and standard deviation 1.
Standardization: make the data follow a distribution with mean 0 and standard deviation 1.
Batch normalization: apply standardization to a small batch of data.
After batch normalization, the i-th pixel of the output feature map of the k-th kernel is
$H_i^{\prime k} = \frac{H_i^k - \mu_{\mathrm{batch}}^k}{\sigma_{\mathrm{batch}}^k}$
$H_i^k$: before batch normalization, the i-th pixel of the k-th kernel's output feature map
$\mu_{\mathrm{batch}}^k$: before batch normalization, the mean over all pixels of the k-th kernel's output feature maps across the batch
$\sigma_{\mathrm{batch}}^k$: before batch normalization, the standard deviation over all pixels of the k-th kernel's output feature maps across the batch
Before and after batch normalization (illustration removed):
BN pulls drifting feature data back to zero mean, so the inputs to the activation function fall in its roughly linear region; small changes in the input then show up more clearly in the activation's output, improving the activation function's ability to discriminate between inputs. However, this plain standardization forces the features into a standard normal distribution concentrated in the activation's central linear region, which strips the activation of its nonlinearity. BN therefore introduces two trainable parameters per kernel, a scale factor $\gamma$ and an offset factor $\beta$:
$X_i^k = \gamma_k H_i^{\prime k} + \beta_k$
During backpropagation these two factors are optimized together with the other trainable parameters; they adjust the width and offset of the standardized feature distribution and preserve the network's nonlinear expressive power.
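A minimal NumPy sketch of the forward computation above, with per-channel statistics over a batch (the small eps is an assumption added for numerical stability, as real BN implementations do):

import numpy as np

def batch_norm(H, gamma, beta, eps=1e-5):
    # H: (batch, height, width, channels); statistics are taken per channel k
    mu = H.mean(axis=(0, 1, 2), keepdims=True)    # mu_batch^k
    sigma = H.std(axis=(0, 1, 2), keepdims=True)  # sigma_batch^k
    H_norm = (H - mu) / (sigma + eps)             # H'_i^k
    return gamma * H_norm + beta                  # X_i^k = gamma_k * H'_i^k + beta_k

H = np.random.randn(32, 28, 28, 6) * 3 + 5        # drifted batch of 6-channel feature maps
X = batch_norm(H, gamma=np.ones(6), beta=np.zeros(6))
print(X.mean(), X.std())  # approximately 0 and 1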
The BN operation in TF: tf.keras.layers.BatchNormalization(). For example:
model = tf.keras.models.Sequential([
    Conv2D(filters=6, kernel_size=(5, 5), padding='same'),   # convolutional layer
    BatchNormalization(),                                    # BN layer
    Activation('relu'),                                      # activation layer
    MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # pooling layer
    Dropout(0.2),                                            # dropout layer
])
Pooling
Pooling reduces the amount of feature data. There are two kinds: max pooling and average pooling. Max pooling extracts image texture; average pooling preserves background features. The pooling process, worked by hand below (the original illustration is lost):
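A hand-rolled NumPy sketch of both pooling types on a 4×4 feature map with a 2×2 window and stride 2 (the example values are made up):

import numpy as np

fmap = np.array([[1, 2, 5, 6],
                 [3, 4, 7, 8],
                 [9, 10, 13, 14],
                 [11, 12, 15, 16]])

def pool(x, fn, k=2, s=2):
    # slide a k x k window with stride s and reduce each window with fn
    h, w = x.shape
    return np.array([[fn(x[i:i + k, j:j + k]) for j in range(0, w - k + 1, s)]
                     for i in range(0, h - k + 1, s)])

print(pool(fmap, np.max))   # [[ 4  8] [12 16]]          - keeps the strongest response (texture)
print(pool(fmap, np.mean))  # [[ 2.5  6.5] [10.5 14.5]]  - smooths, keeps background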
TF pooling layers:
- Max pooling:
tf.keras.layers.MaxPool2D(
    pool_size=window_size,   # an integer for a square window, or (window height h, window width w)
    strides=pooling_stride,  # an integer, or (vertical stride h, horizontal stride w); defaults to pool_size
    padding='valid' or 'same'  # 'same' for zero padding, 'valid' for none (default)
)
- Average pooling:
tf.keras.layers.AveragePooling2D(
    pool_size=window_size,   # an integer for a square window, or (window height h, window width w)
    strides=pooling_stride,  # an integer, or (vertical stride h, horizontal stride w); defaults to pool_size
    padding='valid' or 'same'  # 'same' for zero padding, 'valid' for none (default)
)
Example:
model = tf.keras.models.Sequential([
    Conv2D(filters=6, kernel_size=(5, 5), padding='same'),   # convolutional layer
    BatchNormalization(),                                    # BN layer
    Activation('relu'),                                      # activation layer
    MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # pooling layer
    Dropout(0.2),                                            # dropout layer
])
Dropout
To mitigate overfitting, during training a fraction of neurons is temporarily dropped from the network with a given probability. When the network is used for inference, the dropped neurons are reconnected. TF dropout function: tf.keras.layers.Dropout(drop_rate). See the code above for an example: Dropout(0.2) randomly drops 20% of the neurons.
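A quick demo that Dropout is only active during training and is an identity map at inference time:

import tensorflow as tf

layer = tf.keras.layers.Dropout(0.2)
x = tf.ones((1, 10))
print(layer(x, training=True))   # roughly 20% of entries zeroed, the rest scaled by 1/0.8
print(layer(x, training=False))  # unchanged: the dropped neurons are reconnected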
Convolutional Neural Networks
Essence
A convolutional neural network extracts features with convolution kernels, then feeds them into a fully connected network.
- The main modules of a CNN: convolution, batch normalization, activation, pooling. These four modules extract features from the input.
- What is convolution?
Convolution is a feature extractor: CBAPD (Convolution, BN, Activation, Pooling, Dropout)!
model = tf.keras.models.Sequential([
    Conv2D(filters=6, kernel_size=(5, 5), padding='same'),   # convolutional layer  C
    BatchNormalization(),                                    # BN layer             B
    Activation('relu'),                                      # activation layer     A
    MaxPool2D(pool_size=(2, 2), strides=2, padding='same'),  # pooling layer        P
    Dropout(0.2),                                            # dropout layer        D
])
The Cifar10 dataset:
It provides 50,000 32×32-pixel ten-class color images with labels for training, and 10,000 32×32-pixel ten-class color images with labels for testing. Each image carries 32×32 pixels of RGB three-channel data. Importing the cifar10 dataset:
cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
Inspecting the dataset's shapes and viewing images works the same way as with mnist earlier; the shapes are shown below.
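For reference, the shapes tf.keras reports for this dataset:

import tensorflow as tf

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print(x_train.shape)  # (50000, 32, 32, 3)
print(y_train.shape)  # (50000, 1)
print(x_test.shape)   # (10000, 32, 32, 3)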
Building the Network
To train on cifar10 with a CNN, build a network with one convolutional layer and two fully connected layers: 6 convolution kernels of 5×5, a 2×2 pooling window, then a 128-neuron fully connected layer; since cifar10 has ten classes, a final ten-neuron fully connected layer follows. In outline:
- C (kernels: 6 of 5×5, stride: 1, padding: same)
- B (yes)
- A (relu)
- P (max, window: 2×2, stride: 2, padding: same)
- D (0.2)
- Flatten
- Dense (neurons: 128, activation: relu, Dropout: 0.2)
- Dense (neurons: 10, activation: softmax)
Baseline code for training a CNN on the Cifar10 dataset:
import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model

np.set_printoptions(threshold=np.inf)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# The network is relatively complex, so it is built by subclassing Model
class Baseline(Model):
    def __init__(self):
        super(Baseline, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='same')  # convolutional layer
        self.b1 = BatchNormalization()  # BN layer
        self.a1 = Activation('relu')  # activation layer
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')  # pooling layer
        self.d1 = Dropout(0.2)  # dropout layer
        self.flatten = Flatten()
        self.f1 = Dense(128, activation='relu')
        self.d2 = Dropout(0.2)
        self.f2 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.d1(x)
        x = self.flatten(x)
        x = self.f1(x)
        x = self.d2(x)
        y = self.f2(x)
        return y

model = Baseline()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/Baseline.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

print(model.trainable_variables)
file = open('./weights.txt', 'w')
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

############################################### show ###############################################
# Plot the acc and loss curves for the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()
In the classic CNN architectures introduced below, only the class Baseline() part is replaced; the rest of the code stays the same.
Classic Convolutional Neural Networks
LeNet
Network structure: two convolutional layers, three fully connected layers.
Convolutional part, input 32×32×3:
C (kernels: 6 of 5×5, stride: 1, padding: valid)
B (none)
A (sigmoid)
P (max, window: 2×2, stride: 2, padding: valid)
D (none)
C (kernels: 16 of 5×5, stride: 1, padding: valid)
B (none)
A (sigmoid)
P (max, window: 2×2, stride: 2, padding: valid)
D (none)
Fully connected part:
Flatten
Dense (neurons: 120, activation: sigmoid)
Dense (neurons: 84, activation: sigmoid)
Dense (neurons: 10, activation: softmax)
Network structure code:
class LeNet5(Model):
    def __init__(self):
        super(LeNet5, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5),
                         activation='sigmoid')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2)
        self.c2 = Conv2D(filters=16, kernel_size=(5, 5),
                         activation='sigmoid')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2)
        self.flatten = Flatten()
        self.f1 = Dense(120, activation='sigmoid')
        self.f2 = Dense(84, activation='sigmoid')
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        # forward pass: the layers are applied in the order defined above
        x = self.c1(x)
        x = self.p1(x)
        x = self.c2(x)
        x = self.p2(x)
        x = self.flatten(x)
        x = self.f1(x)
        x = self.f2(x)
        y = self.f3(x)
        return y
AlexNet
Eight layers: five convolutional layers and three fully connected layers.
class AlexNet8(Model):
    def __init__(self):
        super(AlexNet8, self).__init__()
        self.c1 = Conv2D(filters=96, kernel_size=(3, 3))
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(3, 3), strides=2)
        self.c2 = Conv2D(filters=256, kernel_size=(3, 3))
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(3, 3), strides=2)
        self.c3 = Conv2D(filters=384, kernel_size=(3, 3), padding='same',
                         activation='relu')
        self.c4 = Conv2D(filters=384, kernel_size=(3, 3), padding='same',
                         activation='relu')
        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same',
                         activation='relu')
        self.p3 = MaxPool2D(pool_size=(3, 3), strides=2)
        self.flatten = Flatten()
        self.f1 = Dense(2048, activation='relu')
        self.d1 = Dropout(0.5)
        self.f2 = Dense(2048, activation='relu')
        self.d2 = Dropout(0.5)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        # forward pass through the five conv stages and three dense layers
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.c2(x)
        x = self.b2(x)
        x = self.a2(x)
        x = self.p2(x)
        x = self.c3(x)
        x = self.c4(x)
        x = self.c5(x)
        x = self.p3(x)
        x = self.flatten(x)
        x = self.f1(x)
        x = self.d1(x)
        x = self.f2(x)
        x = self.d2(x)
        y = self.f3(x)
        return y
VGGNet
VGGNet uses small convolution kernels, which reduces parameters while improving recognition accuracy.
class VGG16(Model):
    def __init__(self):
        super(VGG16, self).__init__()
        # block 1: two 64-filter convolutions
        self.c1 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.c2 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d1 = Dropout(0.2)
        # block 2: two 128-filter convolutions
        self.c3 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b3 = BatchNormalization()
        self.a3 = Activation('relu')
        self.c4 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b4 = BatchNormalization()
        self.a4 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d2 = Dropout(0.2)
        # block 3: three 256-filter convolutions
        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b5 = BatchNormalization()
        self.a5 = Activation('relu')
        self.c6 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b6 = BatchNormalization()
        self.a6 = Activation('relu')
        self.c7 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b7 = BatchNormalization()
        self.a7 = Activation('relu')
        self.p3 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d3 = Dropout(0.2)
        # block 4: three 512-filter convolutions
        self.c8 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b8 = BatchNormalization()
        self.a8 = Activation('relu')
        self.c9 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b9 = BatchNormalization()
        self.a9 = Activation('relu')
        self.c10 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b10 = BatchNormalization()
        self.a10 = Activation('relu')
        self.p4 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d4 = Dropout(0.2)
        # block 5: three 512-filter convolutions
        self.c11 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b11 = BatchNormalization()
        self.a11 = Activation('relu')
        self.c12 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b12 = BatchNormalization()
        self.a12 = Activation('relu')
        self.c13 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b13 = BatchNormalization()
        self.a13 = Activation('relu')
        self.p5 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d5 = Dropout(0.2)
        # classifier head
        self.flatten = Flatten()
        self.f1 = Dense(512, activation='relu')
        self.d6 = Dropout(0.2)
        self.f2 = Dense(512, activation='relu')
        self.d7 = Dropout(0.2)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        # forward pass: each block is conv-BN-relu (x2 or x3), then pool and dropout
        x = self.c1(x); x = self.b1(x); x = self.a1(x)
        x = self.c2(x); x = self.b2(x); x = self.a2(x)
        x = self.p1(x); x = self.d1(x)
        x = self.c3(x); x = self.b3(x); x = self.a3(x)
        x = self.c4(x); x = self.b4(x); x = self.a4(x)
        x = self.p2(x); x = self.d2(x)
        x = self.c5(x); x = self.b5(x); x = self.a5(x)
        x = self.c6(x); x = self.b6(x); x = self.a6(x)
        x = self.c7(x); x = self.b7(x); x = self.a7(x)
        x = self.p3(x); x = self.d3(x)
        x = self.c8(x); x = self.b8(x); x = self.a8(x)
        x = self.c9(x); x = self.b9(x); x = self.a9(x)
        x = self.c10(x); x = self.b10(x); x = self.a10(x)
        x = self.p4(x); x = self.d4(x)
        x = self.c11(x); x = self.b11(x); x = self.a11(x)
        x = self.c12(x); x = self.b12(x); x = self.a12(x)
        x = self.c13(x); x = self.b13(x); x = self.a13(x)
        x = self.p5(x); x = self.d5(x)
        x = self.flatten(x)
        x = self.f1(x); x = self.d6(x)
        x = self.f2(x); x = self.d7(x)
        y = self.f3(x)
        return y
Inception Net
InceptionNet introduced the Inception block, which applies convolution kernels of different sizes within the same layer, improving the model's ability to perceive features at different scales; it also uses batch normalization, which mitigates vanishing gradients.
A concatenation step stacks the four branches along the depth (channel) dimension to form the output of the Inception block, as in the sketch below.
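Since the figure showing the four branches is gone, here is a minimal sketch of one Inception block, assuming the classic four branches (1×1 conv; 1×1 then 3×3; 1×1 then 5×5; 3×3 max-pool then 1×1). The names ConvBNRelu and InceptionBlk and the shared channel count ch are illustrative choices, not fixed by the text:

import tensorflow as tf
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D
from tensorflow.keras import Model

class ConvBNRelu(Model):
    # one C-B-A unit, reused by every branch
    def __init__(self, ch, kernelsz=3, strides=1, padding='same'):
        super(ConvBNRelu, self).__init__()
        self.model = tf.keras.models.Sequential([
            Conv2D(ch, kernelsz, strides=strides, padding=padding),
            BatchNormalization(),
            Activation('relu')
        ])

    def call(self, x):
        return self.model(x)

class InceptionBlk(Model):
    def __init__(self, ch, strides=1):
        super(InceptionBlk, self).__init__()
        self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)    # branch 1: 1x1
        self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)  # branch 2: 1x1 -> 3x3
        self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)
        self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)  # branch 3: 1x1 -> 5x5
        self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)
        self.p4_1 = MaxPool2D(3, strides=1, padding='same')      # branch 4: pool -> 1x1
        self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)

    def call(self, x):
        x1 = self.c1(x)
        x2 = self.c2_2(self.c2_1(x))
        x3 = self.c3_2(self.c3_1(x))
        x4 = self.c4_2(self.p4_1(x))
        # stack the four branches along the depth (channel) dimension
        return tf.concat([x1, x2, x3, x4], axis=3)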
ResNet
ResNet introduced residual skip connections between layers, which feed earlier information forward, alleviating vanishing gradients and making much deeper networks feasible.
As in the (removed) figure, an earlier feature map is wired directly to a later layer, so the output H(x) combines the nonlinear output F(x) of the stacked convolutions with the identity mapping x that skips over them; the two are added element-wise. This effectively mitigates the degradation caused by stacking more layers and lets networks grow deeper.
- The ResNet block:
There are two cases. In one, drawn with a solid line, the two stacked convolutions do not change the feature map's dimensions, so F(x) and x can be added directly.
In the other, drawn with a dashed line, the two stacked convolutions change the feature map's dimensions, so a 1×1 convolution is needed to adjust the dimensions of x.
- Network structure:
The first layer of ResNet18 is a convolution, followed by 8 ResNet blocks and a final fully connected layer. Each ResNet block contains two convolutional layers, giving 18 layers in total.
Network structure code:
class ResnetBlock(Model):
    def __init__(self, filters, strides=1, residual_path=False):
        super(ResnetBlock, self).__init__()
        self.filters = filters
        self.strides = strides
        self.residual_path = residual_path
        self.c1 = Conv2D(filters, (3, 3), strides=strides, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.c2 = Conv2D(filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b2 = BatchNormalization()
        # when residual_path is True, downsample the input with a 1x1 convolution
        # so that x matches the dimensions of F(x) and the two can be added
        if residual_path:
            self.down_c1 = Conv2D(filters, (1, 1), strides=strides, padding='same', use_bias=False)
            self.down_b1 = BatchNormalization()
        self.a2 = Activation('relu')

    def call(self, inputs):
        residual = inputs  # the residual equals the input itself, i.e. residual = x
        # pass the input through convolution, BN and activation to compute F(x)
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.c2(x)
        y = self.b2(x)
        if self.residual_path:
            residual = self.down_c1(inputs)
            residual = self.down_b1(residual)
        out = self.a2(y + residual)  # the output is the sum of the two parts, F(x)+x or F(x)+Wx, passed through the activation
        return out

class ResNet18(Model):
    def __init__(self, block_list, initial_filters=64):  # block_list gives the number of residual units per block
        super(ResNet18, self).__init__()
        self.num_blocks = len(block_list)  # total number of blocks
        self.block_list = block_list
        self.out_filters = initial_filters
        self.c1 = Conv2D(self.out_filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.blocks = tf.keras.models.Sequential()
        # build the ResNet structure
        for block_id in range(len(block_list)):  # which resnet block
            for layer_id in range(block_list[block_id]):  # which residual unit
                if block_id != 0 and layer_id == 0:  # downsample the input of every block except the first
                    block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
                else:
                    block = ResnetBlock(self.out_filters, residual_path=False)
                self.blocks.add(block)  # add the constructed block to the resnet
            self.out_filters *= 2  # the next block uses twice as many kernels as the previous one
        self.p1 = tf.keras.layers.GlobalAveragePooling2D()  # global average pooling
        self.f1 = tf.keras.layers.Dense(10, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2())

    def call(self, inputs):
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y
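Consistent with the 18-layer count described above (1 initial convolution + 8 blocks × 2 convolutions + 1 dense layer), the class would be instantiated with two residual units in each of the four stages before being handed to the unchanged training code:

model = ResNet18([2, 2, 2, 2])  # 4 stages of 2 blocks, 2 conv layers each = 16 conv layers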