MNIST
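60,000 training images (often split into 50,000 for training and 10,000 for validation)
10,000 test images
Image size: 28×28
10 classes (digits 0 to 9)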
LeNet
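LeNet (LeNet-5) consists of two parts:
Convolutional encoder: two convolutional layers
Dense block: three fully connected layers
The basic unit of each convolutional block is a convolutional layer, a sigmoid activation function, and an average pooling layer.
Each convolutional layer uses a 5×5 kernel, followed by the sigmoid activation.
Each 2×2 pooling operation (stride 2) halves the height and the width, so spatial downsampling cuts the number of values per channel by a factor of 4.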
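The size arithmetic behind these layers can be checked by hand. The sketch below is only an illustration: `conv_out` is a hypothetical helper (not part of PyTorch or d2l) that applies the usual output-size formula, and the padding of 2 on the first convolution and the 16 output channels are taken from the implementation later in these notes.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Output length along one spatial axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# Follow one spatial axis of a 28x28 input through the convolutional encoder.
after_conv1 = conv_out(28, kernel=5, padding=2)          # 5x5 conv, padding 2
after_pool1 = conv_out(after_conv1, kernel=2, stride=2)  # 2x2 average pooling
after_conv2 = conv_out(after_pool1, kernel=5)            # 5x5 conv, no padding
after_pool2 = conv_out(after_conv2, kernel=2, stride=2)  # 2x2 average pooling

# Values per channel before and after the first pooling layer.
pool_reduction = (after_conv1 ** 2) / (after_pool1 ** 2)

# Input width of the first fully connected layer (16 channels assumed).
flattened = 16 * after_pool2 * after_pool2

print(after_conv1, after_pool1, after_conv2, after_pool2, flattened)
```

Under these assumptions the sizes run 28, 14, 10, 5, and the flattened feature vector has 16 × 5 × 5 = 400 entries.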
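Summary
LeNet is one of the earliest successful convolutional neural networks
Convolutional layers come first, to learn the spatial information in the image
To build a high-performance convolutional neural network, the convolutional layers are usually arranged so that the spatial resolution of their representations gradually decreases while the number of channels increases
Fully connected layers then map the features to the class space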
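To make the trade-off between resolution and channels concrete, here is a rough sketch that counts the learnable parameters per layer. It assumes the layer sizes used in the implementation in these notes (6 and 16 channels, 5×5 kernels, dense layers of 120, 84 and 10 units); `conv_params` and `linear_params` are hypothetical helper names.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution."""
    return c_in * c_out * k * k + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# (channels, height, width) after each stage: resolution falls, channels rise.
stages = [("input", 1, 28, 28), ("block 1", 6, 14, 14), ("block 2", 16, 5, 5)]

layer_params = {
    "conv1": conv_params(1, 6, 5),
    "conv2": conv_params(6, 16, 5),
    "fc1": linear_params(16 * 5 * 5, 120),
    "fc2": linear_params(120, 84),
    "fc3": linear_params(84, 10),
}
total_params = sum(layer_params.values())
dense_share = sum(layer_params[name] for name in ("fc1", "fc2", "fc3")) / total_params

for name, c, h, w in stages:
    print(f"{name}: {c} x {h} x {w}")
print(layer_params, total_params, round(dense_share, 3))
```

With these sizes most of the parameters sit in the fully connected layers, even though the convolutional layers do the spatial feature extraction.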
Code implementation
LeNet (LeNet-5) consists of two parts: a convolutional encoder and a dense block of fully connected layers.
    import torch
    from torch import nn
    from d2l import torch as d2l

    class Reshape(torch.nn.Module):
        def forward(self, x):
            # reshape any input into a batch of 1-channel 28x28 images
            return x.view(-1, 1, 28, 28)

    net = torch.nn.Sequential(
        Reshape(),
        nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
        nn.AvgPool2d(kernel_size=2, stride=2),
        nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
        nn.AvgPool2d(kernel_size=2, stride=2),
        nn.Flatten(),
        nn.Linear(16 * 5 * 5, 120), nn.Sigmoid(),
        nn.Linear(120, 84), nn.Sigmoid(),
        nn.Linear(84, 10))
Checking the model
    # pass a random 28x28 image through the network and print each layer's output shape
    X = torch.rand(size=(1, 1, 28, 28), dtype=torch.float32)
    for layer in net:
        X = layer(X)
        print(layer.__class__.__name__, 'output shape: \t', X.shape)

    Reshape output shape:    torch.Size([1, 1, 28, 28])
    Conv2d output shape:     torch.Size([1, 6, 28, 28])
    Sigmoid output shape:    torch.Size([1, 6, 28, 28])
    AvgPool2d output shape:  torch.Size([1, 6, 14, 14])
    Conv2d output shape:     torch.Size([1, 16, 10, 10])
    Sigmoid output shape:    torch.Size([1, 16, 10, 10])
    AvgPool2d output shape:  torch.Size([1, 16, 5, 5])
    Flatten output shape:    torch.Size([1, 400])
    Linear output shape:     torch.Size([1, 120])
    Sigmoid output shape:    torch.Size([1, 120])
    Linear output shape:     torch.Size([1, 84])
    Sigmoid output shape:    torch.Size([1, 84])
    Linear output shape:     torch.Size([1, 10])
LeNet's performance on the Fashion-MNIST dataset
    batch_size = 256
    train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size=batch_size)
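As a quick sanity check on what this batch size means, the sketch below works out the number of mini-batches per epoch. It relies on Fashion-MNIST having 60,000 training and 10,000 test images, and it assumes the data loader keeps the final partial batch (the PyTorch `DataLoader` default); that second point is an assumption about how `d2l.load_data_fashion_mnist` is configured.

```python
import math

train_size, test_size = 60_000, 10_000  # Fashion-MNIST split sizes
batch_size = 256

# Assumption: the last, smaller batch is kept rather than dropped.
train_batches = math.ceil(train_size / batch_size)
test_batches = math.ceil(test_size / batch_size)
last_train_batch = train_size - (train_batches - 1) * batch_size

print(train_batches, test_batches, last_train_batch)
```

Because the last batch is smaller than the others, per-batch metrics should be weighted by batch size when they are averaged over an epoch.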
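The evaluate_accuracy function needs a slight modification so that it moves the data to the same device as the model:
    def evaluate_accuracy_gpu(net, data_iter, device=None):
        """Compute the accuracy of a model on a dataset, using a GPU if available."""
        if isinstance(net, torch.nn.Module):
            net.eval()  # switch to evaluation mode
            if not device:
                # default to the device that holds the model's parameters
                device = next(iter(net.parameters())).device
        # running sums: number of correct predictions, number of predictions
        metric = d2l.Accumulator(2)
        with torch.no_grad():  # no gradients are needed for evaluation
            for X, y in data_iter:
                if isinstance(X, list):
                    X = [x.to(device) for x in X]
                else:
                    X = X.to(device)
                y = y.to(device)
                metric.add(d2l.accuracy(net(X), y), y.numel())
        return metric[0] / metric[1]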
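The bookkeeping in this function can be illustrated without a GPU or the d2l package. In the sketch below, `Accumulator` is a simplified stand-in for `d2l.Accumulator` and `count_correct` is a hypothetical plain-Python substitute for `d2l.accuracy`; the scores and labels are made-up toy values.

```python
class Accumulator:
    """Simplified stand-in for d2l.Accumulator: keeps n running sums."""

    def __init__(self, n):
        self.data = [0.0] * n

    def add(self, *args):
        self.data = [a + float(b) for a, b in zip(self.data, args)]

    def __getitem__(self, idx):
        return self.data[idx]


def count_correct(scores, labels):
    """Number of rows whose highest-scoring class equals the label."""
    predictions = [max(range(len(row)), key=row.__getitem__) for row in scores]
    return sum(int(p == y) for p, y in zip(predictions, labels))


# Two toy batches of unequal size: (class scores, true labels).
batches = [
    ([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]], [1, 2]),
    ([[0.2, 0.2, 0.6], [0.3, 0.4, 0.3], [0.9, 0.05, 0.05]], [2, 1, 0]),
]

metric = Accumulator(2)  # correct predictions, total predictions
per_batch = []
for scores, labels in batches:
    correct = count_correct(scores, labels)
    metric.add(correct, len(labels))
    per_batch.append(correct / len(labels))

accuracy = metric[0] / metric[1]
mean_of_batches = sum(per_batch) / len(per_batch)
print(accuracy, mean_of_batches)
```

Summing counts and dividing once gives the accuracy over all examples; averaging the per-batch accuracies would give a different number whenever the batches differ in size, as they do here.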
To train on a GPU, the training function needs a small change as well: the model and every mini-batch are moved to the device before the forward pass.
    def train_ch6(net, train_iter, test_iter, num_epochs, lr, device):
        """Train a model on a GPU."""
        def init_weights(m):
            if isinstance(m, (nn.Linear, nn.Conv2d)):
                nn.init.xavier_uniform_(m.weight)
        net.apply(init_weights)
        print('training on', device)
        net.to(device)
        optimizer = torch.optim.SGD(net.parameters(), lr=lr)
        loss = nn.CrossEntropyLoss()
        animator = d2l.Animator(xlabel='epoch', xlim=[1, num_epochs],
                                legend=['train loss', 'train acc', 'test acc'])
        timer, num_batches = d2l.Timer(), len(train_iter)
        for epoch in range(num_epochs):
            # sum of training loss, number of correct predictions, number of examples
            metric = d2l.Accumulator(3)
            net.train()
            for i, (X, y) in enumerate(train_iter):
                timer.start()
                optimizer.zero_grad()
                X, y = X.to(device), y.to(device)
                y_hat = net(X)
                l = loss(y_hat, y)
                l.backward()
                optimizer.step()
                with torch.no_grad():
                    metric.add(l * X.shape[0], d2l.accuracy(y_hat, y), X.shape[0])
                timer.stop()
                train_l = metric[0] / metric[2]
                train_acc = metric[1] / metric[2]
                # update the plot five times per epoch and at the last batch
                if (i + 1) % (num_batches // 5) == 0 or i == num_batches - 1:
                    animator.add(epoch + (i + 1) / num_batches,
                                 (train_l, train_acc, None))
            test_acc = evaluate_accuracy_gpu(net, test_iter)
            animator.add(epoch + 1, (None, None, test_acc))
        print(f'loss {train_l:.3f}, train acc {train_acc:.3f}, '
              f'test acc {test_acc:.3f}')
        print(f'{metric[2] * num_epochs / timer.sum():.1f} examples/sec '
              f'on {str(device)}')
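The training function initializes every linear and convolutional layer with `nn.init.xavier_uniform_`, which draws weights uniformly from (-a, a) with a = gain × sqrt(6 / (fan_in + fan_out)). The sketch below only reproduces that bound in plain Python for the LeNet layer sizes used in these notes; `xavier_uniform_bound` and `conv_fans` are hypothetical helper names, and the default gain of 1 is assumed.

```python
import math


def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Half-width a of the uniform range (-a, a) used by Xavier initialization."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def conv_fans(c_in, c_out, k):
    """Fan-in and fan-out of a k x k convolution (channels times kernel area)."""
    return c_in * k * k, c_out * k * k


bounds = {
    "conv1": xavier_uniform_bound(*conv_fans(1, 6, 5)),
    "conv2": xavier_uniform_bound(*conv_fans(6, 16, 5)),
    "fc1": xavier_uniform_bound(16 * 5 * 5, 120),
    "fc2": xavier_uniform_bound(120, 84),
    "fc3": xavier_uniform_bound(84, 10),
}

for name, bound in bounds.items():
    print(f"{name}: weights drawn from (-{bound:.4f}, {bound:.4f})")
```

Layers with more incoming and outgoing connections get a narrower range, which keeps the scale of activations and gradients roughly comparable from layer to layer at the start of training.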
Training and evaluating the LeNet-5 model
    lr, num_epochs = 0.9, 10
    train_ch6(net, train_iter, test_iter, num_epochs, lr, d2l.try_gpu())

    loss 0.460, train acc 0.828, test acc 0.828
    67659.4 examples/sec on cuda:0