Welcome to my GitHub
https://github.com/zq2599/blog_demos
Content: categorized summaries of all my original articles, with companion source code, covering Java, Docker, Kubernetes, DevOps, and more;
Overview of this article
- As the third chapter of the "DL4J in Practice" series, this article's goal is to build the classic LeNet-5 convolutional neural network under the DL4J framework, then train and test it on the MNIST dataset. It consists of the following parts:
- Introduction to LeNet-5
- Introduction to MNIST
- Introduction to the dataset
- About version and environment
- Coding
- Verification
Introduction to LeNet-5
- LeNet-5 is a convolutional neural network designed by Yann LeCun in 1998 for handwritten digit recognition; for example, many banks in the United States used it to read the handwritten digits on checks. It is one of the most representative early experimental convolutional neural network systems
- The LeNet-5 network structure is shown in the figure below, with seven layers in total: C1 -> S2 -> C3 -> S4 -> C5 -> F6 -> OUTPUT
- The following figure (original source: https://cuijiahua.com/blog/2018/01/dl_3.html) is clearer and can guide our coding on DL4J:
- A brief analysis of the figure above, which will guide the development that follows (the size arithmetic is also worked through in the sketch after this list):
- Each image is 28*28 with a single channel, so the input matrix is [1, 28, 28]
- C1 is a convolutional layer with a 5*5 kernel, stride 1, and 20 kernels, so the size becomes 28-5+1=24 (imagine sliding a window of width 5 across a width of 28 and counting how many positions it can take); the output matrix is [20, 24, 24]
- S2 is a pooling layer with a 2*2 kernel, stride 2, and MAX pooling; pooling halves the size to [20, 12, 12]
- C3 is a convolutional layer with a 5*5 kernel, stride 1, and 50 kernels, so the size becomes 12-5+1=8; the output matrix is [50, 8, 8]
- S4 is a pooling layer with a 2*2 kernel, stride 2, and MAX pooling; pooling halves the size to [50, 4, 4]
- C5 is a fully connected (FC) layer with 500 neurons, followed by the ReLU activation function
- The last layer is the fully connected output layer with 10 nodes representing the digits 0 to 9, using the softmax activation function
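- To make the size arithmetic above explicit: for a valid (no-padding) convolution or pooling, the output edge length is (in - kernel) / stride + 1. Below is a minimal, self-contained Java sketch (illustration only, not part of this project's source; the class and method names are made up) that reproduces the numbers in the list:
```java
public class LeNetShapeCheck {

    // Output edge length of a valid (no-padding) convolution or pooling window
    static int outSize(int in, int kernel, int stride) {
        return (in - kernel) / stride + 1;
    }

    public static void main(String[] args) {
        System.out.println("C1: " + outSize(28, 5, 1)); // 24, output [20, 24, 24]
        System.out.println("S2: " + outSize(24, 2, 2)); // 12, output [20, 12, 12]
        System.out.println("C3: " + outSize(12, 5, 1)); //  8, output [50, 8, 8]
        System.out.println("S4: " + outSize(8, 2, 2));  //  4, output [50, 4, 4]
    }
}
```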
Introduction to MNIST
- MNIST is a classic computer vision dataset originating from the National Institute of Standards and Technology (NIST). It contains handwritten digit images: 60,000 in the training set and 10,000 in the test set
- The MNIST training images come from the handwriting of 250 different people, 50% of them high school students and 50% Census Bureau staff; the test set has the same proportions
- MNIST official website: http://yann.lecun.com/exdb/mnist/
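- As an aside: besides the image-folder approach used in this article, DL4J also ships a built-in MnistDataSetIterator that downloads and vectorizes MNIST automatically. A minimal sketch of that alternative, for quick experiments only (not the approach taken below):
```java
import org.deeplearning4j.datasets.iterator.impl.MnistDataSetIterator;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

import java.util.Arrays;

public class MnistBuiltInDemo {
    public static void main(String[] args) throws Exception {
        int batchSize = 54;
        int seed = 1234;
        // downloads MNIST on first use; true = training set, false = test set
        DataSetIterator trainIter = new MnistDataSetIterator(batchSize, true, seed);
        DataSetIterator testIter = new MnistDataSetIterator(batchSize, false, seed);
        // each batch holds flattened 28*28 images, i.e. shape [batchSize, 784]
        System.out.println(Arrays.toString(trainIter.next().getFeatures().shape()));
    }
}
```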
Introduction to the dataset
- The raw data downloaded from the MNIST official site is not image files; it must be parsed according to the official format specification before it can be converted into images. That is clearly not the topic of this article, so we can directly use a dataset that has already been prepared for DL4J (download addresses are given below). The dataset consists of individual image files, and the name of the directory containing each image is the digit shown in that image; for example, the directory <font color="blue">0</font> holds all images of the digit 0:
- There are two download addresses for this dataset:
- CSDN (0 points required): https://download.csdn.net/download/boling_cavalry/19846603
- GitHub: https://raw.githubusercontent.com/zq2599/blog_download_files/master/files/mnist_png.tar.gz
- After downloading and unzipping, you get a folder named <font color="blue">mnist_png</font>, which we will use later in the hands-on part; its expected layout is sketched below
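- Based on the description above, the unpacked directory should look roughly like this (a sketch, with most digit folders omitted):
```
mnist_png/
├── training/
│   ├── 0/    <- all training images of the digit 0
│   ├── 1/
│   ├── ...
│   └── 9/
└── testing/
    ├── 0/    <- all test images of the digit 0
    ├── ...
    └── 9/
```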
About the DL4J version
- The source code of the "DL4J in Practice" series uses Maven's parent-child project structure; the DL4J version is defined as <font color="red">1.0.0-beta7</font> in the parent project <font color="blue">dlfj-tutorials</font>
- Although the code in this article is still a subproject of <font color="blue">dlfj-tutorials</font>, it uses the lower DL4J version <font color="red">1.0.0-beta6</font>. The reason is that in the next article we will hand this article's training and testing over to the GPU, and the corresponding CUDA library is only available for <font color="red">1.0.0-beta6</font>
- That's enough background; let's start coding
Source download
- The complete source code for this article can be downloaded from GitHub; the addresses and link information are in the table below (https://github.com/zq2599/blog_demos):
| Name | Link | Remark |
| --- | --- | --- |
| Project homepage | https://github.com/zq2599/blog_demos | The project's homepage on GitHub |
| Git repository address (https) | https://github.com/zq2599/blog_demos.git | Repository address of the project source, https protocol |
| Git repository address (ssh) | git@github.com:zq2599/blog_demos.git | Repository address of the project source, ssh protocol |
- This Git project contains multiple folders; the source code of the "DL4J in Practice" series is under the <font color="blue">dl4j-tutorials</font> folder
- Under the <font color="blue">dl4j-tutorials</font> folder there are multiple subprojects; the code for this article is in the <font color="blue">simple-convolution</font> directory
Coding
- Create a new subproject named <font color="red">simple-convolution</font> under the parent project <font color="blue">dlfj-tutorials</font>. Its pom.xml is shown below; note that the DL4J version here is pinned to <font color="red">1.0.0-beta6</font>:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>dlfj-tutorials</artifactId>
<groupId>com.bolingcavalry</groupId>
<version>1.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>simple-convolution</artifactId>
<properties>
<dl4j-master.version>1.0.0-beta6</dl4j-master.version>
</properties>
<dependencies>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
</dependency>
<dependency>
<groupId>org.deeplearning4j</groupId>
<artifactId>deeplearning4j-core</artifactId>
<version>${dl4j-master.version}</version>
</dependency>
<dependency>
<groupId>org.nd4j</groupId>
<artifactId>${nd4j.backend}</artifactId>
<version>${dl4j-master.version}</version>
</dependency>
</dependencies>
</project>
```
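- Note that the nd4j backend dependency is referenced through the <font color="blue">${nd4j.backend}</font> property, which is defined in the parent project rather than in this pom. For CPU training it would typically resolve to the native backend; a hedged sketch of what the parent pom presumably declares (the actual value is not shown in this article):
```xml
<!-- Presumed property in the parent pom (dlfj-tutorials); not shown in this article -->
<properties>
    <!-- CPU backend; a GPU build would use an nd4j-cuda-10.x artifact instead -->
    <nd4j.backend>nd4j-native</nd4j.backend>
</properties>
```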
- Next, implement the code according to the earlier analysis. Detailed comments are already included, so the explanation is not repeated here:
```java
package com.bolingcavalry.convolution;
import lombok.extern.slf4j.Slf4j;
import org.datavec.api.io.labels.ParentPathLabelGenerator;
import org.datavec.api.split.FileSplit;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.recordreader.ImageRecordReader;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.deeplearning4j.util.ModelSerializer;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;
import org.nd4j.linalg.learning.config.Nesterovs;
import org.nd4j.linalg.lossfunctions.LossFunctions;
import org.nd4j.linalg.schedule.MapSchedule;
import org.nd4j.linalg.schedule.ScheduleType;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
@Slf4j
public class LeNetMNISTReLu {
    // Base directory for the dataset files; adjust to your environment
    // private static final String BASE_PATH = System.getProperty("java.io.tmpdir") + "/mnist";
    private static final String BASE_PATH = "E:\\temp\\202106\\26";

    public static void main(String[] args) throws Exception {
        // image height in pixels
        int height = 28;
        // image width in pixels
        int width = 28;
        // the images are black and white, so there is only one color channel
        int channels = 1;
        // number of classes: the ten digits 0-9
        int outputNum = 10;
        // batch size
        int batchSize = 54;
        // number of epochs
        int nEpochs = 1;
        // seed for the pseudo random number generator
        int seed = 1234;
        // random number generator
        Random randNumGen = new Random(seed);

        log.info("Checking whether the dataset folder exists: {}", BASE_PATH + "/mnist_png");
        if (!new File(BASE_PATH + "/mnist_png").exists()) {
            log.info("Dataset not found; please download the archive and unzip it to: {}", BASE_PATH);
            return;
        }

        // label generator: uses each file's parent directory name as its label
        ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator();
        // normalization config (scales pixel values from 0-255 to 0-1)
        DataNormalization imageScaler = new ImagePreProcessingScaler();

        // The initialization steps are the same for the training and the test set:
        // 1. read the images, in NCHW format
        // 2. create an iterator with the given batch size
        // 3. attach the normalizer as a pre-processor
        log.info("Vectorizing the training set...");
        // initialize the training set
        File trainData = new File(BASE_PATH + "/mnist_png/training");
        FileSplit trainSplit = new FileSplit(trainData, NativeImageLoader.ALLOWED_FORMATS, randNumGen);
        ImageRecordReader trainRR = new ImageRecordReader(height, width, channels, labelMaker);
        trainRR.initialize(trainSplit);
        DataSetIterator trainIter = new RecordReaderDataSetIterator(trainRR, batchSize, 1, outputNum);
        // fit the data (the implementation actually does nothing here)
        imageScaler.fit(trainIter);
        trainIter.setPreProcessor(imageScaler);

        log.info("Vectorizing the test set...");
        // initialize the test set, same steps as for the training set
        File testData = new File(BASE_PATH + "/mnist_png/testing");
        FileSplit testSplit = new FileSplit(testData, NativeImageLoader.ALLOWED_FORMATS, randNumGen);
        ImageRecordReader testRR = new ImageRecordReader(height, width, channels, labelMaker);
        testRR.initialize(testSplit);
        DataSetIterator testIter = new RecordReaderDataSetIterator(testRR, batchSize, 1, outputNum);
        testIter.setPreProcessor(imageScaler); // same normalization for better results

        log.info("Configuring the neural network");
        // during training, the learning rate drops stepwise as the iterations progress
        Map<Integer, Double> learningRateSchedule = new HashMap<>();
        learningRateSchedule.put(0, 0.06);
        learningRateSchedule.put(200, 0.05);
        learningRateSchedule.put(600, 0.028);
        learningRateSchedule.put(800, 0.0060);
        learningRateSchedule.put(1000, 0.001);

        // hyperparameters
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .seed(seed)
            // L2 regularization coefficient
            .l2(0.0005)
            // learning rate schedule for gradient descent
            .updater(new Nesterovs(new MapSchedule(ScheduleType.ITERATION, learningRateSchedule)))
            // weight initialization
            .weightInit(WeightInit.XAVIER)
            // start the layer list
            .list()
            // convolutional layer (C1)
            .layer(new ConvolutionLayer.Builder(5, 5)
                .nIn(channels)
                .stride(1, 1)
                .nOut(20)
                .activation(Activation.IDENTITY)
                .build())
            // subsampling, i.e. pooling (S2)
            .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                .kernelSize(2, 2)
                .stride(2, 2)
                .build())
            // convolutional layer (C3)
            .layer(new ConvolutionLayer.Builder(5, 5)
                .stride(1, 1) // nIn need not be specified in later layers
                .nOut(50)
                .activation(Activation.IDENTITY)
                .build())
            // subsampling, i.e. pooling (S4)
            .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                .kernelSize(2, 2)
                .stride(2, 2)
                .build())
            // dense, i.e. fully connected, layer (C5)
            .layer(new DenseLayer.Builder().activation(Activation.RELU)
                .nOut(500)
                .build())
            // output layer
            .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .nOut(outputNum)
                .activation(Activation.SOFTMAX)
                .build())
            .setInputType(InputType.convolutionalFlat(height, width, channels)) // InputType.convolutional for normal images
            .build();

        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        // print the loss once every ten iterations
        net.setListeners(new ScoreIterationListener(10));
        log.info("The network has [{}] parameters in total", net.numParams());

        long startTime = System.currentTimeMillis();
        // training loop
        for (int i = 0; i < nEpochs; i++) {
            log.info("Epoch [{}]", i);
            net.fit(trainIter);
            Evaluation eval = net.evaluate(testIter);
            log.info(eval.stats());
            trainIter.reset();
            testIter.reset();
        }
        log.info("Training and testing finished, took [{}] milliseconds", System.currentTimeMillis() - startTime);

        // save the model
        File ministModelPath = new File(BASE_PATH + "/minist-model.zip");
        ModelSerializer.writeModel(net, ministModelPath, true);
        log.info("The latest MNIST model is saved at [{}]", ministModelPath.getPath());
    }
}
```
- Running the above code produces the log output below; training and testing complete successfully, and the accuracy reaches 0.9886:
```
21:19:15.355 [main] INFO org.deeplearning4j.optimize.listeners.ScoreIterationListener - Score at iteration 1110 is 0.18300625613640034
21:19:15.365 [main] DEBUG org.nd4j.linalg.dataset.AsyncDataSetIterator - Manually destroying ADSI workspace
21:19:16.632 [main] DEBUG org.nd4j.linalg.dataset.AsyncDataSetIterator - Manually destroying ADSI workspace
21:19:16.642 [main] INFO com.bolingcavalry.convolution.LeNetMNISTReLu -
========================Evaluation Metrics========================
# of classes: 10
Accuracy: 0.9886
Precision: 0.9885
Recall: 0.9886
F1 Score: 0.9885
Precision, recall & F1: macro-averaged (equally weighted avg. of 10 classes)
=========================Confusion Matrix=========================
0 1 2 3 4 5 6 7 8 9
---------------------------------------------------
972 0 0 0 0 0 2 2 2 2 | 0 = 0
0 1126 0 3 0 2 1 1 2 0 | 1 = 1
1 1 1019 2 0 0 0 6 3 0 | 2 = 2
0 0 1 1002 0 5 0 1 1 0 | 3 = 3
0 0 2 0 971 0 3 2 1 3 | 4 = 4
0 0 0 3 0 886 2 1 0 0 | 5 = 5
6 2 0 1 1 5 942 0 1 0 | 6 = 6
0 1 6 0 0 0 0 1015 1 5 | 7 = 7
1 0 1 1 0 2 0 2 962 5 | 8 = 8
1 2 1 3 5 3 0 2 1 991 | 9 = 9
Confusion matrix format: Actual (rowClass) predicted as (columnClass) N times
==================================================================
21:19:16.643 [main] INFO com.bolingcavalry.convolution.LeNetMNISTReLu - Training and testing finished, took [27467] milliseconds
21:19:17.019 [main] INFO com.bolingcavalry.convolution.LeNetMNISTReLu - The latest MNIST model is saved at [E:\temp\202106\26\minist-model.zip]
Process finished with exit code 0
```
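- Since the trained model has been serialized to disk, it can later be reloaded for inference without retraining. This is not part of this article's source code; a minimal sketch using DL4J's ModelSerializer (the path is the one used above):
```java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.util.ModelSerializer;

import java.io.File;

public class LoadSavedModel {
    public static void main(String[] args) throws Exception {
        File modelFile = new File("E:\\temp\\202106\\26\\minist-model.zip");
        // true: also restore the updater state, so training could be resumed
        MultiLayerNetwork net = ModelSerializer.restoreMultiLayerNetwork(modelFile, true);
        System.out.println("Loaded model with " + net.numParams() + " parameters");
        // net.output(features) would then return the softmax probabilities
        // for a batch of flattened 28*28 inputs, i.e. shape [n, 784]
    }
}
```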
About accuracy
- The test above reports an accuracy of <font color="blue">0.9886</font>, which is the result of training with DL4J version <font color="red">1.0.0-beta6</font>. If you switch to <font color="red">1.0.0-beta7</font>, the accuracy can reach <font color="blue">0.99</font> or higher; feel free to try it;
- This completes our hands-on session with a classic convolutional network under the DL4J framework. So far, all training and testing has been done by the CPU, whose usage rises noticeably during the run. In the next article we will hand today's work over to the GPU and see whether CUDA can accelerate training and testing;
You are not alone; Xinchen is with you all the way
- Java series
- Spring series
- Docker series
- Kubernetes series
- Database + middleware series
- DevOps series
Welcome to follow my WeChat official account: Programmer Xinchen
Search for "Programmer Xinchen" on WeChat. I am Xinchen, and I look forward to exploring the Java world with you...
https://github.com/zq2599/blog_demos