Welcome to my GitHub

https://github.com/zq2599/blog_demos

Content: a categorized index of all my original articles plus the companion source code, covering Java, Docker, Kubernetes, DevOps, and more.

Overview of this article

  • This is the third article in the "DL4J in Practice" series. The goal is to build the classic LeNet-5 convolutional neural network under the DL4J framework, then train and test it on the MNIST dataset. This article consists of the following parts:
  • Introduction to LeNet-5
  • Introduction to MNIST
  • Introduction to the dataset
  • About version and environment
  • Coding
  • Verification

Introduction to LeNet-5

  • LeNet-5 is a convolutional neural network designed by Yann LeCun in 1998 for handwritten digit recognition; for example, many banks in the United States once used it to recognize the handwritten digits on checks. It is one of the most representative early convolutional neural network systems.
  • The LeNet-5 network structure is shown in the figure below, seven layers in total: C1 -> S2 -> C3 -> S4 -> C5 -> F6 -> OUTPUT

[Figure: LeNet-5 network structure]

  • A quick analysis of the figure above will guide the development that follows (a small dimension-check sketch comes after this list):
  • Each image is 28*28 with a single channel, so the input matrix is [1, 28, 28]
  • C1 is a convolutional layer with 20 kernels of size 5*5 and a stride of 1, so the spatial size becomes 28-5+1=24 (imagine sliding a window of width 5 across a width of 28 and counting the positions it can occupy); the output matrix is [20, 24, 24]
  • S2 is a pooling layer with a 2*2 kernel, a stride of 2, and type MAX; pooling halves the size to [20, 12, 12]
  • C3 is a convolutional layer with 50 kernels of size 5*5 and a stride of 1, so the size becomes 12-5+1=8; the output matrix is [50, 8, 8]
  • S4 is a pooling layer with a 2*2 kernel, a stride of 2, and type MAX; pooling halves the size to [50, 4, 4]
  • C5 is a fully connected layer (FC) with 500 neurons, followed by the ReLU activation function
  • The last layer is the fully connected Output layer with 10 nodes representing the digits 0 to 9; its activation function is softmax
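  • To sanity-check these numbers, here is a minimal sketch (not part of the project source; the class is purely illustrative) that applies the usual no-padding output-size formula out = (in - kernel) / stride + 1 to each convolution and pooling step:
public class LeNetShapeCheck {

    // output size of a convolution/pooling window with no padding:
    // out = (in - kernel) / stride + 1
    static int outSize(int in, int kernel, int stride) {
        return (in - kernel) / stride + 1;
    }

    public static void main(String[] args) {
        int size = 28;              // input: [1, 28, 28]
        size = outSize(size, 5, 1); // C1: 5*5 conv, stride 1 -> 24, i.e. [20, 24, 24]
        size = outSize(size, 2, 2); // S2: 2*2 max pool, stride 2 -> 12, i.e. [20, 12, 12]
        size = outSize(size, 5, 1); // C3: 5*5 conv, stride 1 -> 8, i.e. [50, 8, 8]
        size = outSize(size, 2, 2); // S4: 2*2 max pool, stride 2 -> 4, i.e. [50, 4, 4]
        System.out.println("Features flattened into C5: " + 50 * size * size); // 800
    }
}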

Introduction to MNIST

  • MNIST is a classic computer vision dataset originating from NIST (the National Institute of Standards and Technology). It contains images of handwritten digits: 60,000 in the training set and 10,000 in the test set
  • The MNIST digits were written by 250 different people, 50% of them high school students and 50% Census Bureau staff; the test set has the same proportions
  • MNIST official website: http://yann.lecun.com/exdb/mnist/

Introduction to the dataset

  • The raw data downloaded from the MNIST official site is not image files; it has to be parsed and processed according to the official format description before it can be converted into images. That is clearly not the subject of this article, so we can use the ready-made dataset prepared for DL4J instead (the download address is given later). The dataset consists of individual image files, and the name of the directory an image sits in is the digit that image shows. As shown in the figure below, directory 0 contains all images of the digit 0 (the sketch after the figure shows how this layout is turned into labels):

[Figure: dataset directory layout, with directory 0 holding all images of the digit 0]
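  • This directory-as-label layout is exactly what DataVec's ParentPathLabelGenerator (used later in the code) expects: the label comes from the name of a file's parent directory. A minimal sketch, assuming a hypothetical file path:
import org.datavec.api.io.labels.ParentPathLabelGenerator;
import org.datavec.api.writable.Writable;

public class LabelDemo {
    public static void main(String[] args) {
        ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator();
        // for a file stored under ".../training/0/", the generated label is "0"
        Writable label = labelMaker.getLabelForPath("/mnist_png/training/0/10.png");
        System.out.println(label); // prints: 0
    }
}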

About the DL4J version

  • The source code of the "DL4J in Practice" series uses Maven's parent-child project structure, and the DL4J version is defined as 1.0.0-beta7 in the parent project dlfj-tutorials
  • Although the code of this article is still a subproject of dlfj-tutorials, its DL4J version is set to the lower 1.0.0-beta6. The reason: in the next article we will hand the training and testing over to the GPU, and the corresponding CUDA library is only available for 1.0.0-beta6
  • With all that said, let's start coding

Source download

Name | Link | Remark
Project homepage | https://github.com/zq2599/blog_demos | The project's homepage on GitHub
Git repository (https) | https://github.com/zq2599/blog_demos.git | Repository address of the project source code, https protocol
Git repository (ssh) | git@github.com:zq2599/blog_demos.git | Repository address of the project source code, ssh protocol
  • There are multiple folders in this Git project; the source code of the "DL4J in Practice" series is under the dl4j-tutorials folder, as shown in the red box below:

[Figure: the dl4j-tutorials folder in the repository, marked with a red box]

  • There are multiple subprojects under the dl4j-tutorials folder; the code for this article is in the simple-convolution directory, as shown in the red box below:

[Figure: the simple-convolution directory, marked with a red box]

Coding

  • Create a new subproject named simple-convolution under the parent project dlfj-tutorials. Its pom.xml is shown below; note that the DL4J version is pinned to 1.0.0-beta6:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>dlfj-tutorials</artifactId>
        <groupId>com.bolingcavalry</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>simple-convolution</artifactId>

    <properties>
        <dl4j-master.version>1.0.0-beta6</dl4j-master.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
        </dependency>

        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
        </dependency>

        <dependency>
            <groupId>org.deeplearning4j</groupId>
            <artifactId>deeplearning4j-core</artifactId>
            <version>${dl4j-master.version}</version>
        </dependency>

        <dependency>
            <groupId>org.nd4j</groupId>
            <artifactId>${nd4j.backend}</artifactId>
            <version>${dl4j-master.version}</version>
        </dependency>
    </dependencies>
</project>
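  • One assumption worth noting: the ${nd4j.backend} property used in the last dependency is not defined in this pom; it is expected to come from the parent project dlfj-tutorials, and for CPU training it would typically resolve to the ND4J CPU backend (e.g. nd4j-native-platform)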
  • Next, implement the code according to the analysis above. Detailed comments are included inline, so they are not repeated here:
package com.bolingcavalry.convolution;

import lombok.extern.slf4j.Slf4j;
import org.datavec.api.io.labels.ParentPathLabelGenerator;
import org.datavec.api.split.FileSplit;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.recordreader.ImageRecordReader;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.deeplearning4j.util.ModelSerializer;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;
import org.nd4j.linalg.learning.config.Nesterovs;
import org.nd4j.linalg.lossfunctions.LossFunctions;
import org.nd4j.linalg.schedule.MapSchedule;
import org.nd4j.linalg.schedule.ScheduleType;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

@Slf4j
public class LeNetMNISTReLu {

    // directory where the dataset files live; adjust as appropriate
//    private static final String BASE_PATH = System.getProperty("java.io.tmpdir") + "/mnist";

    private static final String BASE_PATH = "E:\\temp\\202106\\26";

    public static void main(String[] args) throws Exception {
        // image height in pixels
        int height = 28;
        // image width in pixels
        int width = 28;
        // black-and-white images, so there is a single color channel
        int channels = 1;
        // number of classes: the ten digits 0-9
        int outputNum = 10;
        // batch size
        int batchSize = 54;
        // number of epochs
        int nEpochs = 1;
        // seed for the pseudo-random number generator
        int seed = 1234;

        // random number generator
        Random randNumGen = new Random(seed);

        log.info("Checking whether the dataset folder exists: {}", BASE_PATH + "/mnist_png");

        if (!new File(BASE_PATH + "/mnist_png").exists()) {
            log.info("Dataset not found, please download the archive and unzip it to: {}", BASE_PATH);
            return;
        }

        // label generator: the parent directory of each file is used as its label
        ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator();
        // normalization config (scales pixel values from 0-255 to 0-1)
        DataNormalization imageScaler = new ImagePreProcessingScaler();

        // training set and test set are initialized the same way:
        // 1. read the images, in NCHW format
        // 2. create an iterator with the given batch size
        // 3. attach the normalizer as a pre-processor

        log.info("Vectorizing the training set...");
        // initialize the training set
        File trainData = new File(BASE_PATH + "/mnist_png/training");
        FileSplit trainSplit = new FileSplit(trainData, NativeImageLoader.ALLOWED_FORMATS, randNumGen);
        ImageRecordReader trainRR = new ImageRecordReader(height, width, channels, labelMaker);
        trainRR.initialize(trainSplit);
        DataSetIterator trainIter = new RecordReaderDataSetIterator(trainRR, batchSize, 1, outputNum);
        // fit the data (this scaler's implementation actually does nothing here)
        imageScaler.fit(trainIter);
        trainIter.setPreProcessor(imageScaler);

        log.info("测试集的矢量化操作...");
        // 初始化测试集,与前面的训练集操作类似
        File testData = new File(BASE_PATH + "/mnist_png/testing");
        FileSplit testSplit = new FileSplit(testData, NativeImageLoader.ALLOWED_FORMATS, randNumGen);
        ImageRecordReader testRR = new ImageRecordReader(height, width, channels, labelMaker);
        testRR.initialize(testSplit);
        DataSetIterator testIter = new RecordReaderDataSetIterator(testRR, batchSize, 1, outputNum);
        testIter.setPreProcessor(imageScaler); // same normalization for better results

        log.info("配置神经网络");

        // 在训练中,将学习率配置为随着迭代阶梯性下降
        Map<Integer, Double> learningRateSchedule = new HashMap<>();
        learningRateSchedule.put(0, 0.06);
        learningRateSchedule.put(200, 0.05);
        learningRateSchedule.put(600, 0.028);
        learningRateSchedule.put(800, 0.0060);
        learningRateSchedule.put(1000, 0.001);

        // hyperparameters
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .seed(seed)
            // L2 regularization coefficient
            .l2(0.0005)
            // learning-rate settings for gradient descent
            .updater(new Nesterovs(new MapSchedule(ScheduleType.ITERATION, learningRateSchedule)))
            // weight initialization
            .weightInit(WeightInit.XAVIER)
            // start the list of layers
            .list()
            // convolutional layer
            .layer(new ConvolutionLayer.Builder(5, 5)
                .nIn(channels)
                .stride(1, 1)
                .nOut(20)
                .activation(Activation.IDENTITY)
                .build())
            // subsampling, i.e. pooling
            .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                .kernelSize(2, 2)
                .stride(2, 2)
                .build())
            // convolutional layer
            .layer(new ConvolutionLayer.Builder(5, 5)
                .stride(1, 1) // nIn need not be specified for later layers
                .nOut(50)
                .activation(Activation.IDENTITY)
                .build())
            // subsampling, i.e. pooling
            .layer(new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                .kernelSize(2, 2)
                .stride(2, 2)
                .build())
            // dense layer, i.e. fully connected
            .layer(new DenseLayer.Builder().activation(Activation.RELU)
                .nOut(500)
                .build())
            // output layer
            .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .nOut(outputNum)
                .activation(Activation.SOFTMAX)
                .build())
            .setInputType(InputType.convolutionalFlat(height, width, channels)) // InputType.convolutional for normal image
            .build();

        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();

        // print the loss value every ten iterations
        net.setListeners(new ScoreIterationListener(10));

        log.info("The network has [{}] parameters in total", net.numParams());

        long startTime = System.currentTimeMillis();
        // training loop
        for (int i = 0; i < nEpochs; i++) {
            log.info("Epoch [{}]", i);
            net.fit(trainIter);
            Evaluation eval = net.evaluate(testIter);
            log.info(eval.stats());
            trainIter.reset();
            testIter.reset();
        }
        log.info("完成训练和测试,耗时[{}]毫秒", System.currentTimeMillis()-startTime);

        // 保存模型
        File ministModelPath = new File(BASE_PATH + "/minist-model.zip");
        ModelSerializer.writeModel(net, ministModelPath, true);
        log.info("最新的MINIST模型保存在[{}]", ministModelPath.getPath());
    }
}
  • Run the code above; the log output below shows training and testing completing successfully, with the accuracy reaching 0.9886 (with a batch size of 54 and 60,000 training images, one epoch is about 1,111 iterations, which matches the final "iteration 1110" in the log):
21:19:15.355 [main] INFO org.deeplearning4j.optimize.listeners.ScoreIterationListener - Score at iteration 1110 is 0.18300625613640034
21:19:15.365 [main] DEBUG org.nd4j.linalg.dataset.AsyncDataSetIterator - Manually destroying ADSI workspace
21:19:16.632 [main] DEBUG org.nd4j.linalg.dataset.AsyncDataSetIterator - Manually destroying ADSI workspace
21:19:16.642 [main] INFO com.bolingcavalry.convolution.LeNetMNISTReLu - 

========================Evaluation Metrics========================
 # of classes:    10
 Accuracy:        0.9886
 Precision:       0.9885
 Recall:          0.9886
 F1 Score:        0.9885
Precision, recall & F1: macro-averaged (equally weighted avg. of 10 classes)


=========================Confusion Matrix=========================
    0    1    2    3    4    5    6    7    8    9
---------------------------------------------------
  972    0    0    0    0    0    2    2    2    2 | 0 = 0
    0 1126    0    3    0    2    1    1    2    0 | 1 = 1
    1    1 1019    2    0    0    0    6    3    0 | 2 = 2
    0    0    1 1002    0    5    0    1    1    0 | 3 = 3
    0    0    2    0  971    0    3    2    1    3 | 4 = 4
    0    0    0    3    0  886    2    1    0    0 | 5 = 5
    6    2    0    1    1    5  942    0    1    0 | 6 = 6
    0    1    6    0    0    0    0 1015    1    5 | 7 = 7
    1    0    1    1    0    2    0    2  962    5 | 8 = 8
    1    2    1    3    5    3    0    2    1  991 | 9 = 9

Confusion matrix format: Actual (rowClass) predicted as (columnClass) N times
==================================================================
21:19:16.643 [main] INFO com.bolingcavalry.convolution.LeNetMNISTReLu - Training and testing complete, elapsed [27467] ms
21:19:17.019 [main] INFO com.bolingcavalry.convolution.LeNetMNISTReLu - The latest MNIST model has been saved at [E:\temp\202106\26\minist-model.zip]

Process finished with exit code 0
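  • Incidentally, since the model has been serialized to minist-model.zip, it can be restored later for inference. Below is a minimal, hedged sketch: the sample image path is hypothetical, and it assumes the same 28*28 single-channel input and 0-1 scaling used during training:
import org.datavec.image.loader.NativeImageLoader;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.util.ModelSerializer;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;

import java.io.File;

public class LoadModelDemo {
    public static void main(String[] args) throws Exception {
        // restore the network saved by the training run
        MultiLayerNetwork net = ModelSerializer.restoreMultiLayerNetwork(
                new File("E:\\temp\\202106\\26\\minist-model.zip"));

        // load one 28*28 grayscale image as an NDArray of shape [1, 1, 28, 28]
        NativeImageLoader loader = new NativeImageLoader(28, 28, 1);
        // hypothetical sample image, e.g. copied from the test set
        INDArray image = loader.asMatrix(new File("E:\\temp\\202106\\26\\sample.png"));

        // apply the same 0-255 -> 0-1 scaling used during training
        new ImagePreProcessingScaler(0, 1).transform(image);

        // the index of the largest softmax output is the predicted digit
        INDArray output = net.output(image);
        System.out.println("Predicted digit: " + output.argMax(1));
    }
}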

About accuracy

  • The test above shows an accuracy of 0.9886, which is the training result under DL4J version 1.0.0-beta6. If you switch to 1.0.0-beta7, the accuracy can reach 0.99 or higher; feel free to try it
  • At this point, our hands-on run of a classic convolutional network under the DL4J framework is complete. So far, all the training and testing has been done by the CPU, and the rise in CPU usage during the run is very noticeable. In the next article, we will hand today's work over to the GPU and see whether CUDA can accelerate training and testing

You are not walking alone, Xinchen is with you all the way

  1. Java series
  2. Spring series
  3. Docker series
  4. Kubernetes series
  5. Database + middleware series
  6. DevOps series

Welcome to follow my WeChat official account: 程序员欣宸 (Programmer Xinchen)

Search "Programmer Xin Chen" on WeChat, I am Xin Chen, and I look forward to traveling the Java world with you...
https://github.com/zq2599/blog_demos
