Abstract: This article implements a custom CNN image classification case with TensorFlow and OpenCV. It addresses the image classification problems we meet in papers and in practice, and compares the results with a machine-learning image classification algorithm.
This article is shared from Huawei Cloud Community "Tensorflow+Opencv Implementation of CNN Custom Image Classification and Comparison with KNN Image Classification", author: eastmount.
1. Image classification
Image classification is the problem of assigning image content to categories. A computer analyzes an image quantitatively and assigns the image, or regions within it, to one of several categories, replacing human visual judgment. The traditional approach to image classification is feature description and detection, which may work for simple images, but because real situations are far more complicated, traditional methods are quickly overwhelmed. Nowadays, machine learning and deep learning methods are widely used for image classification; the main task is to assign each input picture to one label from a known set of categories.
In the figure below, the image classification model receives a single image and produces probabilities for 4 labels {cat, dog, hat, mug}, namely {0.6, 0.3, 0.05, 0.05}, where 0.6 is the probability that the image's label is cat, and so on for the rest. The image is represented as a three-dimensional array. In this example, the cat image is 248 pixels wide and 400 pixels high and has three color channels, red, green and blue (usually called RGB). The image therefore consists of 248×400×3 numbers, a total of 297,600 values, each an integer from 0 (black) to 255 (white). The task of image classification is to turn these roughly 300,000 numbers into a single label such as "cat".
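As a quick illustration of this representation, the short OpenCV snippet below (a sketch that assumes a local file named cat.jpg, which is only a hypothetical example) prints the shape, size, and value range of such an array:
# Sketch: inspect an image as a NumPy array (assumes a hypothetical local file "cat.jpg")
import cv2
img = cv2.imread("cat.jpg")              # OpenCV loads the image as a NumPy array in BGR channel order
print(img.shape)                         # (height, width, channels), e.g. (400, 248, 3)
print(img.size)                          # total number of values, e.g. 400*248*3 = 297600
print(img.dtype, img.min(), img.max())   # uint8 values in the range 0-255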
So, how do we write an image classification algorithm, and how can cats be recognized among many images? The approach is similar to teaching a child to recognize objects in pictures: a large amount of image data is provided, and the model keeps learning the characteristics of each class. Before training, the images of the training set first need to be classified and labeled, as shown in the figure, into four categories: cat, dog, mug, and hat. In real engineering there may be thousands of object categories, and each category may have millions of images.
Image classification takes arrays of image pixel values as input, assigns a classification label to each, builds a model through training and learning, and then uses the model to predict labels for new images. The specific process is as follows:
- Input: The input contains a collection of N images, and the label of each image is one of the K classification labels. This collection is called the training set.
- Learning: The second task is to use the training set to learn the characteristics of each class and build a classifier or classification model.
- Evaluation: Use the classifier to predict labels for new input images and compare the predicted labels with the images' true labels; if they match, the prediction is correct, otherwise it is wrong. This is how the quality of the classification algorithm is evaluated (a minimal sketch of this workflow follows the list).
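The sketch below illustrates these three steps with Scikit-Learn on synthetic data; the random feature vectors simply stand in for flattened images, and all names are illustrative only:
# Sketch of the input / learning / evaluation workflow (synthetic data standing in for images)
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)
X = rng.rand(200, 64)                 # N=200 "images", each flattened to 64 feature values
y = rng.randint(0, 4, size=200)       # labels drawn from K=4 classes

# Input: a labelled collection split into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
# Learning: fit a classifier on the training set
clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
# Evaluation: compare predicted labels with the true labels of unseen samples
print(accuracy_score(y_test, clf.predict(X_test)))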
Common classification algorithms include the naive Bayes classifier, decision trees, the K-nearest neighbor algorithm, support vector machines, neural networks, and rule-based classification algorithms. There are also ensemble learning algorithms that combine individual classifiers, such as Bagging and Boosting.
2. Image classification based on KNN algorithm
1. KNN algorithm
The K-Nearest Neighbor (KNN) classifier is an instance-based classification method and one of the simplest and most commonly used techniques in data-mining classification. Its core idea is: for a test sample, compute its distance (usually the Euclidean distance) to all training samples, take the K training samples with the smallest distances as the test sample's K nearest neighbors, and then assign the test sample to the category that the majority of these K neighbors belong to.
Suppose we need to determine whether the circle in the figure below belongs to the triangle category or the square category. The KNN analysis is as follows:
- When K=3, the inner circle in the figure contains three shapes, two triangles and one square, so the circle is classified as a triangle.
- When K=5, the outer circle contains five shapes, two triangles and three squares, so the circle is predicted to be a square by a 3:2 vote. Different K values may therefore lead to different predictions.
In short, a sample is assigned the category shared by the majority of its k nearest neighbors in the data set. As this suggests, KNN classifies by measuring distances between feature values and, when deciding a sample's category, refers only to the categories of the k "neighbor" samples around it. It is therefore well suited to class domains where samples overlap considerably, and it is mainly used for predictive analysis, text classification, and dimensionality reduction.
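To make the idea concrete, here is a minimal from-scratch sketch of KNN (Euclidean distance plus majority vote) on a tiny synthetic example; the points and labels are purely illustrative:
# From-scratch sketch of KNN: Euclidean distance + majority vote (toy data)
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    dists = np.linalg.norm(X_train - x, axis=1)   # distance from x to every training sample
    nearest = np.argsort(dists)[:k]               # indices of the k nearest neighbors
    return Counter(y_train[nearest]).most_common(1)[0][0]  # majority class among them

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]])
y_train = np.array([0, 0, 1, 1, 1])               # 0 = "triangle", 1 = "square"
print(knn_predict(X_train, y_train, np.array([2.0, 2.0]), k=3))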
In the Sklearn machine learning package, KNN is implemented by the class neighbors.KNeighborsClassifier. Its constructor is:
KNeighborsClassifier(algorithm='ball_tree',
leaf_size=30,
metric='minkowski',
metric_params=None,
n_jobs=1,
n_neighbors=3,
p=2,
weights='uniform')
The algorithm parameter of KNeighborsClassifier can be set to one of three algorithms, brute, kd_tree, or ball_tree, and the K value is set with the n_neighbors parameter (here n_neighbors=3). The calling method is as follows:
- from sklearn.neighbors import KNeighborsClassifier
- knn = KNeighborsClassifier(n_neighbors=3, algorithm="ball_tree")
It consists of two steps (a toy example follows the list):
- Training: knn.fit(data, target)
- Prediction: pre = knn.predict(data)
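A tiny runnable illustration of these two steps, using made-up 2D points rather than the image data set:
# Toy example of the fit/predict steps (hypothetical 2D points, not the image data)
from sklearn.neighbors import KNeighborsClassifier

data = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
target = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3, algorithm="ball_tree")
knn.fit(data, target)                            # training
print(knn.predict([[0.8, 0.2], [5.5, 5.2]]))     # prediction -> [0 1]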
2. Data set
This part uses the Python Scikit-Learn package for image classification. Scikit-Learn, usually abbreviated as Sklearn, is a classic and practical extension package for Python data mining and data analysis. Its machine learning models are very rich, including linear regression, decision trees, SVM, KMeans, KNN, PCA, and so on, so users can choose an appropriate model for the specific analysis problem. It can be installed with "pip install scikit-learn".
The data set used in the experiment is the Sort_1000pics data set, which contains 1000 images divided into 10 categories: people (category 0), beaches (category 1), buildings (category 2), buses (category 3), dinosaurs (category 4), elephants (category 5), flowers (category 6), horses (category 7), mountains (category 8), and food (category 9), with 100 images per category, as shown in the figure.
Then divide all types of images into folders named "0" to "9" according to the corresponding category labels. As shown in the figure, each folder contains 100 images corresponding to the same category.
For example, the folder name "6" contains 100 images of flowers, as shown in the figure below.
3. KNN image classification
The following is the complete code for image classification with the KNN algorithm. It randomly splits the 1000 images into a 70% training set and a 30% test set, computes a pixel histogram for each image, and classifies the images according to the distribution of pixel features. The core KNeighborsClassifier() code is as follows:
* from sklearn.neighbors import KNeighborsClassifier
* clf = KNeighborsClassifier(n_neighbors=11).fit(XX_train, y_train)
* predictions_labels = clf.predict(XX_test)
The complete code with comments is as follows:
# -*- coding: utf-8 -*-
import os
import cv2
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report

#----------------------------------------------------------------------------------
# Step 1: Split the training and test sets
#----------------------------------------------------------------------------------
X = [] # image file paths
Y = [] # image class labels
Z = [] # image pixels (not used later)

for i in range(0, 10):
    # traverse each folder and collect its images
    for f in os.listdir("photo/%s" % i):
        # image path
        X.append("photo//" + str(i) + "//" + str(f))
        # class label, i.e. the folder name
        Y.append(i)

X = np.array(X)
Y = np.array(Y)
# random split: 30% of the samples are used as the test set
X_train, X_test, y_train, y_test = train_test_split(X, Y,
                                                    test_size=0.3, random_state=1)
print(len(X_train), len(X_test), len(y_train), len(y_test))

#----------------------------------------------------------------------------------
# Step 2: Read images and convert them to pixel histograms
#----------------------------------------------------------------------------------
# training set
XX_train = []
for i in X_train:
    # read the image
    #print(i)
    image = cv2.imread(i)
    # resize so that all images have the same size
    img = cv2.resize(image, (256, 256),
                     interpolation=cv2.INTER_CUBIC)
    # compute the 2D color histogram and store it as the feature vector
    hist = cv2.calcHist([img], [0, 1], None,
                        [256, 256], [0.0, 255.0, 0.0, 255.0])
    XX_train.append(((hist / 255).flatten()))

# test set
XX_test = []
for i in X_test:
    # read the image
    #print(i)
    image = cv2.imread(i)
    # resize so that all images have the same size
    img = cv2.resize(image, (256, 256),
                     interpolation=cv2.INTER_CUBIC)
    # compute the 2D color histogram and store it as the feature vector
    hist = cv2.calcHist([img], [0, 1], None,
                        [256, 256], [0.0, 255.0, 0.0, 255.0])
    XX_test.append(((hist / 255).flatten()))

#----------------------------------------------------------------------------------
# Step 3: Image classification with KNN
#----------------------------------------------------------------------------------
from sklearn.neighbors import KNeighborsClassifier
clf = KNeighborsClassifier(n_neighbors=11).fit(XX_train, y_train)
predictions_labels = clf.predict(XX_test)

print('Prediction results:')
print(predictions_labels)
print('Algorithm evaluation:')
print(classification_report(y_test, predictions_labels))

# show the first 10 test images and their predicted labels
k = 0
while k < 10:
    # read the image
    print(X_test[k])
    image = cv2.imread(X_test[k])
    print(predictions_labels[k])
    # display the image
    cv2.imshow("img", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    k = k + 1
The code displays the first ten images of the test set. The "818.jpg" image shown in the figure is predicted as class "8", i.e. the mountain category, and the prediction is correct.
The figure below shows the "452.jpg" image, whose predicted class is "4", i.e. the dinosaur category; this prediction is also correct.
The figure below shows the "507.jpg" image. Its predicted class is "7", i.e. the horse category, which is wrong: the true class is 5, the elephant category.
The KNN image classification experiment is evaluated with precision, recall, and the F value (F1-score), as shown in the figure. The average precision is 0.64, the average recall is 0.55, and the average F value is 0.50, which is not very satisfactory. So, if a CNN convolutional neural network is used for classification, can its continuous learning of image details improve the accuracy?
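As a side note, the confusion_matrix import in the script above is never actually called; the sketch below shows how it could be added after the prediction step (it reuses y_test and predictions_labels from that script) to see which categories are confused with each other:
# Optional sketch: per-class error analysis with the otherwise unused confusion_matrix import
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, predictions_labels)
print(cm)   # rows = true classes 0-9, columns = predicted classes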
3. Tensorflow+Opencv realizes CNN image classification
First, install the OpenCV extension package in the TensorFlow environment; second, build a CNN neural network with TensorFlow; finally, train the network and carry out the image classification experiment.
1. OpenCV library installation
The first step is to open the Anaconda program and select the installed "TensorFlow" environment to run Spyder.
In the second step, install the opencv-python extension package in the TensorFlow environment, otherwise the error "ModuleNotFoundError: No module named 'cv2'" will be raised. Simply install it from the Anaconda Prompt, as shown in the following figure:
activate tensorflow
pip install opencv-python
The installation is successful as shown in the figure below.
However, because the anaconda.org server is located abroad, the download can be very slow and may fail with the error "Anaconda An HTTP error occurred when trying to retrieve this URL. HTTP errors are often intermittent".
- Solution 1: download from the Tsinghua University mirror in China
- Solution 2: download the corresponding version of the opencv-python wheel and install the locally downloaded .whl file. Download: https://www.lfd.uci.edu/~gohlke/pythonlibs/#OpenCV
Since the first method kept failing, readers are advised to try the second one. The author also provides the "opencv_python-4.1.2-cp36-cp36m-win_amd64.whl" file for direct use (4.1.2 is the OpenCV version, cp36 means it is built for Python 3.6, and win_amd64 means 64-bit Windows).
The third step is to call PIP to install the local opencv extension package.
activate tensorflow
pip install C:\Users\xiuzhang\Desktop\TensorFlow\opencv_python-4.1.2-cp36-cp36m-win_amd64.whl
This method is very fast and is recommended. After the installation succeeds, we can start writing our code!
2. Read images from the folders
The specific steps of this part are as follows:
- Define the function read_img() to read the images in folders "0" to "9" under the "photo" directory
- Call the cv2.imread() function to obtain all pixel values of each image, and use cv2.resize() to uniformly resize them to 32*32
- Obtain the image pixels, image class labels, and image path names in turn: fpaths, data, label = read_img(path)
- Randomly shuffle the order of the images, and split the data set at a 2:8 ratio, with 80% of the data used for training and 20% for testing
#--------------------------------- Step 1: Read images -----------------------------------
def read_img(path):
    cate = [path + x for x in os.listdir(path) if os.path.isdir(path + x)]
    imgs = []
    labels = []
    fpath = []
    for idx, folder in enumerate(cate):
        # traverse each class folder and read its jpg files
        for im in glob.glob(folder + '/*.jpg'):
            #print('reading the images:%s' % (im))
            img = cv2.imread(im)            # read the pixels with OpenCV
            img = cv2.resize(img, (32, 32)) # resize so that all images have the same size
            imgs.append(img)                # image data
            labels.append(idx)              # image class label
            fpath.append(path + im)         # image path name
            #print(path+im, idx)
    return np.asarray(fpath, np.string_), np.asarray(imgs, np.float32), np.asarray(labels, np.int32)

# read the images
fpaths, data, label = read_img(path)
print(data.shape)  # (1000, 32, 32, 3)

# count how many classes there are
num_classes = len(set(label))
print(num_classes)

# shuffle the image order with a random permutation of the indices
num_example = data.shape[0]
arr = np.arange(num_example)
np.random.shuffle(arr)
data = data[arr]
label = label[arr]
fpaths = fpaths[arr]

# split into a training set (80%) and a test set (20%)
ratio = 0.8
s = np.int(num_example * ratio)
x_train = data[:s]
y_train = label[:s]
fpaths_train = fpaths[:s]
x_val = data[s:]
y_val = label[s:]
fpaths_test = fpaths[s:]
print(len(x_train), len(y_train), len(x_val), len(y_val))  # 800 800 200 200
print(y_val)
3. Build CNN
The specific steps of this part are as follows:
- First, define the Placeholders for the input values: xs represents each image's 32*32 pixels with three RGB channels, so its size is set to 32*32*3; ys represents the final predicted class value of each image.
- Call the tf.layers.conv2d() function to define the convolutional layer with 20 convolution kernels of size 5 and Relu activation; call the tf.layers.max_pooling2d() function to define the pooling operation with a stride of 2, which halves the feature map size.
- Then define the second convolutional layer and pooling layer, giving conv0, pool0 and conv1, pool1 in total.
- The fully connected layer is defined with the tf.layers.dense() function, converting the features into a vector of length 400, and DropOut is added to prevent overfitting.
- The output layer is logits with 10 values, and the final prediction result is predicted_labels, i.e. tf.arg_max(logits, 1).
#--------------------------------- Step 2: Build the neural network -----------------------------------
# define the Placeholders
xs = tf.placeholder(tf.float32, [None, 32, 32, 3])  # each image has 32*32*3 pixel values
ys = tf.placeholder(tf.int32, [None])                # each sample has one label
# container for the DropOut rate
drop = tf.placeholder(tf.float32)                    # 0.25 during training, 0 during testing

# convolutional layer conv0
conv0 = tf.layers.conv2d(xs, 20, 5, activation=tf.nn.relu)    # 20 kernels of size 5, Relu activation
# max-pooling layer pool0
pool0 = tf.layers.max_pooling2d(conv0, [2, 2], [2, 2])        # 2x2 pooling window, 2x2 stride
print("Layer0:\n", conv0, pool0)

# convolutional layer conv1
conv1 = tf.layers.conv2d(pool0, 40, 4, activation=tf.nn.relu) # 40 kernels of size 4, Relu activation
# max-pooling layer pool1
pool1 = tf.layers.max_pooling2d(conv1, [2, 2], [2, 2])        # 2x2 pooling window, 2x2 stride
print("Layer1:\n", conv1, pool1)

# flatten the 3D features into a 1D vector
flatten = tf.layers.flatten(pool1)
# fully connected layer, mapped to a feature vector of length 400
fc = tf.layers.dense(flatten, 400, activation=tf.nn.relu)
print("Layer2:\n", fc)
# add DropOut to prevent overfitting
dropout_fc = tf.layers.dropout(fc, drop)

# output layer without activation
logits = tf.layers.dense(dropout_fc, num_classes)
print("Output:\n", logits)
# define the predicted labels
predicted_labels = tf.arg_max(logits, 1)
4. Define the loss function and optimizer
Cross entropy is used to define the loss, and the AdamOptimizer is used to optimize the network. The core code is as follows.
One-hot data is also known as one-bit-effective encoding: N states are encoded with an N-bit register, each state has its own independent bit, and only one bit is 1 at any time. For example, with the 10 classes here, [0 0 0 1 0 0 0 0 0 0] is the one-hot vector of the class whose label id is 3.
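A small standalone illustration of the encoding (label 4, the dinosaur class in this data set, becomes a vector with a single 1 at position 4); np.eye is used here only as a stand-in for what tf.one_hot(ys, num_classes) does in the code below:
# Sketch: one-hot encoding for 10 classes (same idea as tf.one_hot(ys, num_classes))
import numpy as np
labels = np.array([4, 0, 9])        # label ids: dinosaur, people, food
one_hot = np.eye(10)[labels]        # each row has a single 1 at the label position
print(one_hot[0])                   # [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]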
# define the loss with cross entropy
losses = tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.one_hot(ys, num_classes),  # convert the labels to one-hot vectors
    logits=logits)
# mean loss
mean_loss = tf.reduce_mean(losses)
# define the optimizer with a learning rate of 0.0001
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(losses)
5. Model training and prediction
Define the flag variable train: when it is True, the training operation is performed and the trained model is saved; when it is False, prediction is performed and the image classification experiment is run on the 20% test set.
#------------------------------------ Step 4: Model training and prediction -----------------------------------
# used to save and load the model
saver = tf.train.Saver()
# train or predict
train = False
# model file path
model_path = "model/image_model"

with tf.Session() as sess:
    if train:
        print("Training mode")
        # initialize the parameters
        sess.run(tf.global_variables_initializer())
        # feed the inputs and labels; dropout is 0.25 during training
        train_feed_dict = {
            xs: x_train,
            ys: y_train,
            drop: 0.25
        }
        # train for 1000 steps
        for step in range(1000):
            _, mean_loss_val = sess.run([optimizer, mean_loss], feed_dict=train_feed_dict)
            if step % 50 == 0:   # print the loss every 50 steps
                print("step = {}\t mean loss = {}".format(step, mean_loss_val))
        # save the model
        saver.save(sess, model_path)
        print("Training finished, model saved to {}".format(model_path))
    else:
        print("Test mode")
        # load the trained parameters
        saver.restore(sess, model_path)
        print("Model loaded from {}".format(model_path))

        # mapping between label ids and label names
        label_name_dict = {
            0: "people",
            1: "beach",
            2: "building",
            3: "bus",
            4: "dinosaur",
            5: "elephant",
            6: "flower",
            7: "horse",
            8: "mountain",
            9: "food"
        }

        # feed the inputs and labels; dropout is 0 during testing
        test_feed_dict = {
            xs: x_val,
            ys: y_val,
            drop: 0
        }
        # true labels vs. predicted labels
        predicted_labels_val = sess.run(predicted_labels, feed_dict=test_feed_dict)
        for fpath, real_label, predicted_label in zip(fpaths_test, y_val, predicted_labels_val):
            # convert the label id to a label name
            real_label_name = label_name_dict[real_label]
            predicted_label_name = label_name_dict[predicted_label]
            print("{}\t{} => {}".format(fpath, real_label_name, predicted_label_name))
        # evaluation
        print("Number of correct predictions:", sum(y_val == predicted_labels_val))
        print("Accuracy:", 1.0 * sum(y_val == predicted_labels_val) / len(y_val))
6. Complete code and experimental results
The complete code is shown below. Part of it references the code of teacher Wang Shiye, whose blog is strongly recommended. Address: https://blog.csdn.net/wills798
"""
Created on Sun Dec 29 19:21:08 2019
@author: xiuzhang Eastmount CSDN
"""
import os
import glob
import cv2
import numpy as np
import tensorflow as tf
# define the image folder path
path = 'photo/'

#--------------------------------- Step 1: Read images -----------------------------------
def read_img(path):
    cate = [path + x for x in os.listdir(path) if os.path.isdir(path + x)]
    imgs = []
    labels = []
    fpath = []
    for idx, folder in enumerate(cate):
        # traverse each class folder and read its jpg files
        for im in glob.glob(folder + '/*.jpg'):
            #print('reading the images:%s' % (im))
            img = cv2.imread(im)            # read the pixels with OpenCV
            img = cv2.resize(img, (32, 32)) # resize so that all images have the same size
            imgs.append(img)                # image data
            labels.append(idx)              # image class label
            fpath.append(path + im)         # image path name
            #print(path+im, idx)
    return np.asarray(fpath, np.string_), np.asarray(imgs, np.float32), np.asarray(labels, np.int32)

# read the images
fpaths, data, label = read_img(path)
print(data.shape)  # (1000, 32, 32, 3)

# count how many classes there are
num_classes = len(set(label))
print(num_classes)

# shuffle the image order with a random permutation of the indices
num_example = data.shape[0]
arr = np.arange(num_example)
np.random.shuffle(arr)
data = data[arr]
label = label[arr]
fpaths = fpaths[arr]

# split into a training set (80%) and a test set (20%)
ratio = 0.8
s = np.int(num_example * ratio)
x_train = data[:s]
y_train = label[:s]
fpaths_train = fpaths[:s]
x_val = data[s:]
y_val = label[s:]
fpaths_test = fpaths[s:]
print(len(x_train), len(y_train), len(x_val), len(y_val))  # 800 800 200 200
print(y_val)
#--------------------------------- Step 2: Build the neural network -----------------------------------
# define the Placeholders
xs = tf.placeholder(tf.float32, [None, 32, 32, 3])  # each image has 32*32*3 pixel values
ys = tf.placeholder(tf.int32, [None])                # each sample has one label
# container for the DropOut rate
drop = tf.placeholder(tf.float32)                    # 0.25 during training, 0 during testing

# convolutional layer conv0
conv0 = tf.layers.conv2d(xs, 20, 5, activation=tf.nn.relu)    # 20 kernels of size 5, Relu activation
# max-pooling layer pool0
pool0 = tf.layers.max_pooling2d(conv0, [2, 2], [2, 2])        # 2x2 pooling window, 2x2 stride
print("Layer0:\n", conv0, pool0)

# convolutional layer conv1
conv1 = tf.layers.conv2d(pool0, 40, 4, activation=tf.nn.relu) # 40 kernels of size 4, Relu activation
# max-pooling layer pool1
pool1 = tf.layers.max_pooling2d(conv1, [2, 2], [2, 2])        # 2x2 pooling window, 2x2 stride
print("Layer1:\n", conv1, pool1)

# flatten the 3D features into a 1D vector
flatten = tf.layers.flatten(pool1)
# fully connected layer, mapped to a feature vector of length 400
fc = tf.layers.dense(flatten, 400, activation=tf.nn.relu)
print("Layer2:\n", fc)
# add DropOut to prevent overfitting
dropout_fc = tf.layers.dropout(fc, drop)

# output layer without activation
logits = tf.layers.dense(dropout_fc, num_classes)
print("Output:\n", logits)
# define the predicted labels
predicted_labels = tf.arg_max(logits, 1)
#--------------------------------- Step 3: Define the loss function and optimizer ---------------------------------
# define the loss with cross entropy
losses = tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.one_hot(ys, num_classes),  # convert the labels to one-hot vectors
    logits=logits)
# mean loss
mean_loss = tf.reduce_mean(losses)
# define the optimizer with a learning rate of 0.0001
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(losses)
#------------------------------------ Step 4: Model training and prediction -----------------------------------
# used to save and load the model
saver = tf.train.Saver()
# train or predict
train = False
# model file path
model_path = "model/image_model"

with tf.Session() as sess:
    if train:
        print("Training mode")
        # initialize the parameters
        sess.run(tf.global_variables_initializer())
        # feed the inputs and labels; dropout is 0.25 during training
        train_feed_dict = {
            xs: x_train,
            ys: y_train,
            drop: 0.25
        }
        # train for 1000 steps
        for step in range(1000):
            _, mean_loss_val = sess.run([optimizer, mean_loss], feed_dict=train_feed_dict)
            if step % 50 == 0:   # print the loss every 50 steps
                print("step = {}\t mean loss = {}".format(step, mean_loss_val))
        # save the model
        saver.save(sess, model_path)
        print("Training finished, model saved to {}".format(model_path))
    else:
        print("Test mode")
        # load the trained parameters
        saver.restore(sess, model_path)
        print("Model loaded from {}".format(model_path))

        # mapping between label ids and label names
        label_name_dict = {
            0: "people",
            1: "beach",
            2: "building",
            3: "bus",
            4: "dinosaur",
            5: "elephant",
            6: "flower",
            7: "horse",
            8: "mountain",
            9: "food"
        }

        # feed the inputs and labels; dropout is 0 during testing
        test_feed_dict = {
            xs: x_val,
            ys: y_val,
            drop: 0
        }
        # true labels vs. predicted labels
        predicted_labels_val = sess.run(predicted_labels, feed_dict=test_feed_dict)
        for fpath, real_label, predicted_label in zip(fpaths_test, y_val, predicted_labels_val):
            # convert the label id to a label name
            real_label_name = label_name_dict[real_label]
            predicted_label_name = label_name_dict[predicted_label]
            print("{}\t{} => {}".format(fpath, real_label_name, predicted_label_name))
        # evaluation
        print("Number of correct predictions:", sum(y_val == predicted_labels_val))
        print("Accuracy:", 1.0 * sum(y_val == predicted_labels_val) / len(y_val))
The training output results are as follows:
(1000, 32, 32, 3)
10
800 800 200 200
[2 8 6 9 9 5 2 2 9 3 7 0 6 0 0 1 3 2 7 3 4 6 9 5 8 6 4 1 1 4 4 8 6 2 6 1 2
5 0 7 9 5 2 4 6 8 7 5 8 1 6 5 1 4 8 1 9 1 8 8 6 1 0 5 3 3 1 2 9 1 8 7 6 0
8 1 8 0 2 1 3 5 3 6 9 8 7 5 2 5 2 8 8 8 4 2 2 4 3 5 3 3 9 1 1 5 2 6 7 6 7
0 7 4 1 7 2 9 4 0 3 8 7 5 3 8 1 9 3 6 8 0 0 1 7 7 9 5 4 0 3 0 4 5 7 2 2 3
0 8 2 0 2 3 5 1 7 2 1 6 5 8 1 4 6 6 8 6 5 5 1 7 2 8 7 1 3 9 7 1 3 6 0 8 7
5 8 0 1 2 7 9 6 2 4 7 7 2 8 0]
Layer0:
Tensor("conv2d_1/Relu:0", shape=(?, 28, 28, 20), dtype=float32)
Tensor("max_pooling2d_1/MaxPool:0", shape=(?, 14, 14, 20), dtype=float32)
Layer1:
Tensor("conv2d_2/Relu:0", shape=(?, 11, 11, 40), dtype=float32)
Tensor("max_pooling2d_2/MaxPool:0", shape=(?, 5, 5, 40), dtype=float32)
Layer2:
Tensor("dense_1/Relu:0", shape=(?, 400), dtype=float32)
Output:
Tensor("dense_2/BiasAdd:0", shape=(?, 10), dtype=float32)
Training mode
step = 0 mean loss = 66.93688201904297
step = 50 mean loss = 3.376957654953003
step = 100 mean loss = 0.5910811424255371
step = 150 mean loss = 0.061084795743227005
step = 200 mean loss = 0.013018212281167507
step = 250 mean loss = 0.006795921362936497
step = 300 mean loss = 0.004505819175392389
step = 350 mean loss = 0.0032660639844834805
step = 400 mean loss = 0.0024683878291398287
step = 450 mean loss = 0.0019308131886646152
step = 500 mean loss = 0.001541870180517435
step = 550 mean loss = 0.0012695763725787401
step = 600 mean loss = 0.0010685999877750874
step = 650 mean loss = 0.0009132082923315465
step = 700 mean loss = 0.0007910516578704119
step = 750 mean loss = 0.0006900889566168189
step = 800 mean loss = 0.0006068988586775959
step = 850 mean loss = 0.0005381597438827157
step = 900 mean loss = 0.0004809059901162982
step = 950 mean loss = 0.0004320790758356452
Training finished, model saved to model/image_model
The prediction output is shown below. In the end, 181 of the 200 test images are predicted correctly, an accuracy of 0.905, which is a very large improvement over the 0.50 average F value of the earlier machine-learning KNN experiment.
Test mode
INFO:tensorflow:Restoring parameters from model/image_model
Model loaded from model/image_model
b'photo/photo/3\\335.jpg' bus => bus
b'photo/photo/1\\129.jpg' beach => beach
b'photo/photo/7\\740.jpg' horse => horse
b'photo/photo/5\\564.jpg' elephant => elephant
...
b'photo/photo/9\\974.jpg' food => food
b'photo/photo/2\\220.jpg' building => bus
b'photo/photo/9\\912.jpg' food => food
b'photo/photo/4\\459.jpg' dinosaur => dinosaur
b'photo/photo/5\\525.jpg' elephant => elephant
b'photo/photo/0\\44.jpg' people => people
Number of correct predictions: 181
Accuracy: 0.905
4. Summary
This concludes the article. More TensorFlow deep learning articles will be shared later, covering experimental evaluation, RNN, LSTM, and professional cases in depth. I hope this basic article is helpful to you; if there are any errors or shortcomings, please bear with me. As a newcomer to artificial intelligence, I hope to keep improving and going deeper, then apply these techniques to image recognition, network security, adversarial samples, and other fields, and guide everyone to write simple academic papers. Come on!
References:
[1] Gonzalez. Digital Image Processing (3rd Edition) [M]. Beijing: Publishing House of Electronics Industry, 2013.
[2] Yang Xiuzhang, Yan Na. Python network data crawling and analysis from entry to proficiency (Analysis) [M]. Beijing: Beijing University of Aeronautics and Astronautics Press, 2018.
[3] Luo Zijiang et al. Image processing in Python[M]. Science Press, 2020.
[4] [python data mining course] 20. KNN nearest neighbor classification algorithm analysis detailed and balanced scale TXT data set reading
[5] TensorFlow [Simplified] CNN-Yellow_python Great God
[6] Develop sound source localization with phase information based on the directional activation function of deep neural network-Zhang Ziju Kevin
[7] TensorFlow combat: Chapter-5 (CNN-3-classical convolutional neural network (GoogleNet))-DFann
[8] https://github.com/siucaan/CNN_MNIST
[9] Image processing explanation-using CNN to classify images as an example-Bing Ji Ling
[10] Image defect classification based on CNN-BellaVita1
[12] tensorflow (6) training and classifying your own pictures (CNN super detailed entry version)-Missayaa
[13] Detailed explanation of tensorflow training its own data set to achieve CNN image classification-Wang Shiye (strong push)
[14] https://github.com/hujunxianligong/Tensorflow-CNN-Tutorial
[15] tensorflow (three): image classification with CNN-flowrush
[16] TensorFlow image recognition (object classification) introductory tutorial-cococok2
[17] https://github.com/calssion/Fun_AI
[18] CNN Picture Classification-Fire Dance_流沙
[19] CNN image single-label classification (based on TensorFlow to implement the basic VGG16 network)
[20] https://github.com/siucaan/CNN_MNIST/blob/master/cnn_mnist_TF.py
[21] Tensorflow implements CNN for MNIST recognition-siucaan (strong push)
[22] Use Anaconda3 to install tensorflow, opencv, so that it can run in spyder