Reading CSV files with TensorFlow

1. Create the file list

import os

path = './data/csvdata/'
file_names = os.listdir(path)
file_list = [os.path.join(path, file_name) for file_name in file_names]
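Note that os.listdir returns every entry in the directory, so a stray non-CSV file or a subdirectory would also end up in the list. A minimal sketch of a stricter listing, assuming the data files all end in .csv (the helper name list_csv_files is hypothetical, not part of the original code):

```python
import os


def list_csv_files(path):
    """Return the sorted full paths of the .csv files directly under path."""
    return sorted(
        os.path.join(path, name)
        for name in os.listdir(path)
        # Keep regular files with a .csv extension; skip everything else.
        if name.lower().endswith('.csv')
        and os.path.isfile(os.path.join(path, name))
    )
```

The result can be passed along in place of file_list, for example file_list = list_csv_files('./data/csvdata/').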

2. Load the file list into a filename queue

file_queue = tf.train.string_input_producer(file_list)

3. Build a reader to read lines from the queued files

reader = tf.TextLineReader()
key, value = reader.read(file_queue)

4. Decode

# record_defaults gives one default per column; its type also sets the column's type
records = [['None'], ['None']]
example, label = tf.decode_csv(value, record_defaults=records)
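To make the role of record_defaults concrete, here is a plain-Python analogue of the decoding step. It is only a sketch of the idea, not how TensorFlow implements it: the function decode_line is hypothetical, and it handles string columns only. Each line must have as many fields as there are defaults, and an empty field is replaced by the default for its column.

```python
import csv


def decode_line(line, record_defaults):
    """Split one CSV line and fill empty fields with the per-column default."""
    fields = next(csv.reader([line]))
    if len(fields) != len(record_defaults):
        raise ValueError('expected %d fields, got %d'
                         % (len(record_defaults), len(fields)))
    return [field if field != '' else default[0]
            for field, default in zip(fields, record_defaults)]


records = [['None'], ['None']]
example, label = decode_line('hello,1', records)
```

With these defaults, a line such as 'hello,' (empty second column) decodes to 'hello' and 'None'.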

5. Batch

batch_example, batch_label = tf.train.batch([example, label], batch_size=9, num_threads=2, capacity=100)
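Conceptually, batching collects batch_size single (example, label) pairs and hands them back as one group of examples and one group of labels. A rough plain-Python picture of that grouping follows; batch_pairs is a hypothetical helper, it runs in a single thread with no queue, and it simply drops a trailing incomplete group:

```python
from itertools import islice


def batch_pairs(pairs, batch_size):
    """Yield (examples, labels) lists holding exactly batch_size items each."""
    it = iter(pairs)
    while True:
        chunk = list(islice(it, batch_size))
        if len(chunk) < batch_size:
            return  # the incomplete final chunk is dropped in this sketch
        examples, labels = zip(*chunk)
        yield list(examples), list(labels)
```

For instance, seven pairs with batch_size=3 produce two batches of three, and the seventh pair is left over.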

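6. Start the queue-runner threads in a TF session

# sess is an already opened tf.Session()
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess, coord=coord)
# ... fetch data here with sess.run(...) before stopping the threads ...
coord.request_stop()
coord.join(threads)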
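The start / request_stop / join sequence can be pictured with ordinary Python threads. The sketch below is an analogue only, not TensorFlow's implementation: producer and read_n are hypothetical names, a threading.Event stands in for coord.request_stop(), and Thread.join for coord.join(threads). Unlike string_input_producer, which shuffles by default, this sketch cycles through its input in order.

```python
import queue
import threading


def producer(lines, q, stop):
    """Keep feeding lines into q, cycling forever, until stop is set."""
    i = 0
    while not stop.is_set():
        try:
            q.put(lines[i % len(lines)], timeout=0.1)
            i += 1
        except queue.Full:
            continue  # queue is full; loop around and re-check the stop flag


def read_n(lines, n, capacity=4):
    """Read n items from a background producer, then shut it down cleanly."""
    q = queue.Queue(maxsize=capacity)
    stop = threading.Event()
    worker = threading.Thread(target=producer, args=(lines, q, stop))
    worker.start()                 # comparable to start_queue_runners
    try:
        return [q.get() for _ in range(n)]
    finally:
        stop.set()                 # comparable to coord.request_stop()
        worker.join()              # comparable to coord.join(threads)
```

The try/finally matters: if reading fails halfway, the background thread is still asked to stop and is joined, so the program does not hang on exit.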

Full code

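import tensorflow as tf
import os

'''
 1. Build the file queue
 2. Read from the queue; by default one sample is read at a time
    1. CSV file: read one line
    2. Binary file: read the number of bytes that make up one sample
    3. Image file: read one image at a time
 3. Decode
 4. Read the files in batches
 5. The main thread fetches sample data for training
'''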


def csv_read(file_list):
    # 1. Build the file queue
    file_queue = tf.train.string_input_producer(file_list)
    # 2. Build the reader and read one line at a time
    reader = tf.TextLineReader()
    key, value = reader.read(file_queue)
    # 3. Decode the line; both columns are strings with default "None"
    record = [["None"], ["None"]]
    example, label = tf.decode_csv(value, record_defaults=record)
    # 4. Batch
    batch_example, batch_label = tf.train.batch([example, label], batch_size=9, num_threads=1, capacity=9)
    return batch_example, batch_label


if __name__ == '__main__':
    file_names = os.listdir("./data/csvdata")

    file_list = [os.path.join('./data/csvdata', file) for file in file_names]
    # print(file_list)
    example, label = csv_read(file_list)
    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess, coord=coord)
        print(sess.run([example, label]))
        coord.request_stop()
        coord.join(threads)

捕风
