TensorFlow - Read all examples from a TFRecords at once?

Question

How do you read all examples from a TFRecords file at once?

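I've been using tf.parse_single_example to read out individual examples, with code similar to the read_and_decode method in the fully_connected_reader example. However, I want to run the network against my entire validation dataset at once, so I'd like to load all of the examples together.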

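I'm not entirely sure, but the documentation seems to suggest I can use tf.parse_example instead of tf.parse_single_example to load the entire TFRecords file at once. I can't get this to work, though. I'm guessing it has to do with how I specify the features, but I'm not sure how to state in the feature specification that there are multiple examples.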

In other words, my attempt at using something like:

reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_example(serialized_example, features={
    'image_raw': tf.FixedLenFeature([], tf.string),
    'label': tf.FixedLenFeature([], tf.int64),
})

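doesn't work, which I assume is because the features aren't expecting multiple examples at once (but again, I'm not sure). [This results in the error ValueError: Shape () must have rank 1]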

Is this the proper way to read all the records at once? If so, what do I need to change to actually read the records? Thank you very much!

Originally posted by golmschenk; translation licensed under CC BY-SA 4.0

python tensorflow tfrecord

1 Answer

Community wiki

Posted on 2023-01-08

Just for clarity: I have a few thousand images in a single .tfrecords file. They are 720 x 720 RGB PNG files, and each label is one of 0, 1, 2, 3.

I also tried using parse_example and couldn't make it work, but this solution works with parse_single_example.
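For reference, here is a minimal sketch of the input that parse_example appears to expect: a rank-1 batch of serialized strings rather than a single scalar string, which would be consistent with the "Shape () must have rank 1" error in the question. It assumes TensorFlow 2.x with eager execution, where the parsing functions live under tf.io. The make_example helper and the single 'label' feature are made up for illustration and are not from the original post.

```python
import tensorflow as tf

def make_example(label):
    # Hypothetical helper: serialize one tf.train.Example holding an int64 label.
    feature = {
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()

# The feature spec describes ONE example; the batch dimension is added on output.
feature_spec = {"label": tf.io.FixedLenFeature([], tf.int64)}

# parse_example takes a 1-D tensor of serialized examples (a batch).
serialized_batch = tf.constant([make_example(i) for i in range(4)])
parsed = tf.io.parse_example(serialized_batch, feature_spec)

labels = parsed["label"].numpy().tolist()
print(labels)
```

Note that the feature specification itself does not change for multiple examples; only the shape of the serialized input does.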

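The downside is that right now I have to know how many items are in each .tfrecords file, which is kind of a bummer. If I find a better way, I'll update the answer. Also, be careful not to go past the number of records in the .tfrecords file: if you loop past the last record, reading starts over at the first record.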
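One possible way around hard-coding the record count is to count the records up front by walking the file's framing. This is only a sketch: it assumes the standard TFRecord layout (an 8-byte little-endian length, a 4-byte CRC of the length, the payload, then a 4-byte CRC of the payload), it does not verify the CRCs, and it assumes the file is uncompressed. The function name count_tfrecords is made up for this example.

```python
import struct

def count_tfrecords(path):
    """Count records in an uncompressed TFRecord file without parsing payloads."""
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)  # uint64 little-endian payload length
            if len(header) < 8:
                break  # end of file (or a truncated trailing header)
            (length,) = struct.unpack("<Q", header)
            # Skip the length CRC (4 bytes), the payload, and the payload CRC (4 bytes).
            f.seek(4 + length + 4, 1)
            count += 1
    return count
```

The result could then replace the fixed number in the loop, for example `for i in range(count_tfrecords(FILE))`.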

The trick was to have the queue runners use a coordinator.

I left some code in here that saves the images as they're being read, so that you can verify each image is correct.

import os

from PIL import Image
import tensorflow as tf

# Written against the TensorFlow 1.x queue-based input pipeline.

def read_and_decode(filename_queue):
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        # Defaults are not specified since all keys are required.
        features={
            'image_raw': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64),
            'height': tf.FixedLenFeature([], tf.int64),
            'width': tf.FixedLenFeature([], tf.int64),
            'depth': tf.FixedLenFeature([], tf.int64)
        })
    # The image was stored as raw bytes; decode it into a flat uint8 tensor.
    image = tf.decode_raw(features['image_raw'], tf.uint8)
    label = tf.cast(features['label'], tf.int32)
    height = tf.cast(features['height'], tf.int32)
    width = tf.cast(features['width'], tf.int32)
    depth = tf.cast(features['depth'], tf.int32)
    return image, label, height, width, depth

def get_all_records(FILE):
    with tf.Session() as sess:
        filename_queue = tf.train.string_input_producer([FILE])
        image, label, height, width, depth = read_and_decode(filename_queue)
        # tf.pack was renamed to tf.stack in TensorFlow 1.0.
        image = tf.reshape(image, tf.stack([height, width, 3]))
        image.set_shape([720, 720, 3])
        # tf.initialize_all_variables is the deprecated name for this op.
        init_op = tf.global_variables_initializer()
        sess.run(init_op)
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)
        os.makedirs("output", exist_ok=True)
        # 2053 is the number of records in this particular file; reading
        # past it wraps around to the first record.
        for i in range(2053):
            example, l = sess.run([image, label])
            img = Image.fromarray(example, 'RGB')
            img.save("output/" + str(i) + '-train.png')

            print(example, l)
        coord.request_stop()
        coord.join(threads)

get_all_records('/path/to/train-0.tfrecords')
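As a possible alternative that does not need the record count at all, newer TensorFlow releases provide tf.data.TFRecordDataset, which simply stops once the file is exhausted. The sketch below assumes TensorFlow 2.x with eager execution and the same 'image_raw' and 'label' feature names used above. The function names parse_record and read_all_records are made up for this example, and this is not the method the answer above used.

```python
import tensorflow as tf

def parse_record(serialized):
    # Parse one serialized Example using the same feature names as above.
    features = tf.io.parse_single_example(serialized, {
        "image_raw": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    })
    image = tf.io.decode_raw(features["image_raw"], tf.uint8)
    label = tf.cast(features["label"], tf.int32)
    return image, label

def read_all_records(path):
    # Iteration ends when the file is exhausted; no record count is needed.
    dataset = tf.data.TFRecordDataset(path).map(parse_record)
    return [(image.numpy(), int(label.numpy())) for image, label in dataset]
```

The decoded images come back as flat uint8 arrays, so they would still need a reshape to (height, width, 3) before saving or feeding them to a network.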

Originally posted by Andrew Pierno; translation licensed under CC BY-SA 3.0
