Problems using the inception_resnet_v2 pretrained network for classification in TensorFlow

Lippp

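1 Problem description

This exercise uses the code from the slim framework. I want to take the pretrained inception_resnet_v2 network and train it on my own dataset for classification. However, I got the following error: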

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [764] rhs shape= [1001]
     [[Node: save/Assign_8 = Assign[T=DT_FLOAT, _class=["loc:@InceptionResnetV2/AuxLogits/Logits/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](InceptionResnetV2/AuxLogits/Logits/biases, save/RestoreV2_8)]]

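As far as I can tell, this means a tensor with 1001 elements is being assigned to a variable that only has 764 elements, which causes the error.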
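To make the mismatch concrete, here is a minimal sketch in plain Python (no TensorFlow needed) that compares variable shapes in the graph against shapes stored in a checkpoint. The helper `find_shape_mismatches` and the two shape dictionaries are hypothetical and only modelled on the error message above; in a real run the checkpoint side would have to be read from the checkpoint file itself, for example with something like `tf.train.list_variables`, which is an assumption about the TensorFlow version in use.

```python
def find_shape_mismatches(graph_shapes, checkpoint_shapes):
    """Return {name: (graph_shape, checkpoint_shape)} for variables whose shapes differ."""
    mismatches = {}
    for name, graph_shape in graph_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        # Variables missing from the checkpoint are skipped; only shape conflicts are reported.
        if ckpt_shape is not None and tuple(ckpt_shape) != tuple(graph_shape):
            mismatches[name] = (tuple(graph_shape), tuple(ckpt_shape))
    return mismatches


# Hypothetical shapes modelled on the error: 764 outputs in the graph, 1001 in the checkpoint.
graph_shapes = {
    "InceptionResnetV2/AuxLogits/Logits/biases": [764],
    "InceptionResnetV2/Conv2d_1a_3x3/weights": [3, 3, 3, 32],
}
checkpoint_shapes = {
    "InceptionResnetV2/AuxLogits/Logits/biases": [1001],
    "InceptionResnetV2/Conv2d_1a_3x3/weights": [3, 3, 3, 32],
}

for name, (lhs, rhs) in find_shape_mismatches(graph_shapes, checkpoint_shapes).items():
    print(f"{name}: graph {lhs} vs checkpoint {rhs}")
```

In this reading, the left-hand side (764) is the variable built for my own dataset and the right-hand side (1001) is the value stored in the pretrained checkpoint.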

2

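Searching around, I found that others have hit a similar problem and fixed it by deleting the checkpoint data left over from earlier training. But I am running on tinymind (essentially a cloud computing service), so there should be no data left over from earlier training runs.
I also tried modifying the slim framework code, without success (I may not have changed it correctly).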
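One thing that may be worth checking before editing the framework code: slim's train_image_classifier.py has a `--checkpoint_exclude_scopes` flag that skips the listed scopes when restoring from the pretrained checkpoint. The post does not show the command line that was used, so whether the flag was set is unknown. The sketch below only illustrates the prefix-filtering idea in plain Python; `select_variables_to_restore` and the variable names are hypothetical and this is not the slim source itself.

```python
def select_variables_to_restore(variable_names, exclude_scopes):
    """Keep only variables whose names do not start with an excluded scope.

    exclude_scopes is a comma-separated string, e.g. "Model/Logits,Model/AuxLogits".
    """
    exclusions = [scope.strip() for scope in exclude_scopes.split(",") if scope.strip()]
    return [
        name for name in variable_names
        if not any(name.startswith(scope) for scope in exclusions)
    ]


# Hypothetical variable names; the AuxLogits one is taken from the error message.
variable_names = [
    "InceptionResnetV2/Conv2d_1a_3x3/weights",
    "InceptionResnetV2/Logits/Logits/weights",
    "InceptionResnetV2/AuxLogits/Logits/biases",
]
to_restore = select_variables_to_restore(
    variable_names,
    "InceptionResnetV2/Logits, InceptionResnetV2/AuxLogits",
)
print(to_restore)
```

If the two classifier scopes are excluded this way, the 1001-class layers are never assigned to the 764-class variables, and those layers start from their fresh initial values instead.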

Relevant code


Caused by op 'save/Assign_8', defined at:
  File "./train_image_classifier.py", line 581, in <module>
    tf.app.run()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "./train_image_classifier.py", line 571, in main
    init_fn=_get_init_fn(),
  File "./train_image_classifier.py", line 369, in _get_init_fn
    ignore_missing_vars=FLAGS.ignore_missing_vars)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 688, in assign_from_checkpoint_fn
    saver = tf_saver.Saver(var_list, reshape=reshape_variables)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1239, in __init__
    self.build()
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1248, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1284, in _build
    build_save=build_save, build_restore=build_restore)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal
    restore_sequentially, reshape)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 440, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 160, in restore
    self.op.get_shape().is_fully_defined())
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
    validate_shape=validate_shape)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 59, in assign
    use_locking=use_locking, name=name)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

3

Has anyone run into this situation (where stale checkpoint data is not the cause)? Thanks in advance.
slim framework
https://github.com/tensorflow...

2 answers

Still not solved. I feel like giving up on this model.

除夕之月

After I deleted the checkpoint in train_dir, training worked. I was using inceptionv3.
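A quick way to check for leftovers is to look for the `checkpoint` state file that TensorFlow writes into a training directory. A minimal sketch, where `has_checkpoint_state` is a hypothetical helper and the temporary directory stands in for the real train_dir:

```python
import os
import tempfile


def has_checkpoint_state(train_dir):
    """Return True if train_dir contains TensorFlow's 'checkpoint' state file."""
    return os.path.isfile(os.path.join(train_dir, "checkpoint"))


# Demonstration on a temporary directory standing in for train_dir.
with tempfile.TemporaryDirectory() as train_dir:
    before = has_checkpoint_state(train_dir)
    # Simulate a leftover from an earlier run by creating an empty state file.
    open(os.path.join(train_dir, "checkpoint"), "w").close()
    after = has_checkpoint_state(train_dir)
    print(before, after)
```

If the file is there, the training script may resume from it rather than from the pretrained checkpoint, which would explain why clearing train_dir helped in my case.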
