Building TensorFlow 1.x from Source and Calling the C++ API

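Plenty of people have already shared how to set up a TensorFlow C++ environment, so here I only record the pitfalls I ran into while calling the TensorFlow API from a C++ application.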

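I. Environment and versions
System environment: Docker running CentOS 7. The host is a company server with 8 NVIDIA A-series GPUs; the exact model and versions can be checked with nvidia-smi and are not recorded publicly here.
TensorFlow version: 1.13.2.
Python: 2.7.
JDK: 8.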

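II. Preparing the environment for building TensorFlow from source
1. Install bazel
Versions have to match here: the TensorFlow website lists which bazel version goes best with each TensorFlow release. Download the installer script for the matching bazel version and run it, then add an environment variable to ~/.bashrc:
export PATH=/usr/local/bin/:$PATH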

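2. Install protobuf
An attempt to build TensorFlow at this point fails because all sorts of header files cannot be found. Installing bazel alone is clearly not enough; protobuf and Eigen are needed as well.
protobuf did not seem to be tied to a particular version, so I installed 3.6.1. Download the source package from the official site, then run: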

#./autogen.sh
#./configure --prefix=/usr/local/bin  # the install location can be chosen here
#make
#make check
#make install

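I hit errors during make check, but according to what I read online they can safely be ignored...
The TensorFlow build uses some pb files under bazel-genfiles, and those files are generated by protobuf. If the protobuf that generated them differs in version from the one used at compile time, the build fails as well: the generated pb.h headers state the version constraints they require.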
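
To make the version constraint concrete, here is a minimal C++ sketch of how I understand the guard at the top of a generated pb.h to behave. It assumes protobuf's integer version encoding (major * 1000000 + minor * 1000 + patch, so 3.6.1 becomes 3006001). The names DecodeProtobufVersion, CheckGeneratedHeader and GuardResult are hypothetical helpers for illustration, not protobuf APIs.

```cpp
#include <cassert>
#include <string>

// Decoded form of an encoded protobuf version such as 3006001.
struct ProtobufVersion {
    int major_num;
    int minor_num;
    int patch_num;
};

// Splits the encoding major * 1000000 + minor * 1000 + patch.
ProtobufVersion DecodeProtobufVersion(int encoded) {
    assert(encoded >= 0);
    return ProtobufVersion{encoded / 1000000, (encoded / 1000) % 1000, encoded % 1000};
}

// Formats a decoded version as "major.minor.patch".
std::string ToString(const ProtobufVersion& v) {
    return std::to_string(v.major_num) + "." + std::to_string(v.minor_num) + "." +
           std::to_string(v.patch_num);
}

enum class GuardResult { kOk, kHeadersTooOld, kProtocTooOld };

// Mirrors the two preprocessor checks (as assumed in this sketch):
//   header_version    : version of the installed protobuf headers
//   header_min_protoc : oldest protoc output those headers still accept
//   generated_by      : version of the protoc that produced the pb.h
GuardResult CheckGeneratedHeader(int header_version, int header_min_protoc,
                                 int generated_by) {
    if (header_version < generated_by) return GuardResult::kHeadersTooOld;
    if (generated_by < header_min_protoc) return GuardResult::kProtocTooOld;
    return GuardResult::kOk;
}
```

For example, ToString(DecodeProtobufVersion(3006001)) yields "3.6.1", and a pb.h produced by a newer protoc than the installed headers is reported as kHeadersTooOld.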

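3. Install Eigen
I have not found any version restriction for Eigen either, so I simply installed the latest release, 3.3.9. Again, download the source package from the official site, unpack it, enter the directory, and run: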

#mkdir build
#cd build
#cmake ..
#make install

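The INSTALL file contains the installation instructions.
protobuf and Eigen can in fact be installed directly with build_all_linux.sh under tensorflow/contrib/makefile/, but the downloads usually fail on a poor network connection. That script in turn calls download_dependencies.sh, which has a problem of its own... After searching online, I changed the decompression part following a fix someone else had posted: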

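if [[ "${url}" == *"gz" ]]; then            # the "" were missing; keep * outside them so it still acts as a wildcard
    ···
elif [[ "${url}" == *"zip" ]]; then         # the "" were missing; keep * outside them so it still acts as a wildcard
    ···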

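There are also some replace_by_sed calls at the end that look like they target other platform architectures, so I commented those out too. The script did run afterwards, but with the poor connection the downloads never finished, so I never verified whether it really works. Take this as a reference only.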

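4. Install CUDA and cuDNN
Building TensorFlow for GPU requires enabling the CUDA option, so CUDA obviously has to be installed.
I installed version 10.0. For my hardware the officially recommended version is 11.0, but either seems to be usable without much difference, so my guess is that 11.0 would work too.
Online installation was too painful, so I dug out an old offline installer, cuda_10.0.130_410.48_linux.run, ran it, and chose the options as needed. I later realised that this version may not be the right fit for this hardware, but it works as installed. After installation nvidia-smi shows a CUDA version; it reports 11.0 here, yet everything works. I suspect this has to do with the driver... There are explanations of this online, but since it does not affect usage I have left it alone for now. Then add the environment variables: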

export PATH="/usr/local/cuda-10.0/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH"

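cuDNN is mandatory: the TensorFlow configure step asks for cuDNN support, and without it the build fails with cuDNN-related errors.
cuDNN is just a few shared libraries and header files; download or locate them and put them in the right directories. Put cudnn.h under cuda-10.0/include in the CUDA installation directory, and put libcudnn.so.7.6.5 (the version I used; it presumably has to match the CUDA version) and libcudnn_static.a under cuda-10.0/lib64. Then create two symlinks, libcudnn.so -> libcudnn.so.7 -> libcudnn.so.7.6.5. The library is loaded under the name libcudnn.so, so make sure that name exists and resolves to a valid libcudnn shared library.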

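5. Install the NVIDIA GPU driver
The next TensorFlow build failed yet again, this time because libcuda.so.1 could not be found. I never fully figured this file out. All I know is that it lives in /usr/lib or /usr/lib/x86_64-linux-gnu and points to a libcuda.so.xxx.xx library (xxx is the driver version, the Driver Version shown by nvidia-smi), and I indeed did not have it. Experiments suggested that it belongs to the GPU driver: after I reinstalled the driver it was there. Download the .run installer from the NVIDIA website and run it. It complained that some kernel module was already loaded, probably because I had installed CUDA and nvidia-smi first, so I appended the options -a -N --ui=none --no-kernel-module, that is:
sh NVIDIA-Linux-x86_64-xxx.run -a -N --ui=none --no-kernel-module
where xxx is again the matching driver version.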

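6. gcc
To my surprise the build still failed. The error said that some gcc header files were missing, stdint.h for example, so the gcc version was presumably too old; I had casually installed 4.8.5 at first. After installing devtoolset-7, which upgraded gcc to 7.3.1, the problem went away.

7. abseil is also needed on the include path; downloading abseil-cpp-master and unpacking it is enough.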

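III. Building TensorFlow's libtensorflow_cc.so
First run ./configure
I need the GPU, so CUDA support must be answered with y; apart from CUDA and its related options, the defaults are fine.
Build command:
#bazel build --config=opt --config=cuda //tensorflow:libtensorflow_cc.so
The --config=cuda option is required for GPU use.
A successful build produces two files, libtensorflow_cc.so and libtensorflow_framework.so, which are the final artifacts needed; they are located under bazel-bin/tensorflow.
... The hardest part of building TensorFlow is the network, namely the GFW obstacle. The reason I went to the trouble of finding a Docker environment is precisely that nothing could be downloaded on the company server.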

IV. Building your own application
1. Calling the API
Following image-processing examples found online, I wrote a speech recognition demo.

#include <algorithm>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>
 
#include "tensorflow/cc/ops/const_op.h"
#include "tensorflow/cc/ops/image_ops.h"
#include "tensorflow/cc/ops/standard_ops.h"
 
#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/framework/tensor.h"
 
#include "tensorflow/core/graph/default_device.h"
#include "tensorflow/core/graph/graph_def_builder.h"
 
#include "tensorflow/core/lib/core/errors.h"
#include "tensorflow/core/lib/core/stringpiece.h"
#include "tensorflow/core/lib/core/threadpool.h"
#include "tensorflow/core/lib/io/path.h"
#include "tensorflow/core/lib/strings/stringprintf.h"
 
#include "tensorflow/core/public/session.h"
#include "tensorflow/core/util/command_line_flags.h"

#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/platform/init_main.h"
#include "tensorflow/core/platform/logging.h"
#include "tensorflow/core/platform/types.h"
 
 
using namespace tensorflow::ops;
using namespace tensorflow;
using namespace std;
using tensorflow::Flag;
using tensorflow::Tensor;
using tensorflow::Status;
using tensorflow::string;
using tensorflow::int32 ;
 

int main(int argc, char** argv )
{
    /*--------------------------------Key configuration------------------------------*/
    string model_path="model.pb";       // load the model straight from its pb file
    string input_tensor_name="InputName";   // name of the input node
    string output_tensor_name="forPbOutput";    // name of the output node
 
    /*--------------------------------Create the session------------------------------*/
    Session* session;
    Status status = NewSession(SessionOptions(), &session); // create a new Session
    if (!status.ok()) {
        cout << "ERROR: Creating session failed..." << status.ToString() << std::endl;
        return -1;
    }
 
    /*--------------------------------Read the model from the pb file--------------------------------*/
    GraphDef graphdef; //Graph Definition for current model
 
    Status status_load = ReadBinaryProto(Env::Default(), model_path, &graphdef); // read the graph from the pb file
    if (!status_load.ok()) {
        cout << "ERROR: Loading model failed..." << model_path << std::endl;
        cout << status_load.ToString() << "\n";
        return -1;
    }
    Status status_create = session->Create(graphdef); // import the graph into the Session
    if (!status_create.ok()) {
        cout << "ERROR: Creating graph in session failed..." << status_create.ToString() << std::endl;
        return -1;
    }
    cout << "<----Successfully created session and load graph.------->"<< endl;
 
    /*---------------------------------Load the test data-------------------------------------*/
    cout<<endl<<"<------------loading test_data-------------->"<<endl;

    //read input feature data
    // The test data comes from a text file of feature values holding a single utterance
    // made of several fixed-length frames. The parsing depends on that file's text
    // layout, so it is of limited reference value.
    fstream featFile;
    featFile.open("testfeat.txt");
    if (!featFile.is_open()) {
        cout << "ERROR: cannot open testfeat.txt" << endl;
        return -1;
    }
    string line;
    getline(featFile, line);
    istringstream str(line);
    string wavName;
    str >> wavName;
    vector<float> feats;
    int row = 0;

    while (getline(featFile, line)) {   // stop at end of file instead of looping forever
        istringstream rowData(line);
        row++;
        string data;
        while (rowData >> data) {
            if (data != "]") {
                float value = atof(data.c_str());
                feats.push_back(value);
            }
        }
        if (data == "]") {
            break;
        }
    }

    cout<<"feature frames = "<<row;

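    //create a tensor for input features
    // The values are placed into the Tensor through flat(): all data is first collected
    // in a one-dimensional vector and then copied into the tensor's flat view.
    if (feats.size() != static_cast<size_t>(row) * 320) {   // guard against overrunning the tensor buffer
        cout << "ERROR: expected " << row * 320 << " feature values, got " << feats.size() << endl;
        return -1;
    }
    Tensor resized_tensor(DT_FLOAT, TensorShape({1, row, 320}));    // the dtype and shape of the input depend on the model definition
    std::copy_n(feats.begin(), feats.size(), resized_tensor.flat<float>().data());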
 
    /*-----------------------------------Run the network on the test data-----------------------------------------*/
    cout<<endl<<"<-------------Running the model with test_data--------------->"<<endl;
    // forward pass; the results always come back as a vector of tensors
    vector<tensorflow::Tensor> outputs;
    string output_node = output_tensor_name;
    Status status_run = session->Run({{input_tensor_name, resized_tensor}}, {output_node}, {}, &outputs);
 
    if (!status_run.ok()) {
        cout << "ERROR: RUN failed..."  << std::endl;
        cout << status_run.ToString() << "\n";
        return -1;
    }
    // extract the output values
    cout << "Output tensor size:" << outputs.size() << std::endl;
    for (std::size_t i = 0; i < outputs.size(); i++) {
        cout << outputs[i].DebugString()<<endl;
    }
 
    Tensor t = outputs[0];                   // Fetch the first tensor
    //TensorShape shape = t.shape();        // the output is two-dimensional
    //cout << shape.dims() << endl;
    //cout << shape.dim_size(1) << endl;
    int output_dim = t.shape().dim_size(1); // length of each result
    cout << "dim_size =" << output_dim << endl;
    for (int i = 0; i < output_dim; i++) {
        cout << t.tensor<tensorflow::int32, 2>()(0, i) << "  "; // the element type here depends on how the model defines its output
    }
    }
    cout << endl;
    
    
    return 0;
}
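
The feature-reading loop in the demo can also be pulled out into a standalone function that is testable without TensorFlow. The following is only a sketch under assumptions: it presumes the text layout the demo appears to read (a header line whose first token is the utterance name, then one frame per line, with "]" closing the last frame), and FeatureMatrix and ParseFeatureMatrix are hypothetical names, not part of any library. Unlike the demo, it rejects frames of the wrong length and does not count a line that holds only "]".

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for one utterance; not part of TensorFlow.
struct FeatureMatrix {
    std::string name;           // first token of the header line
    std::size_t rows = 0;       // number of frames read
    std::vector<float> values;  // frame data in row-major order
};

// Parses text of the assumed form
//   utt_name  [
//     v v v ...
//     v v v ... ]
// Returns false if "]" never appears or a frame does not hold `dim` values.
// std::stof throws std::invalid_argument on a non-numeric token.
bool ParseFeatureMatrix(std::istream& in, std::size_t dim, FeatureMatrix* out) {
    assert(out != nullptr && dim > 0);
    std::string line;
    if (!std::getline(in, line)) return false;
    std::istringstream header(line);
    header >> out->name;

    out->rows = 0;
    out->values.clear();
    while (std::getline(in, line)) {
        std::istringstream row(line);
        std::string token;
        std::size_t count = 0;
        bool closed = false;
        while (row >> token) {
            if (token == "]") {
                closed = true;
                break;
            }
            out->values.push_back(std::stof(token));
            ++count;
        }
        if (count > 0) {
            if (count != dim) return false;  // frame length mismatch
            ++out->rows;
        }
        if (closed) return true;
    }
    return false;  // reached end of input without the closing "]"
}
```

With a successful result, the tensor in the demo could be shaped as {1, rows, 320} and filled by copying values into its flat view, exactly as the demo does with feats.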

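2. The header directories that must be on the include path when compiling are:
tensorflow-1.13.2
eigen-3.3.9
abseil-cpp-master
bazel-genfiles
The shared libraries are libtensorflow_cc.so and libtensorflow_framework.so, under bazel-bin/tensorflow/
I compiled the test program from inside the tensorflow-1.13.2 directory:
#g++ test.cpp -I . -I eigen-3.3.9 -I abseil-cpp-master -I bazel-genfiles/ -L bazel-bin/tensorflow -ltensorflow_cc -ltensorflow_framework
Before running the executable, the library path also has to be exported:
export LD_LIBRARY_PATH=bazel-bin/tensorflow:$LD_LIBRARY_PATH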

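V. Building your own application against the TensorFlow shared libraries in another environment
On a machine without the build environment, merely running the compiled program requires protobuf and eigen3 to be installed, in addition to providing libtensorflow_cc.so and libtensorflow_framework.so.
To compile your own application against libtensorflow_cc.so and libtensorflow_framework.so, you need, on top of the runtime environment above, eigen and abseil, the bazel-genfiles output generated when bazel builds tensorflow, and the tensorflow and third_party directories from the tensorflow source tree.
#g++ test.cpp -I . -I eigen-3.3.9 -I abseil-cpp-master -I bazel-genfiles/ -L bazel-bin/tensorflow -ltensorflow_cc -ltensorflow_framework -I /usr/local/include/eigen3/
When running the executable, LD_LIBRARY_PATH must include the directory holding the two tensorflow .so files as well as the CUDA library directory, which is usually /usr/local/cuda-10.0/lib64.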

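VI. Problems encountered
1. cuInit failed with an UNKNOWN_ERROR. Update 2021.03.02: the GPU being unusable may be related to my running inside Docker; after moving the compiled program to a physical machine the problem disappeared.
2. The error tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc:134] Unknown compute capability (7, 5) .Defaulting to telling LLVM that we're compiling for sm_30 appeared. Compute capability 7.5 is presumably not supported: the supported range that configure displayed during the build was 3.0-7.0, and the function at the location named in the message lists the compute capabilities this tensorflow version supports. 1.13.2 indeed does not include 7.5, whereas 1.14.0 does, so I decided to test whether 1.14.0 avoids the problem.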
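(1) tensorflow-r1.14
Building tensorflow-r1.14 failed with an error at from builtins import bytes # pylint: disable=redefined-builtin. A search suggested that Python 2.7 may lack the builtins module. Installing future fixed it:
pip install future
Next the build complained that google/protobuf/port_def.inc was missing. I looked in the install path and it really was not there... Apparently 3.6.1 really does not ship it... So I had to uninstall it and install a newer version, but too new does not work either: with 3.15.1 installed, the .pb.h files generated when rebuilding tensorflow still said 3.7.x, which I guess is related to the bazel version. Reinstalling 3.7.1 worked.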
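
Coming back to the sm_30 message in item 2, the fallback it describes can be pictured as a table lookup with a default. This is a minimal sketch, not the code from nvptx_backend_lib.cc: the table entries are an illustrative subset I made up for the example, and SmTarget, KnownTargets and SelectSmTarget are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Result of the lookup: the sm_XX target plus whether the default was used.
struct SmTarget {
    std::string name;
    bool is_fallback;
};

// Illustrative table only: these entries are assumptions for this sketch,
// not copied from the TensorFlow source.
const std::map<std::pair<int, int>, std::string>& KnownTargets() {
    static const std::map<std::pair<int, int>, std::string> table = {
        {{3, 0}, "sm_30"}, {{3, 5}, "sm_35"}, {{5, 0}, "sm_50"},
        {{6, 0}, "sm_60"}, {{7, 0}, "sm_70"},
    };
    return table;
}

// Maps a compute capability (major, minor) to a target name. A capability
// missing from the table falls back to sm_30, as the log message reports.
SmTarget SelectSmTarget(int major_cc, int minor_cc) {
    assert(major_cc >= 0 && minor_cc >= 0);
    const auto& table = KnownTargets();
    const auto it = table.find(std::make_pair(major_cc, minor_cc));
    if (it != table.end()) return SmTarget{it->second, false};
    return SmTarget{"sm_30", true};
}
```

Under this picture, a GPU with compute capability 7.5 is not rejected; it is compiled for as if it were a much older architecture, which is why a release whose table includes 7.5 is worth trying.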
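3. Resource exhausted: OOM error
This probably just means that GPU memory ran out. Reducing the batch size fixed it, although I find it hard to believe that even 500 does not fit... Perhaps it is related to some limit in the pb file? I am not sure; I know neither the model nor the training process and only use the pb file directly.
4. tensorflow's output is a sequence of tensors, but in my experiments the size of outputs is 1 and the results for the whole batch are stored in the single tensor outputs[0]. I do not understand what the later entries of the output sequence would hold: does it depend on the graph one defines, or is it fixed by tensorflow itself?
5. I suddenly hit Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR once, after having run the test program many times without problems... Deleting the cache with rm -rf ~/.nv/ did resolve it, but how should this be handled when the program is deployed to users as a real product? The web also describes a way to make tensorflow run with a certain configuration, but I have not understood it yet and have not found a C++ example.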
