Welcome to my GitHub

https://github.com/zq2599/blog_demos

Contents: a categorized summary of all my original articles and their supporting source code, covering Java, Docker, Kubernetes, DevOps, and more.

Overview of this article

  • I have a 2015 Lenovo notebook with a GTX 950M graphics card, running Ubuntu 16.04 LTS desktop. To use its GPU for deeplearning4j training, I installed CUDA and cuDNN myself, and I am recording the whole process here for future reference. The installation is divided into the following steps:
  • Preparation
  • Install Nvidia driver
  • Install CUDA
  • Install cuDNN

Special problem description

  • Normally, once the Nvidia graphics driver is installed, it reports a corresponding CUDA version, and you then install that version of CUDA. In my case the driver reports 11.2, so under normal circumstances I should install CUDA 11.2
  • However, I chose to install version 9.1. In earlier development I found that when deeplearning4j used the 11.2 SDK, the application failed at startup with a ClassNotFound error, and I have not managed to fix this so far (shame, Xinchen's level is so low...). So I installed version 9.1 even though the driver reports 11.2, and deeplearning4j applications have run normally in this environment ever since
  • If you do not have this problem, you can install the CUDA version reported by the driver. The specific steps are described in detail below

Preparation

  • Apart from downloads done in a web browser, all of the following operations are performed over a remote SSH connection to the Ubuntu machine. The SSH login account is an ordinary account, not root
  • If a driver is already installed, remove it first (the pattern is quoted so that the shell does not expand it):
sudo apt-get remove --purge 'nvidia*'
  • Disable the nouveau driver (this is very important): open the file /etc/modprobe.d/blacklist.conf with vi, add the following content at the end, then save and exit:
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
  • Turn off kernel mode setting for nouveau:
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
  • Update the initramfs (root privileges are required, hence sudo):
sudo update-initramfs -u
  • Run sudo reboot to restart the computer
  • After restarting, run the following command; it should produce no output, which proves that nouveau has been disabled:
lsmod|grep nouveau
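The same check can be scripted so that it prints an explicit result instead of relying on empty output. This is only a minimal sketch: module_listed is a hypothetical helper name, and it assumes the usual lsmod layout with a header row and the module name in the first column.

```shell
# Succeeds (exit status 0) when the module named in $1 appears in the first
# column of lsmod-style output read from stdin; the header row is skipped.
module_listed() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

if command -v lsmod >/dev/null 2>&1; then
    if lsmod | module_listed nouveau; then
        echo "nouveau is still loaded: recheck the blacklist and the initramfs"
    else
        echo "nouveau is not loaded"
    fi
fi
```

Matching only the first column avoids a false positive from a module that merely lists nouveau in its "Used by" column.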
  • Install the kernel source:
sudo apt-get install linux-source
  • The information displayed during installation is as follows:

[Screenshot]

  • According to the information in the red box above, the kernel version number is 4.4.0-210, so execute the following command:
sudo apt-get install linux-headers-4.4.0-210-generic
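The header package name can also be derived from the running kernel instead of being read off the screen. A minimal sketch, assuming the running kernel is the one the driver will be built against; headers_package is a hypothetical helper name, and the install command is left commented out so that the snippet only prints what it would do.

```shell
# Print the headers package name for a kernel release string.
# With no argument, the release of the running kernel (uname -r) is used.
headers_package() {
    printf 'linux-headers-%s\n' "${1:-$(uname -r)}"
}

echo "Would install: $(headers_package)"
# sudo apt-get install "$(headers_package)"
```

On the machine in this article, uname -r should print 4.4.0-210-generic, which gives the same package name as the command above.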

Download and install Nvidia driver

[Screenshot]

  • After clicking the Search button in the figure above, you reach the page below; click to download:

[Screenshot]

  • The downloaded file is named NVIDIA-Linux-x86_64-460.84.run
  • Stop the graphical desktop:
sudo service lightdm stop
  • Make the driver file executable:
sudo chmod a+x NVIDIA-Linux-x86_64-460.84.run
  • Start the installation:
sudo ./NVIDIA-Linux-x86_64-460.84.run -no-x-check -no-nouveau-check -no-opengl-files
  • When you see the following screen, select the option in the red box:

[Screenshot]

  • When you see the following screen, just press Enter:

[Screenshot]

  • Start the graphical desktop again:
sudo service lightdm start
  • Run nvidia-smi; if the driver was installed successfully, output like the following is displayed:
will@lenovo:~/temp/202106/20$ nvidia-smi
Sun Jun 20 09:02:11 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.84       Driver Version: 460.84       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 950M    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   41C    P0    N/A /  N/A |      0MiB /  4046MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
  • In the output above, CUDA Version: 11.2 means that the CUDA version corresponding to this driver is 11.2. As mentioned earlier, I ran into a problem with that version, so version 9.1 will be installed next, but you can choose to install 11.2
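If the version needs to be picked out in a script, it can be extracted from the banner line. This is a rough sketch that assumes the banner keeps the "CUDA Version: X.Y" wording shown above; driver_cuda_version is a hypothetical helper name. As far as I know, this field reports the newest CUDA version the driver can support, which would explain why the older 9.1 toolkit still works with it.

```shell
# Read nvidia-smi output on stdin and print the number that follows
# "CUDA Version:" (first match only).
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

if command -v nvidia-smi >/dev/null 2>&1; then
    echo "Driver reports CUDA version: $(nvidia-smi | driver_cuda_version)"
fi
```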

Install CUDA

[Screenshot]

  • Choose the Linux version, as shown below:

[Screenshot]

  • Then select x86_64:

[Screenshot]

  • Select the specific Linux distribution and its version number:

[Screenshot]

  • There are four files to download: one installer and three patches:

[Screenshot]

  • The download addresses of these four files are as follows:
https://developer.download.nvidia.cn/compute/cuda/9.1/secure/Prod/local_installers/cuda_9.1.85_387.26_linux.run?P0Ntu_6NLtuuEMm6fJRk1W5vl4KM7oaT1oFW870zKJ-zDw2ckKntFLOE6klRJfw2CmTa8z3Q390_6urlgc6LqjoqlIFW9gvfvDCusnINYplLaw1u8lRY8R4oVNtpNzaXU4BQcHjvdb6c6rjq20dktCcRd4640woXt1yHmD95v1Du7wdBBXq2eOY

https://developer.download.nvidia.cn/compute/cuda/9.1/secure/Prod/patches/1/cuda_9.1.85.1_linux.run?yeXf_7wIGlHAUw--E_YVLQZRgXv0x2i043woJVY-ydXU5Kyhc-eYQf5JmL-4mvYmlvPYCEc5RhT2sDWscX20CJbdOwpkt30kWb9vx8E4oIlajDQ3MVPvXdiKKsIOBUx-h0q0N0jSkNn80VMhW-nk8jwvRY_e6MuFzqWBaPk

https://developer.download.nvidia.cn/compute/cuda/9.1/secure/Prod/patches/2/cuda_9.1.85.2_linux.run?5jGZxNigaOJkaaPbMagjhSW7ebQvYGyYoqe2vBxZ1eV8qp2BzXJLxIPgAo11UgWhORirQkdJGq5b8eFh4aShBVUTmuPaasvRiMCKDZw5yjjIobGQrCEyU-LFO59AbrRER57Mxa0T1Sc97fC80IOZq8Ox2repjn7A3oYVgd8

https://developer.download.nvidia.cn/compute/cuda/9.1/secure/Prod/patches/3/cuda_9.1.85.3_linux.run?CxWimJTC-XROYihig-UZmH62odbJInf1fmxTZ_bsW1nQ0Zz5cL5r8qLmlMR_1j2rVhk3j8Z5lS6dpArt8frjGHH2MeVn5TefMoclam8udm-RSMMmqHXYE66hHN2D0drVEdtCwe8ZrEIYb2rpucaz9svCFE8Z319mge4Ju94
  • After the downloads finish, run chmod a+x *.run to make the four files executable
  • Install CUDA:
sudo sh cuda_9.1.85_387.26_linux.run
  • When the license is displayed, type ":" and then "q" and press Enter, as you would in vi, to skip reading the license and proceed to the actual installation:

[Screenshot]

  • Next comes a series of questions; answer each one as shown in the figure below. Pay particular attention to the question in the red box, which must be answered n:

[Screenshot]

  • After the installation is complete, the following is output:
Installing the CUDA Toolkit in /usr/local/cuda-9.1 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
Missing recommended library: libGL.so

Installing the CUDA Samples in /home/will ...
Copying samples to /home/will/NVIDIA_CUDA-9.1_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-9.1
Samples:  Installed in /home/will, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-9.1/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-9.1/lib64, or, add /usr/local/cuda-9.1/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.1/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.1/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.1 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_13425.log
  • Open the file ~/.bashrc and add the following two lines at the end (if LD_LIBRARY_PATH is already set, append to it in the same way as the PATH line does):
export PATH=/usr/local/cuda-9.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64
  • Run source ~/.bashrc to make the configuration take effect
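The note about an existing LD_LIBRARY_PATH can be handled in one form that works whether or not the variable is already set. A minimal sketch, assuming the /usr/local/cuda-9.1 location reported by the installer; CUDA_HOME and prepend_dir are names introduced here for illustration only.

```shell
CUDA_HOME=/usr/local/cuda-9.1   # install location reported by the installer

# Print directory $1 followed by the current value $2 (which may be empty),
# joined by a colon only when $2 is non-empty.
prepend_dir() {
    printf '%s%s\n' "$1" "${2:+:$2}"
}

export PATH="$(prepend_dir "$CUDA_HOME/bin" "${PATH:-}")"
export LD_LIBRARY_PATH="$(prepend_dir "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")"
```

The point of the ${2:+:$2} expansion is that an empty variable does not leave a stray trailing colon, which the dynamic loader would otherwise treat as the current directory.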
  • Run su - to switch to the root account and execute the following command. Do not use sudo here: with sudo, the >> redirection would still be performed by the ordinary account's shell and would fail with a permission error:
echo "/usr/local/cuda-9.1/lib64" >> /etc/ld.so.conf
  • Then, still as root, execute the following command:
ldconfig
  • Run exit to leave the root session and return to the ordinary account
  • Run nvcc -V to check the CUDA version (note that the V is uppercase):
will@lenovo:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
  • Install the first patch:
sudo sh cuda_9.1.85.1_linux.run
  • Install the second patch:
sudo sh cuda_9.1.85.2_linux.run
  • Install the third patch:
sudo sh cuda_9.1.85.3_linux.run
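The installer output earlier in this section warns that CUDA 9.1 needs a driver of version 384.00 or later. The 460.84 driver installed above clearly satisfies that, but the comparison can also be scripted. This is a sketch that relies on GNU sort -V for version ordering; version_ge is a hypothetical helper name.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort -V (version sort), which is available on Ubuntu.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=384.00   # minimum driver version named in the installer warning
if command -v nvidia-smi >/dev/null 2>&1; then
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if version_ge "$installed" "$required"; then
        echo "Driver $installed meets the $required minimum"
    else
        echo "Driver $installed is older than $required"
    fi
fi
```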

Install cuDNN

[Screenshot]

  • Log in as prompted; if you do not have an account, register one. After logging in you reach the download page, where you need to click the red box in the figure below to see the older versions:

[Screenshot]

  • Choose the version that matches your CUDA version:

[Screenshot]

  • After downloading, extract the archive to get the folder cuda, and then execute the following commands:
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
  • To verify, run cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2; if the installation succeeded, the following output is displayed:
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 3
--
#define CUDNN_VERSION    (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
  • At this point, the installation of CUDA 9.1 and cuDNN on Ubuntu 16.04 is complete. I hope this serves as a useful reference.
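The three #define lines in the check above can be combined into a single version string. A minimal sketch, assuming the header layout shown in that output; cudnn_version is a hypothetical helper name. Newer cuDNN releases keep these defines in a separate cudnn_version.h, so the path may need adjusting there.

```shell
# Print MAJOR.MINOR.PATCHLEVEL from the #define lines of the header file $1.
# Fails with exit status 1 if CUDNN_MAJOR is not found.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major == "") exit 1; print major "." minor "." patch }' "$1"
}

header=/usr/local/cuda/include/cudnn.h
if [ -r "$header" ]; then
    echo "cuDNN $(cudnn_version "$header")"
fi
```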

You are not alone: Xinchen's original articles are with you all the way

  1. Java series
  2. Spring series
  3. Docker series
  4. Kubernetes series
  5. Database + middleware series
  6. DevOps series

Follow my WeChat public account: Programmer Xin Chen

Search for "Programmer Xin Chen" on WeChat. I am Xin Chen, and I look forward to traveling the Java world with you...