References: Detailed installation and configuration of CUDA and cuDNN on Ubuntu 20.04 (illustrated), 嵌入式技术's blog, CSDN
[Latest] Installing and uninstalling cuDNN with CUDA 11.7 on Ubuntu 20.04, weixin_54470372's blog, CSDN
Official NVIDIA CUDA Toolkit documentation: NVIDIA Documentation Center | NVIDIA Developer | NVIDIA CUDA Toolkit
Official NVIDIA cuDNN documentation: NVIDIA Documentation Center | NVIDIA Developer | NVIDIA cuDNN
1. Update the graphics card information (important; otherwise the card may be misidentified)
sudo update-pciids
2. Check whether the machine has an NVIDIA GPU
Before the update:
cgm@cgm:~/opencv-4.2.0/opencv-4.2.0/build$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2560 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)
Update command:
cgm@cgm:~/opencv-4.2.0/opencv-4.2.0/build$ sudo update-pciids
[sudo] password for cgm:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  283k  100  283k    0     0  27816      0  0:00:10  0:00:10 --:--:-- 68336
Done.
After the update (GeForce RTX 3060):
cgm@cgm:~/opencv-4.2.0/opencv-4.2.0/build$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GA106M [GeForce RTX 3060 Mobile / Max-Q] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GA106 High Definition Audio Controller (rev a1)
After the update the card model is identified correctly.
If the NVIDIA line ends in rev a1, the discrete GPU is running. If it ends in rev ff, the discrete GPU is powered off.
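The rev a1 / rev ff rule above can be checked mechanically. The following is a minimal sketch, assuming the rule as stated in this post; gpu_state is a hypothetical helper name, and the sample line is copied from the lspci output above.

```shell
# gpu_state is a hypothetical helper, not part of lspci or any NVIDIA tool.
# $1: text produced by `lspci`. Prints on, off, or no-nvidia-gpu.
gpu_state() {
    # Pick the first NVIDIA display controller line, if there is one.
    line=$(printf '%s\n' "$1" | awk 'tolower($0) ~ /nvidia/ && /VGA|3D controller/ { print; exit }')
    case "$line" in
        "")           echo "no-nvidia-gpu" ;;
        *"(rev ff)"*) echo "off" ;;
        *)            echo "on" ;;
    esac
}

# Sample line taken from the output above. On a real machine: gpu_state "$(lspci)"
sample='01:00.0 VGA compatible controller: NVIDIA Corporation GA106M [GeForce RTX 3060 Mobile / Max-Q] (rev a1)'
gpu_state "$sample"
```

For the sample line this prints on, matching the rev a1 case described above.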
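3. Install the NVIDIA graphics driver
Installing the NVIDIA driver on Ubuntu 20.04 is easy: open Settings -> Software & Updates -> Additional Drivers, select an NVIDIA driver, and click Apply Changes. The dialog automatically lists the NVIDIA drivers recommended for the GPU in the machine.
Once the NVIDIA driver is installed, run nvidia-smi in a terminal.
If it prints the driver status table, the installation succeeded. On my machine Driver Version shows the installed driver, 470.161.03, and CUDA Version shows the highest CUDA version this driver supports, 11.4.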
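The two fields can also be pulled out of the nvidia-smi header in a script. This is a sketch under assumptions: smi_field is a hypothetical helper, and the sample header line is reconstructed from the version numbers quoted above rather than copied from a real run.

```shell
# smi_field is a hypothetical helper. $1: field label, stdin: nvidia-smi output.
# Prints the first dotted number that follows "<label>:".
smi_field() {
    sed -n "s/.*$1: *\([0-9.][0-9.]*\).*/\1/p" | head -n 1
}

# Illustrative header line built from the versions mentioned in this post.
sample='| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |'
printf '%s\n' "$sample" | smi_field 'Driver Version'
printf '%s\n' "$sample" | smi_field 'CUDA Version'

# On a real machine:
#   nvidia-smi | smi_field 'CUDA Version'
# nvidia-smi can also report the driver version directly:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
```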
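Checking which driver versions can be installed (skip this if your driver already works)
The link below describes the problem I ran into after updating to the recommended driver. If you already have a working driver, I suggest not updating it.
Ubuntu fails to boot after a driver update, 楚歌again's blog, CSDN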
ubuntu-drivers devices
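Install the NVIDIA driver, choosing the version marked recommended in the ubuntu-drivers output:
sudo apt install nvidia-driver-525-open
reboot
Installing this recommended version had serious consequences on my machine.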
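Following the link above, I went back to the 470 driver and skipped the driver installation step from then on.
View the NVIDIA driver information: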
nvidia-smi
Configuring CUDA and cuDNN on Ubuntu 20.04 / Ubuntu 22.04, 心儿痒痒's blog, CSDN
Installing CUDA 11.0 and cuDNN on Ubuntu 20.04, 简书 (Jianshu)
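4. Disable the built-in Nouveau driver
Official instructions for disabling Nouveau: CUDA Installation Guide for Linux
Note: the nouveau graphics driver that ships with the system must be disabled before the NVIDIA driver is installed. Run lsmod | grep nouveau to check its state: any output means nouveau is loaded, and no output means it is already disabled.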
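If the check does print something, nouveau has to be blacklisted. The sketch below follows the approach in the CUDA Installation Guide for Ubuntu (a modprobe blacklist file, then regenerating the initramfs); the helper names nouveau_loaded and write_nouveau_blacklist are hypothetical, and the demonstration writes to a temporary file so that nothing on the system changes.

```shell
# nouveau_loaded and write_nouveau_blacklist are hypothetical helper names.

# Reads `lsmod` output on stdin; succeeds if a module named nouveau is listed.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Writes the two blacklist lines to the file given as $1.
write_nouveau_blacklist() {
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' > "$1"
}

# Demonstration on a temporary file.
tmp=$(mktemp)
write_nouveau_blacklist "$tmp"
cat "$tmp"
rm -f "$tmp"

# On a real system (needs root, and a reboot afterwards):
#   lsmod | nouveau_loaded && echo "nouveau is active"
#   write_nouveau_blacklist /etc/modprobe.d/blacklist-nouveau.conf   # run as root
#   sudo update-initramfs -u
```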
On my machine the command printed nothing, so nouveau is already disabled.
5. Install CUDA
5.1. Download and install CUDA
Official runfile installation instructions: CUDA Installation Guide for Linux
CUDA compatibility table:
Table 2. CUDA Toolkit and Minimum Required Driver Version for CUDA Minor Version Compatibility
CUDA Toolkit | Linux x86_64 Driver Version | Windows x86_64 Driver Version |
---|---|---|
CUDA 12.0.x | >=525.60.13 | >=527.41 |
CUDA 11.8.x | >=450.80.02 | >=452.39 |
CUDA 11.7.x | >=450.80.02 | >=452.39 |
CUDA 11.6.x | >=450.80.02 | >=452.39 |
CUDA 11.5.x | >=450.80.02 | >=452.39 |
CUDA 11.4.x | >=450.80.02 | >=452.39 |
CUDA 11.3.x | >=450.80.02 | >=452.39 |
CUDA 11.2.x | >=450.80.02 | >=452.39 |
CUDA 11.1 (11.1.0) | >=450.80.02 | >=452.39 |
CUDA 11.0 (11.0.3) | >=450.36.06** | >=451.22** |
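As the table shows, under minor-version compatibility CUDA 11.4.x needs a Linux x86_64 driver >= 450.80.02. The stricter figure of >= 470.82.01 sometimes quoted for CUDA 11.4 is not in this table; it appears to come from the release notes' separate list of the driver version shipped with each toolkit release. Either way, the 470.161.03 driver installed above is new enough.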
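Driver versions have to be compared component by component, not as plain strings (as strings, 470.161.03 would sort before 470.82.01). A minimal sketch using the version sort of GNU sort; version_ge is a hypothetical helper name, and the version numbers are the ones quoted in this post.

```shell
# version_ge is a hypothetical helper: succeeds if version $1 >= version $2.
# It relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed=470.161.03
for required in 450.80.02 470.82.01 525.60.13; do
    if version_ge "$installed" "$required"; then
        echo "driver $installed satisfies >= $required"
    else
        echo "driver $installed is too old for >= $required"
    fi
done
```

With these numbers the first two checks succeed and the third (the CUDA 12.0.x requirement) fails, which is consistent with the table.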
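This walkthrough installs CUDA 11.4.0 on Ubuntu 20.04. Open NVIDIA's CUDA download archive at https://developer.nvidia.com/cuda-toolkit-archive and click CUDA Toolkit 11.4.0.
On the next page select Linux → x86_64 → Ubuntu → 20.04 in turn. Three installer types are then offered; from experience I recommend runfile (local). CUDA depends on many libraries, and although the runfile is larger than the other two options, it bundles all of them, so it tends to install without trouble.
Before installing CUDA 11.4, install the libraries it depends on: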
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
sudo apt-get install libglfw3-dev
The commands for installing CUDA 11.4.0 on Ubuntu are:
wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda_11.4.0_470.42.01_linux.run
sudo sh cuda_11.4.0_470.42.01_linux.run
The file name cuda_11.4.0_470.42.01_linux.run reads as cuda_<CUDA version>_<driver version shipped with the installer>_<operating system>.run.
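The naming pattern of the runfile can be taken apart with plain shell parameter expansion. This is only an illustration of the pattern described above; parse_runfile is a hypothetical helper name.

```shell
# parse_runfile is a hypothetical helper. $1: runfile name, path, or URL.
parse_runfile() {
    name=${1##*/}      # drop any directory or URL prefix
    name=${name%.run}  # drop the .run extension
    # Split cuda_<cuda version>_<driver version>_<os> on underscores.
    IFS=_ read -r prefix cuda_ver driver_ver os <<EOF
$name
EOF
    echo "cuda=$cuda_ver driver=$driver_ver os=$os"
}

parse_runfile https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda_11.4.0_470.42.01_linux.run
```

For the URL used in this post the function prints cuda=11.4.0 driver=470.42.01 os=linux.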
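After running the second command, wait a moment until the installer screen appears. Select Continue, then type accept.
In the next screen, use Enter to deselect Driver and 470.42.01, because the driver is already installed. Then select Install and wait.
The installation summary is worth reading carefully: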
cgm@cgm:~$ sudo sh cuda_11.4.0_470.42.01_linux.run
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-11.4/
Samples: Installed in /home/cgm/
Please make sure that
 - PATH includes /usr/local/cuda-11.4/bin
 - LD_LIBRARY_PATH includes /usr/local/cuda-11.4/lib64, or, add /usr/local/cuda-11.4/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.4/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 470.00 is required for CUDA 11.4 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver
Logfile is /var/log/cuda-installer.log
In short: the driver was deliberately not selected, the toolkit was installed in /usr/local/cuda-11.4/, and the samples were installed in /home/cgm/. PATH must include /usr/local/cuda-11.4/bin and LD_LIBRARY_PATH must include /usr/local/cuda-11.4/lib64 (or that directory must be added to /etc/ld.so.conf, followed by ldconfig as root). The "Incomplete installation" warning only says that the installer did not install a driver; CUDA 11.4 needs a driver of at least version 470.00, which is already present here. The log file is /var/log/cuda-installer.log.
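Where things were installed
A CUDA installation has two parts: the NVIDIA CUDA GPU computing toolkit and the NVIDIA CUDA samples.
On Ubuntu 20.04 the toolkit is installed under /usr/local/ by default, which now contains two new entries, cuda and cuda-11.4.
The samples are in the home directory (/home/cgm/ on my machine).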
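5.2. Configure the CUDA environment variables
Official environment setup instructions: CUDA Installation Guide for Linux
After CUDA is installed, its environment variables must be configured before it can be used. First open .bashrc in an editor (sudo is not needed for a file in your own home directory):
gedit ~/.bashrc
Then append the following CUDA settings to the end of the file. Different articles add slightly different lines here, and I do not yet fully understand the differences, so I simply list the variants. Note that CUDA_HOME names a single directory, so it is assigned directly rather than appended to like the two search paths:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin
export CUDA_HOME=/usr/local/cuda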
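I use the three exports above.
Note: there are many ways to configure the CUDA environment variables. The paths used here deliberately refer to cuda without a version number, which makes it easy to switch between several CUDA versions installed on the same machine.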
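Version switching works because /usr/local/cuda is typically a symbolic link to one of the versioned directories, so repointing the link changes what the unversioned paths resolve to. A minimal sketch, assuming that layout; switch_cuda is a hypothetical helper, and the demonstration runs in a temporary directory instead of /usr/local.

```shell
# switch_cuda is a hypothetical helper.
# $1: installation prefix (e.g. /usr/local), $2: CUDA version (e.g. 11.4).
switch_cuda() {
    target="$1/cuda-$2"
    [ -d "$target" ] || { echo "not installed: $target" >&2; return 1; }
    # -n replaces an existing cuda symlink instead of descending into it.
    ln -sfn "$target" "$1/cuda"
}

# Demonstration with two fake installations in a temporary directory.
root=$(mktemp -d)
mkdir -p "$root/cuda-11.4" "$root/cuda-11.6"
switch_cuda "$root" 11.4 && readlink "$root/cuda"
switch_cuda "$root" 11.6 && readlink "$root/cuda"
rm -rf "$root"

# On a real system (needs root):
#   sudo ln -sfn /usr/local/cuda-11.4 /usr/local/cuda
```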
Some articles write it like this (note the version number):
export PATH=/usr/local/cuda-11.6/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Others write it like this (note the version number):
export PATH=/usr/local/cuda-11.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64:$LD_LIBRARY_PATH
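The variants differ in two ways: whether the CUDA directory is put before or after the existing value, and how an empty variable is handled. The ${VAR:+:${VAR}} form expands to a colon plus the old value only when the variable is set and non-empty, so it never leaves a stray colon (an empty entry in PATH or LD_LIBRARY_PATH is read as the current directory). The sketch below shows the difference; prepend_path is a hypothetical helper name.

```shell
# prepend_path is a hypothetical helper.
# $1: directory to put first, $2: current value of the variable (may be empty).
prepend_path() {
    # ${2:+:$2} adds ":<old value>" only when $2 is non-empty.
    printf '%s\n' "$1${2:+:$2}"
}

prepend_path /usr/local/cuda/lib64 ""                 # prints /usr/local/cuda/lib64
prepend_path /usr/local/cuda/lib64 "/opt/other/lib"   # prints /usr/local/cuda/lib64:/opt/other/lib

# The plain form leaves a leading colon when the variable starts out empty:
empty=""
printf '%s\n' "$empty:/usr/local/cuda/lib64"          # prints :/usr/local/cuda/lib64
```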
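Finally, run source ~/.bashrc in the terminal, or open a new terminal, for the changes to take effect. Running nvcc -V should then print the CUDA compiler version, which confirms that CUDA is installed.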
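To use the nvcc version in a script, the release number can be extracted from the nvcc -V output. A sketch under assumptions: nvcc_release is a hypothetical helper, and the sample line is an illustrative example of the release line, not output captured from this machine.

```shell
# nvcc_release is a hypothetical helper. stdin: output of `nvcc -V`.
# Prints the number that follows "release", e.g. 11.4.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Illustrative sample of the release line.
sample='Cuda compilation tools, release 11.4, V11.4.48'
printf '%s\n' "$sample" | nvcc_release

# On a real machine:
#   nvcc -V | nvcc_release
```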
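5.3. Test CUDA
To verify the installation, build the NVIDIA CUDA samples, which are in the home directory at
/home/cgm/NVIDIA_CUDA-11.4_Samples. Open a terminal in that folder, run make, and wait. Then go to the 1_Utilities/deviceQuery folder and run ./deviceQuery. A final line of Result = PASS means the installation works.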
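The pass/fail check can be automated by looking for the Result = PASS line. A small sketch; device_query_passed is a hypothetical helper name, and the demonstration feeds it a literal line instead of running the real binary.

```shell
# device_query_passed is a hypothetical helper.
# stdin: output of ./deviceQuery. Succeeds only if it contains "Result = PASS".
device_query_passed() {
    grep -q 'Result = PASS'
}

if printf 'Result = PASS\n' | device_query_passed; then
    echo "deviceQuery passed"
else
    echo "deviceQuery did not pass"
fi

# On a real machine, from the deviceQuery folder:
#   ./deviceQuery | device_query_passed && echo "CUDA is working"
```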
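A small hiccup: make failed with
VulkanBaseApp.cpp:30:10: fatal error: GLFW/glfw3.h: No such file or directory
30 | #include <GLFW/glfw3.h>
Installing the GLFW development package fixes it:
sudo apt-get install libglfw3-dev
Then run make again.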
6. Installing and testing cuDNN
6.1. Install cuDNN
Official cuDNN installation instructions: Installation Guide :: NVIDIA Deep Learning cuDNN Documentation
Download the cuDNN build that matches the installed CUDA from NVIDIA's cuDNN download page (registration required): https://developer.nvidia.com/rdp/cudnn-download. Choose the cuDNN v8.2.4 build for CUDA 11.4 on Ubuntu 20.04.
Extract the downloaded cudnn-11.4-linux-x64-v8.2.4.15.tgz; the archive contains a cuda folder. The command is:
tar -zxvf cudnn-11.4-linux-x64-v8.2.4.15.tgz
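Then open a terminal in the directory that contains the extracted cuda folder (cudnn-11.4-linux-x64-v8.2.4.15 on my machine) and run the two commands below. They
copy the contents of cuda/lib64 to /usr/local/cuda-11.4/lib64/ and
copy the contents of cuda/include to /usr/local/cuda-11.4/include/.
sudo cp cuda/lib64/* /usr/local/cuda-11.4/lib64/
sudo cp cuda/include/* /usr/local/cuda-11.4/include/
Once the copy is done, check the cuDNN version with: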
cat /usr/local/cuda-11.4/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
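If it prints the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL definitions (8, 2 and 4 for cuDNN 8.2.4), the files are in place.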
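The three defines can be joined into a single version string. A minimal sketch; cudnn_version is a hypothetical helper, and the demonstration uses a small stand-in header containing only the three lines of interest rather than the real cudnn_version.h.

```shell
# cudnn_version is a hypothetical helper. $1: path to cudnn_version.h.
# Prints MAJOR.MINOR.PATCHLEVEL taken from the #define lines.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}

# Stand-in header for the demonstration (values for cuDNN 8.2.4).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 2
#define CUDNN_PATCHLEVEL 4
EOF
cudnn_version "$tmp"
rm -f "$tmp"

# On a real machine:
#   cudnn_version /usr/local/cuda-11.4/include/cudnn_version.h
```

For the stand-in header this prints 8.2.4.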
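6.2. Test cuDNN
Download the three .deb packages used for testing (runtime library, developer library, and samples) from NVIDIA's cuDNN download page.
Install the three downloaded .deb files from a terminal:
sudo dpkg -i libcudnn8_8.2.4.15-1+cuda11.4_amd64.deb
sudo dpkg -i libcudnn8-dev_8.2.4.15-1+cuda11.4_amd64.deb
sudo dpkg -i libcudnn8-samples_8.2.4.15-1+cuda11.4_amd64.deb
Check what was installed: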
sudo dpkg -l | grep cudnn
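These three packages install the cuDNN samples into /usr/src/cudnn_samples_v8. Go into the mnistCUDNN folder there and run make clean && make. If the build succeeds and running ./mnistCUDNN ends with Test passed!, cuDNN is installed correctly.
On my machine make failed with: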
rm -rf *o
rm -rf mnistCUDNN
CUDA_VERSION is 11040
Linking agains cublasLt = true
CUDA VERSION: 11040
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70 72 75 80 86
/bin/sh: 1: cannot create test.c: Permission denied
/bin/sh: 1: cannot create test.c: Permission denied
g++: error: test.c: No such file or directory
g++: warning: ‘-x c’ after last input file has no effect
g++: fatal error: no input files
compilation terminated.
>>> WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly. <<<
[@] /usr/local/cuda/bin/nvcc -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o fp16_dev.o -c fp16_dev.cu
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -o fp16_emu.o -c fp16_emu.cpp
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -o mnistCUDNN.o -c mnistCUDNN.cpp
[@] /usr/local/cuda/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcublasLt -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm
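(1) Because of the warning
WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly.
install libfreeimage first: sudo apt-get install libfreeimage3 libfreeimage-dev
(2) Permission denied: /usr/src is owned by root, so prefix the commands with sudo (that is, sudo make). After that the build succeeds: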
sudo make clean && sudo make
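An alternative to building as root, and the approach the cuDNN installation guide describes, is to copy the samples to a writable location and build there. A minimal sketch; copy_samples is a hypothetical helper, and the demonstration uses temporary directories in place of /usr/src and $HOME.

```shell
# copy_samples is a hypothetical helper.
# $1: samples directory, $2: writable destination directory.
# Copies the tree and prints the path of the copy.
copy_samples() {
    cp -r "$1" "$2/" && printf '%s\n' "$2/$(basename "$1")"
}

# Demonstration with a fake samples tree in temporary directories.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/cudnn_samples_v8/mnistCUDNN"
copy_samples "$src/cudnn_samples_v8" "$dst"
ls "$dst/cudnn_samples_v8"
rm -rf "$src" "$dst"

# On a real machine (no sudo needed for the build):
#   cd "$(copy_samples /usr/src/cudnn_samples_v8 "$HOME")/mnistCUDNN"
#   make clean && make && ./mnistCUDNN
```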
cgm@cgm:/usr/src/cudnn_samples_v8/mnistCUDNN$ ./mnistCUDNN
Executing: mnistCUDNN
cudnnGetVersion() : 8204 , CUDNN_VERSION from cudnn.h : 8204 (8.2.4)
Host compiler version : GCC 9.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 30 Capabilities 8.6, SmClock 1425.0 Mhz, MemSize (Mb) 5921, MemClock 7001.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
Testing single precision
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.012288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.013184 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.049984 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.269312 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 1.657632 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 3.042144 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
[... a second round of both algorithm probes omitted ...]
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
[... algorithm probe output omitted ...]
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
Testing half precision (math in single precision)
[... file loading lines omitted ...]
Loading image data/one_28x28.pgm
Performing forward propagation ...
[... algorithm probe output omitted ...]
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation ...
[... algorithm probe output omitted ...]
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
7. Uninstalling CUDA
The installation summary included this line:
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.4/bin
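Change into /usr/local/cuda-11.4/bin (not the cuda directory), open a terminal there, and run sudo ./cuda-uninstaller.
In the screen that appears, press Enter to select all three items, then select Done. When the uninstaller finishes, the CUDA files listed above are removed.
Finally, run sudo rm -rf /usr/local/cuda-11.4 to remove what remains of CUDA 11.4 and cuDNN v8.2.4.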