Installing TensorFlow-gpu and PyTorch on Ubuntu 16

Installing Torch (CUDA 10.0)
Preface

Date of writing
Environment

GPU driver
CUDA
cuDNN
Installing tensorflow-gpu under conda
Testing
References

Installing Torch (CUDA 10.0)

Python 3.7

# CUDA 10.0
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch
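
Once the conda command finishes, it can be useful to confirm which PyTorch build actually landed in the environment. The following is a minimal sketch, not part of the original walkthrough: the helper name torch_cuda_report is hypothetical, and it only relies on torch.__version__, torch.version.cuda and torch.cuda.is_available().

```python
import importlib


def torch_cuda_report():
    """Describe the installed PyTorch build, or return None if PyTorch is absent."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None  # PyTorch is not installed in this environment
    return {
        "torch_version": torch.__version__,          # the command above pins 1.2.0
        "built_for_cuda": torch.version.cuda,        # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),  # True only if a usable GPU is found
    }


if __name__ == "__main__":
    print(torch_cuda_report())
```

With the versions pinned above, one would expect built_for_cuda to report 10.0; a value of None suggests a CPU-only build was installed instead.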

Preface

This article targets Ubuntu. For Windows, see: Installing tensorflow-gpu 1.11 + CUDA 9 + cuDNN 7 on Windows 10

Date of writing

2020-6-28

Environment

Y7000 + Ubuntu 16 + GTX 1050 Ti

GPU driver

[Screenshot]

CUDA

[Screenshot]

sudo sh cuda*

Note

The first component the installer offers is the GPU driver. Do not install it!

[Screenshot]

The installation result is as follows:

[Screenshot]

Set the CUDA environment variables
Edit ~/.zshrc as follows:

# CUDA 10.0 (2020-6-28)
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64
export PATH=$PATH:/usr/local/cuda-10.0/bin
export CUDA_HOME=/usr/local/cuda-10.0
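
After sourcing the shell configuration, a quick way to confirm that the new entries took effect is to inspect the environment from a script. This is a minimal sketch: the function name missing_cuda_entries is hypothetical, and the install prefix /usr/local/cuda-10.0 is taken from the exports above.

```python
import os

CUDA_ROOT = "/usr/local/cuda-10.0"  # install prefix used in this article


def missing_cuda_entries(env, cuda_root=CUDA_ROOT):
    """Return the names of variables that do not yet contain the CUDA directories."""
    expected = {
        "PATH": cuda_root + "/bin",
        "LD_LIBRARY_PATH": cuda_root + "/lib64",
    }
    missing = []
    for name, wanted in expected.items():
        # Split the colon-separated list and ignore trailing slashes.
        entries = [entry.rstrip("/") for entry in env.get(name, "").split(":")]
        if wanted not in entries:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("variables still missing CUDA entries:", missing_cuda_entries(os.environ))
```

An empty list means both PATH and LD_LIBRARY_PATH already reference the CUDA 10.0 directories in the current shell session.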

Run nvcc -V. If it prints output like the following, CUDA is installed successfully:

➜  ubuntucuda10.0 cat /usr/local/cuda-10.0/version.txt 
CUDA Version 10.0.130
➜  ubuntucuda10.0 sudo vim ~/.zshrc                                           
➜  ubuntucuda10.0 source ~/.zshrc       
➜  ubuntucuda10.0 nvcc -V   
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
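
If this check needs to be scripted, the release number can be pulled out of the nvcc -V text. The sketch below assumes the output format shown above; parse_nvcc_release is a hypothetical helper name, and the sample string is copied from the transcript.

```python
import re


def parse_nvcc_release(output):
    """Extract (release, build) from `nvcc -V` text, or None if no match is found."""
    match = re.search(r"release\s+(\d+\.\d+),\s*V(\d+(?:\.\d+)*)", output)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
"""
print(parse_nvcc_release(sample))  # ('10.0', '10.0.130')
```

On a real machine the sample string could be replaced by the captured output of the nvcc -V command, for example via subprocess.run.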

cuDNN

Install these three packages (libcudnn7, libcudnn7-dev, libcudnn7-doc):

➜  ubuntucuda10.0 sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.0_amd64.deb
Selecting previously unselected package libcudnn7.
(Reading database ... 369138 files and directories currently installed.)
Preparing to unpack libcudnn7_7.6.5.32-1+cuda10.0_amd64.deb ...
Unpacking libcudnn7 (7.6.5.32-1+cuda10.0) ...
Setting up libcudnn7 (7.6.5.32-1+cuda10.0) ...
Processing triggers for libc-bin (2.23-0ubuntu11) ...
➜  ubuntucuda10.0 sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.0_amd64.deb
Selecting previously unselected package libcudnn7-dev.
(Reading database ... 369144 files and directories currently installed.)
Preparing to unpack libcudnn7-dev_7.6.5.32-1+cuda10.0_amd64.deb ...
Unpacking libcudnn7-dev (7.6.5.32-1+cuda10.0) ...
Setting up libcudnn7-dev (7.6.5.32-1+cuda10.0) ...
update-alternatives: using /usr/include/x86_64-linux-gnu/cudnn_v7.h to provide /usr/include/cudnn.h (libcudnn) in auto mode
➜  ubuntucuda10.0 sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.0_amd64.deb
Selecting previously unselected package libcudnn7-doc.
(Reading database ... 369150 files and directories currently installed.)
Preparing to unpack libcudnn7-doc_7.6.5.32-1+cuda10.0_amd64.deb ...
Unpacking libcudnn7-doc (7.6.5.32-1+cuda10.0) ...
Setting up libcudnn7-doc (7.6.5.32-1+cuda10.0) ...

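Verify that cuDNN is installed correctly
Commands:

cd /usr/src/cudnn_samples_v7/mnistCUDNN
sudo make clean
sudo make
# In my case make failed because g++ was not installed, so I installed it.
# The missing piece may differ on your machine: install whatever the error asks for.
# Remove g++:
sudo apt-get remove g++
# Reinstall it, then run "sudo make" again:
sudo apt-get install g++
./mnistCUDNN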

The result is as follows:

➜  ubuntucuda10.0 cd /usr/src/cudnn_samples_v7/mnistCUDNN                 
➜  mnistCUDNN sudo make clean 
rm -rf *o
rm -rf mnistCUDNN
➜  mnistCUDNN sudo make      
Linking agains cublasLt = false
CUDA VERSION: 10000
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 30 35 50 53 60 61 62 70 72 75
/usr/local/cuda/bin/nvcc -ccbin g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o fp16_dev.o -c fp16_dev.cu
g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include   -o fp16_emu.o -c fp16_emu.cpp
g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include   -o mnistCUDNN.o -c mnistCUDNN.cpp
/usr/local/cuda/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include  -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm
➜  mnistCUDNN ./mnistCUDNN                               
cudnnGetVersion() : 7605 , CUDNN_VERSION from cudnn.h : 7605 (7.6.5)
Host compiler version : GCC 5.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms  6  Capabilities 6.1, SmClock 1620.0 Mhz, MemSize (Mb) 4040, MemClock 3504.0 Mhz, Ecc=0, boardGroupID=0
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.016384 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.026176 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.031232 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.091136 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.167936 time requiring 2057744 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.014336 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.026272 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.034688 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.081920 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.179200 time requiring 2057744 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!
➜  mnistCUDNN ls         
data          fp16_dev.h    fp16_emu.h  gemv.h      mnistCUDNN.cpp
error_util.h  fp16_dev.o    fp16_emu.o  Makefile    mnistCUDNN.o
fp16_dev.cu   fp16_emu.cpp  FreeImage   mnistCUDNN  readme.txt

If you see "Test passed!", cuDNN is installed successfully.
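
The sample also prints the cuDNN version as a single integer (7605 in the log above, shown next to 7.6.5). A small sketch of how that integer maps to a dotted version, assuming the cuDNN 7.x encoding of major*1000 + minor*100 + patch; decode_cudnn_version is a hypothetical helper name.

```python
def decode_cudnn_version(code):
    """Split a cuDNN 7.x version integer into (major, minor, patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


print(decode_cudnn_version(7605))  # (7, 6, 5)
```

This matches the installed package version 7.6.5.32, so the library picked up at runtime is the one that was just installed.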

Installing tensorflow-gpu under conda

Create a new conda environment

[Screenshot]

Install tensorflow-gpu 1.13

pip install tensorflow-gpu==1.13.1

The installation result is as follows:

[Screenshot]

Testing

[Screenshot]
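
A minimal sketch of one way to test the installation from Python, not necessarily the check the author ran. The helper name tensorflow_gpu_report is hypothetical. It uses tf.test.is_built_with_cuda() and tf.test.is_gpu_available(), which are available in TensorFlow 1.13, and prefers tf.config.list_physical_devices("GPU") when a newer TensorFlow provides it.

```python
import importlib


def tensorflow_gpu_report():
    """Describe the installed TensorFlow build, or return None if TensorFlow is absent."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return None  # TensorFlow is not installed in this environment

    # Newer releases expose tf.config.list_physical_devices; 1.13 does not.
    list_devices = getattr(getattr(tf, "config", None), "list_physical_devices", None)
    if list_devices is not None:
        gpu_available = bool(list_devices("GPU"))
    else:
        gpu_available = tf.test.is_gpu_available()

    return {
        "tf_version": tf.__version__,                    # 1.13.1 was installed above
        "built_with_cuda": tf.test.is_built_with_cuda(),  # False for CPU-only wheels
        "gpu_available": gpu_available,
    }


if __name__ == "__main__":
    print(tensorflow_gpu_report())
```

If gpu_available comes back False while built_with_cuda is True, the usual suspects are the CUDA environment variables or a cuDNN version that does not match the TensorFlow build.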