WSL2 Conda TensorFlow Troubleshooting Notes

Created
Apr 14, 2024 04:28 AM
Edited
Last updated May 25, 2024

1. Install the CUDA Toolkit

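If the NVIDIA driver is already installed on Windows 11, there is no need to install the Linux driver inside WSL. Install CUDA Toolkit 12.3 directly with the local installer. The installation may report this error:
    /sbin/ldconfig.real: /usr/lib/wsl/lib/libcuda.so.1 is not a symbolic link
To fix it, run the commands below in PowerShell (the two ln commands run inside the bash shell started by wsl -e /bin/bash), reboot the computer, and then run sudo ldconfig inside WSL.
    cd C:\Windows\System32\lxss\lib
    rm libcuda.so
    rm libcuda.so.1
    wsl -e /bin/bash
    ln -s libcuda.so.1.1 libcuda.so.1
    ln -s libcuda.so.1.1 libcuda.so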
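After the reboot, a quick way to confirm the fix took effect is to check that both names are now symbolic links. The following is a minimal sketch run from inside WSL; the helper name non_symlinks is hypothetical, and /usr/lib/wsl/lib is simply the directory named in the ldconfig error above.

```python
import os

def non_symlinks(directory, names=("libcuda.so", "libcuda.so.1")):
    """Return the names in `directory` that exist but are not symbolic links."""
    bad = []
    for name in names:
        path = os.path.join(directory, name)
        # A regular file here is what triggers the ldconfig complaint.
        if os.path.exists(path) and not os.path.islink(path):
            bad.append(name)
    return bad

if __name__ == "__main__":
    # Directory taken from the ldconfig error message.
    print(non_symlinks("/usr/lib/wsl/lib"))
```

An empty list means both entries are links (or absent); any name printed still needs to be replaced with a link.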
If nvcc -V fails, open ~/.bashrc (vim ~/.bashrc), add the following two lines, and reload the file with source ~/.bashrc.
    export PATH=/usr/local/cuda/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
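To check that the two exports are visible in the current shell, something like the sketch below can be run with python3. The function names are hypothetical; the two directories are the ones from the export lines above.

```python
import os
import shutil

CUDA_BIN = "/usr/local/cuda/bin"
CUDA_LIB = "/usr/local/cuda/lib64"

def path_contains(value, entry):
    """True if the colon-separated `value` contains `entry` (trailing slashes ignored)."""
    parts = [p.rstrip("/") for p in value.split(":") if p]
    return entry.rstrip("/") in parts

def check_cuda_env(env=os.environ):
    """Report whether the CUDA directories are on PATH and LD_LIBRARY_PATH."""
    path = env.get("PATH", "")
    return {
        "bin_on_PATH": path_contains(path, CUDA_BIN),
        "lib_on_LD_LIBRARY_PATH": path_contains(env.get("LD_LIBRARY_PATH", ""), CUDA_LIB),
        # Full path to nvcc if it can be found on PATH, otherwise None.
        "nvcc": shutil.which("nvcc", path=path),
    }

if __name__ == "__main__":
    print(check_cuda_env())
```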
To have the conda environment set the library path automatically on activation, run:
    mkdir -p $CONDA_PREFIX/etc/conda/activate.d
    echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
    echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh

2. Install cuDNN (older versions only)

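Run sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb and follow the instructions it prints after unpacking. Then run sudo apt-get update; its output will include a line like the one below. cd into that directory and run ls, and you will find three files whose names start with libcudnn8.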
Get:2 file:/var/cudnn-local-repo-ubuntu2204-8.9.7.29 InRelease [1572 B]
Next, run the following commands one at a time, replacing the ${} placeholders with the actual versions, to complete the installation.
    sudo apt-get install libcudnn8=${cudnn_version}-1+${cuda_version}
    sudo apt-get install libcudnn8-dev=${cudnn_version}-1+${cuda_version}
    sudo apt-get install libcudnn8-samples=${cudnn_version}-1+${cuda_version}
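To make the placeholder substitution concrete, the sketch below builds the three commands from a version pair. The helper name is hypothetical. The value 8.9.7.29 comes from the .deb file name above, while cuda12.2 is only an assumed example; take the real suffix from the libcudnn8 file names listed in the local repository directory.

```python
def cudnn_apt_commands(cudnn_version, cuda_version):
    """Build the three apt-get commands for a given cuDNN/CUDA version pair."""
    packages = ("libcudnn8", "libcudnn8-dev", "libcudnn8-samples")
    spec = f"{cudnn_version}-1+{cuda_version}"
    return [f"sudo apt-get install {pkg}={spec}" for pkg in packages]

if __name__ == "__main__":
    # "cuda12.2" is an assumption for illustration, not taken from the notes.
    for cmd in cudnn_apt_commands("8.9.7.29", "cuda12.2"):
        print(cmd)
```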

2. Install TensorFlow

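Do not run the install command pip install tensorflow[and-cuda] right away. It installs version 2.16 (which already bundles the CUDA/cuDNN libraries it needs), and that version has some known issues.
The workaround available so far is to create two files, one under each of the two paths below.
Start by opening the file in an editor:
    vim $CONDA_PREFIX/etc/conda/deactivate.d/env_vars.sh
You may need mkdir -p first to create the directory.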
First file: anaconda3/envs/<ENV_NAME>/etc/conda/activate.d/env_vars.sh
    #!/bin/sh
    export NVIDIA_DIR=$(dirname $(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)")))
    export LD_LIBRARY_PATH=$(echo ${NVIDIA_DIR}/*/lib/ | sed -r 's/\s+/:/g')${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Second file: anaconda3/envs/<ENV_NAME>/etc/conda/deactivate.d/env_vars.sh
    #!/bin/sh
    unset NVIDIA_DIR
    unset LD_LIBRARY_PATH
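For readers unsure what the LD_LIBRARY_PATH line in the activation script computes, here is a rough Python equivalent: it collects every <NVIDIA_DIR>/*/lib directory, joins them with colons, and puts them in front of any existing value. The function name is hypothetical, and unlike the shell version this sketch drops the trailing slash on each directory.

```python
import glob
import os

def build_ld_library_path(nvidia_dir, existing=""):
    """Join every <nvidia_dir>/*/lib directory with ':' and prepend it to `existing`."""
    lib_dirs = sorted(
        d for d in glob.glob(os.path.join(nvidia_dir, "*", "lib")) if os.path.isdir(d)
    )
    # Mirrors ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}: append only if non-empty.
    parts = lib_dirs + ([existing] if existing else [])
    return ":".join(parts)
```

Inside the activated environment, nvidia_dir would be the directory two levels above nvidia.cudnn.__file__, as in the script.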
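Once both files are in place, run the install command. Normally the check below then prints the GPU information (reactivating the environment first is likely needed so that the activation script runs).
    python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"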

3. Log messages

  1. oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
Set the environment variable before importing the module.
    import os
    os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
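  2. This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
This can be ignored, or you can build TensorFlow from source. Building takes a long time and is only worthwhile for compute-heavy workloads. To hide the message, set the log level before importing TensorFlow:
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'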
  3. W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Install TensorRT if you need it; otherwise ignore the warning. To suppress it, change the TF_CPP_MIN_LOG_LEVEL value above to 2.
  4. Your kernel may have been built without NUMA support.
Run the following before each session:
    CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib
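The TF_CPP_MIN_LOG_LEVEL values used for the messages above follow a simple scale, summarized in the sketch below. The helper name set_tf_log_level is hypothetical; the variable has to be set before TensorFlow is imported to take effect.

```python
import os

# Meaning of each TF_CPP_MIN_LOG_LEVEL value.
TF_LOG_LEVELS = {
    0: "show all messages (default)",
    1: "hide INFO",
    2: "hide INFO and WARNING",
    3: "hide INFO, WARNING and ERROR",
}

def set_tf_log_level(level, env=os.environ):
    """Set TF_CPP_MIN_LOG_LEVEL in `env` and return a description of its effect."""
    if level not in TF_LOG_LEVELS:
        raise ValueError(f"level must be one of {sorted(TF_LOG_LEVELS)}")
    env["TF_CPP_MIN_LOG_LEVEL"] = str(level)
    return TF_LOG_LEVELS[level]
```

Calling set_tf_log_level(2) before import tensorflow hides both the CPU-instruction notice (INFO) and the TF-TRT warning.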
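In a notebook, run the following in the very first cell:
    import os
    os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
    # Note: each ! line runs in its own subshell, so the two lines below do not
    # change the environment of the running kernel; if the libraries are still
    # not found, export LD_LIBRARY_PATH in the shell before starting Jupyter.
    !CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))
    !export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib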
PennyLane does not support Keras 3; set the environment variable TF_USE_LEGACY_KERAS to 1.
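A minimal sketch of setting that variable from Python is shown below. It has to run before the first import of tensorflow; my understanding is that the legacy path also expects the tf-keras package to be installed, which is an assumption not stated in these notes.

```python
import os
import sys

# Warn if TensorFlow was imported earlier in this process, because the
# variable is only read when TensorFlow is first imported.
if "tensorflow" in sys.modules:
    print("tensorflow is already imported; restart and set the variable first")

os.environ["TF_USE_LEGACY_KERAS"] = "1"

# import tensorflow as tf  # import only after the variable is set
```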

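Changing the Jupyter notebook directory
Run jupyter-notebook --generate-config, then look for c.ServerApp.notebook_dir = '' in the generated config file, set it to the desired directory and save. If the line is not there, add it.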
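The edit described above (replace the line if present, append it otherwise) can be sketched as a small text transformation. The function name set_notebook_dir is hypothetical, and the sketch treats a commented-out default line as the one to replace.

```python
import re

def set_notebook_dir(config_text, directory):
    """Return `config_text` with c.ServerApp.notebook_dir set to `directory`."""
    new_line = f"c.ServerApp.notebook_dir = {directory!r}"
    # Matches the setting whether or not it is commented out.
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*c\.ServerApp\.notebook_dir[ \t]*=.*$", re.MULTILINE
    )
    if pattern.search(config_text):
        # A function is used as the replacement so the path is inserted literally.
        return pattern.sub(lambda m: new_line, config_text, count=1)
    sep = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return config_text + sep + new_line + "\n"
```

To apply it, read the config file whose path --generate-config prints, pass its contents through the function, and write the result back.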