In this tutorial, we will be building OpenCV from source with CUDA backend support (OpenCV-DNN-CUDA module).
IMPORTANT: The OpenCV-DNN-CUDA module only supports inference. You will get much faster inference out of it, but training will be no faster than with the OpenCV we set up without CUDA backend support.
STEPS
1. Install CUDA & cuDNN.
2. Install CMake GUI.
3. Install Anaconda Individual Edition.
4. Install Visual Studio with the “Desktop development with C++” module.
5. Create a virtual environment in Anaconda.
6. Build OpenCV from source for your NVIDIA GPU.
STEP 1) Install CUDA & cuDNN
Download the latest version of CUDA for your system and its corresponding cuDNN archive. The latest CUDA Toolkit is available at https://developer.nvidia.com/cuda-downloads . Previous versions are listed in the Archive of Previous CUDA Releases, which is also linked under the Resources section of that page. The latest version of cuDNN is available at https://developer.nvidia.com/cudnn , and previous versions are in the cuDNN archive at https://developer.nvidia.com/rdp/cudnn-archive .
Install CUDA by running the CUDA exe file. Next, extract the cuDNN archive and copy the cuDNN folder and its contents to where we installed CUDA. Usually, that is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.x. The installer automatically adds this path to the environment variables as CUDA_PATH. We still need to add a few paths to the Path variable so that CMake GUI can find cuDNN. Read about the complete CUDA & cuDNN installation here.
Note: As mentioned on the CUDA & cuDNN setup blog above, you have to add the same paths for the Path system variable as you did for the CUDNN system variable. After adding these paths, reboot the system. This is just to ensure that the CMake GUI software which we will be using next is able to find cuDNN on our system.
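After copying the cuDNN files, a short Python sketch like the one below can confirm that they actually sit under the CUDA folder. It assumes the cuDNN files were copied into the include, lib and bin subfolders and that their names start with "cudnn"; the helper name find_cudnn_files is made up for this example, and exact file names vary between cuDNN versions.

```python
import os
from pathlib import Path


def find_cudnn_files(cuda_root):
    """Return cuDNN-looking file names under a CUDA install folder, per subfolder.

    Assumption: cuDNN was copied into the include, lib and bin subfolders of
    the CUDA folder and its files start with "cudnn".
    """
    root = Path(cuda_root)
    found = {}
    for sub in ("include", "lib", "bin"):
        folder = root / sub
        if folder.is_dir():
            # rglob also looks into nested folders such as lib/x64
            names = sorted(p.name for p in folder.rglob("cudnn*") if p.is_file())
        else:
            names = []
        found[sub] = names
    return found


if __name__ == "__main__":
    cuda_path = os.environ.get("CUDA_PATH")
    if cuda_path is None:
        print("CUDA_PATH is not set")
    else:
        for sub, names in find_cudnn_files(cuda_path).items():
            print(sub, "->", names or "no cuDNN files found")
```

An empty list for any of the three subfolders suggests the cuDNN copy step did not land where CMake will look for it.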
STEP 2) Install CMake GUI
Next, we need to install CMake GUI for Windows. Download it from https://cmake.org/download/
Select the installer for your system (mine is 64-bit Windows). Download the setup, run it, and choose the option to add CMake to your PATH during installation.
STEP 3) Install Anaconda Individual Edition For Windows
- Download and run the Anaconda installer from here.
- Select the “Just Me” option while installing Anaconda to avoid any admin privilege issues.
- Select the option to make Anaconda Python your default and also select the other option to add Anaconda Python to your PATH variable. This ensures that Anaconda Python is found first on the system. You can always change this later by going to your environment variables and moving the Python you want as default above the Anaconda Python entry.
STEP 4) Install MSVC (Microsoft Visual Studio)
Go to https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=Community. This downloads the Visual Studio Community edition setup exe file. Run it, select the “Desktop development with C++” package as shown below, and click on Install. If you already have another edition of Visual Studio, such as Professional or Enterprise, make sure this package is installed.
STEP 5) Create a virtual environment in Anaconda
NOTE: I am doing this step as I am building OpenCV for a virtual environment. If you want to build for your base environment you can skip this step and move on to Step 6.
● First, open Anaconda prompt and run the following command to create a new Anaconda environment.
conda create -n env_NAME python=3.9
(Where env_NAME is the name of the environment you want to create. I am setting python=3.9)
- To Activate the environment use the following command:
conda activate env_NAME
- Install NumPy for this environment
pip install numpy
#OR
pip install --upgrade numpy
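Before moving on, it can help to confirm that the Python you are running really belongs to the new environment. The sketch below is one way to do that; it assumes named environments live in an "envs" folder (as in C:\anaconda3\envs\env_NAME), and the helper active_env_name is a hypothetical name for this example.

```python
import sys
from pathlib import PureWindowsPath


def active_env_name(prefix):
    """Return the conda environment name for an interpreter prefix.

    Assumption: named environments live in an "envs" folder; any other
    prefix is reported as "base".
    """
    path = PureWindowsPath(prefix)
    if path.parent.name.lower() == "envs":
        return path.name
    return "base"


print("Interpreter prefix:", sys.prefix)
print("Environment:", active_env_name(sys.prefix))

# NumPy must be importable from this environment before OpenCV is built
try:
    import numpy
    print("NumPy:", numpy.__version__)
except ImportError:
    print("NumPy is not installed in this environment")
```

Run it from the Anaconda prompt after activating the environment; the reported name should match the env_NAME you chose.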
STEP 6) Lastly, build OpenCV from source with CUDA backend support for your specific NVIDIA GPU.
This is the main step for setting up the OpenCV-DNN-CUDA module, i.e. building OpenCV from source with CUDA backend support. This gives us faster inference for object detection models. We will download the OpenCV sources and configure the build here.
IMPORTANT: Before we begin, if you are creating this for your base environment, make sure to install NumPy using pip install numpy . Else, if you are doing this for a virtual environment, we already installed numpy in step 5.
● 6) a) First, download OpenCV and opencv_contrib.
Download the latest OpenCV source files from either the official OpenCV releases page or the GitHub page. Next, download the opencv_contrib version that exactly matches the OpenCV sources you downloaded. Use the following links to download both. (The latest OpenCV version as of now is 4.5.5)
⇒ https://opencv.org/releases/
⇒ https://github.com/opencv/opencv_contrib
Note: Use the same version for both opencv & opencv_contrib. To select a specific version click on the master dropdown on the top left, click on Tags, and select the version.
● 6) b) Create a new folder in your C drive named opencv. Extract and copy the folders from the previous step to the C:\opencv folder.
● 6) c) Create another folder named build inside C:\opencv
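Since a version mismatch between opencv and opencv_contrib is an easy mistake to make, a small check on the extracted folder names can catch it early. This is only a sketch: it assumes the folders keep their default names with a version suffix (for example opencv-4.5.5 and opencv_contrib-4.5.5), and the function names are made up for illustration.

```python
import re


def source_version(folder_name):
    """Extract a version such as '4.5.5' from a folder name like 'opencv-4.5.5'."""
    match = re.search(r"-(\d+(?:\.\d+)+)$", folder_name)
    return match.group(1) if match else None


def versions_match(opencv_folder, contrib_folder):
    """True when both folder names carry the same numeric version suffix."""
    opencv_version = source_version(opencv_folder)
    contrib_version = source_version(contrib_folder)
    return opencv_version is not None and opencv_version == contrib_version


print(versions_match("opencv-4.5.5", "opencv_contrib-4.5.5"))  # True
print(versions_match("opencv-4.5.5", "opencv_contrib-4.x"))    # False
```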
NOTE: Before running CMake, you need to have CUDA, cuDNN installed and configured on your system. Refer to this blog to learn how to install and set up CUDA and CUDNN on Windows.
● 6) d) Open CMake GUI. Select the source as C:\opencv\opencv-4.5.5 and select the destination for building the binaries as C:\opencv\build.
● 6) e) Check the Grouped entries box and click on Configure. Set the “generator for this project” as your Visual Studio version. I have 2017 so I will set Visual Studio 15 2017. And lastly, choose the “Optional platform for generator” as x64.
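If you have a different Visual Studio version, the generator name changes with it, and so does the vcNN folder that shows up in the install paths in Step 6) p). The lookup below is a small sketch of that relationship. The generator names are the ones CMake lists; the runtime folder names for 2019 and 2022 are my assumption based on the vc15 pattern, so check your own build\install folder.

```python
# Visual Studio year -> (CMake generator name, OpenCV runtime folder).
# The runtime folder names other than vc15 are an assumption, not taken
# from this tutorial.
VS_TOOLCHAINS = {
    2017: ("Visual Studio 15 2017", "vc15"),
    2019: ("Visual Studio 16 2019", "vc16"),
    2022: ("Visual Studio 17 2022", "vc17"),
}


def toolchain_for(year):
    """Return (generator, runtime_folder) for a Visual Studio release year."""
    try:
        return VS_TOOLCHAINS[year]
    except KeyError:
        raise ValueError(f"no entry for Visual Studio {year}") from None


generator, runtime = toolchain_for(2017)
print(generator)  # Visual Studio 15 2017
print(runtime)    # vc15
```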
● 6) f) Once the first configuration is done, select the following modules by typing them in the search box and ticking their boxes.
WITH_CUDA
OPENCV_DNN_CUDA
ENABLE_FAST_MATH
BUILD_opencv_world
BUILD_opencv_dnn
BUILD_opencv_python3
(Some of these are usually selected by default but double-check them as these are important)
● 6) g) Next type the following and set the path to the modules folder inside the opencv_contrib folder we extracted in Step 6)b).
OPENCV_EXTRA_MODULES_PATH
Give the path→ C:\opencv\opencv_contrib-4.5.5\modules
● 6) h) Next, do this step only if you are building OpenCV for the virtual environment we created in Step 5. If not you can skip to Step 6) i). I am doing this as I am installing OpenCV in my virtual environment.
Type PYTHON3 in the search box. You will see the following python3 path entries. If Anaconda is your default Python, they point to the libraries in the base Anaconda environment; otherwise they point to the libraries of the Windows Python installation that is your default.
We need to set them to point to the corresponding libraries in the virtual environment. (Note: This is for the virtual environment env_NAME we created in Anaconda prompt in Step 5. Also, my Anaconda location here is C:\anaconda3 . Yours may be different, e.g. C:\ProgramData\anaconda3.)
PYTHON3_EXECUTABLE
Change ⇒ C:\anaconda3\python.exe →
to ⇒ C:\anaconda3\envs\env_NAME\python.exe
PYTHON3_INCLUDE_DIR
Change ⇒ C:\anaconda3\include →
to ⇒ C:\anaconda3\envs\env_NAME\include
PYTHON3_LIBRARY
Change ⇒ C:\anaconda3\libs\python39.lib →
to ⇒ C:\anaconda3\envs\env_NAME\libs\python39.lib
PYTHON3_NUMPY_INCLUDE_DIRS
Change ⇒ C:/anaconda3/Lib/site-packages/numpy/core/include →
to ⇒ C:/anaconda3/envs/env_NAME/Lib/site-packages/numpy/core/include
PYTHON3_PACKAGES_PATH
Change ⇒ C:/anaconda3/Lib/site-packages →
to ⇒ C:/anaconda3/envs/env_NAME/Lib/site-packages
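All five entries follow the same pattern, so they can be derived from the environment folder instead of being typed by hand. The sketch below does that; it assumes the standard Windows layout of a conda environment shown above, and python3_cmake_entries is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def python3_cmake_entries(env_prefix, version="39"):
    """Build the PYTHON3_* values for a conda environment on Windows.

    Assumption: the environment uses the layout shown in this step, and
    `version` is the two-digit tag in the library name (python39.lib).
    """
    root = PureWindowsPath(env_prefix)
    site_packages = root / "Lib" / "site-packages"
    entries = {
        "PYTHON3_EXECUTABLE": root / "python.exe",
        "PYTHON3_INCLUDE_DIR": root / "include",
        "PYTHON3_LIBRARY": root / "libs" / f"python{version}.lib",
        "PYTHON3_NUMPY_INCLUDE_DIRS": site_packages / "numpy" / "core" / "include",
        "PYTHON3_PACKAGES_PATH": site_packages,
    }
    # Forward slashes are accepted by CMake and avoid escaping issues
    return {name: path.as_posix() for name, path in entries.items()}


for name, value in python3_cmake_entries(r"C:\anaconda3\envs\env_NAME").items():
    print(name, "=", value)
```

The NumPy include folder is the entry most likely to differ between NumPy versions. Running numpy.get_include() inside the activated environment reports the actual location, so it is worth comparing against the value above.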
● 6) i) Click on Configure again.
● 6) j) Once the second configuration is done, you will see your CUDA and cuDNN in the output. Now, select the following by typing it in the search box and ticking its box.
CUDA_FAST_MATH
● 6) k) Next, set your GPU’s compute capability as the CUDA_ARCH_BIN. (You can check your GPU’s compute capability at https://developer.nvidia.com/cuda-gpus . Go to this link and select your GPU section. For example, I have a GeForce GPU, so I select the section “CUDA-Enabled GeForce and TITAN products”. This gives a dropdown of all the GPUs with their compute capabilities. Mine is 6.1)
CUDA_ARCH_BIN 6.1
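To see why this value matters: the compute capability decides which GPU architecture the CUDA compiler generates binary code for. The sketch below shows roughly how a value like 6.1 turns into nvcc -gencode flags. It is an illustration only; the exact flags produced by OpenCV's CMake scripts may differ, and gencode_flags is a made-up helper.

```python
import re


def gencode_flags(cuda_arch_bin):
    """Translate a CUDA_ARCH_BIN value such as '6.1' into nvcc -gencode flags.

    Illustration only: this mirrors the general nvcc convention, not the
    exact output of OpenCV's build scripts.
    """
    flags = []
    for capability in re.findall(r"\d+\.\d+", cuda_arch_bin):
        digits = capability.replace(".", "")
        flags.append(f"-gencode arch=compute_{digits},code=sm_{digits}")
    return flags


print(gencode_flags("6.1"))
# ['-gencode arch=compute_61,code=sm_61']
```

A binary built only for sm_61 targets GPUs of that compute capability, which is why the value has to match your own card.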
● 6) l) Find CMAKE_CONFIGURATION_TYPES. Remove the Debug option and only set the Release option.
CMAKE_CONFIGURATION_TYPES Release
● 6) m) Configure one last time.
● 6) n) After the third and last configuration is finished, verify all the outputs and finally click on Generate. This generates the project with these settings which we can use to build our OpenCV.
● 6) o) Next, once the generation is done, open a command prompt in the C:\opencv directory and run the following command.
"C:\Program Files\CMake\bin\cmake.exe" --build "C:\opencv\build" --target INSTALL --config Release
This will build the OpenCV-DNN module with CUDA backend. This process can take up to an hour or so.
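For reference, the GUI settings from Steps 6) d) to 6) n) and the build command above can also be written as two cmake command lines. The sketch below only assembles and prints them; it does not run anything. It assumes a CMake version that supports the -S and -B options (3.13 or newer), and the function names are hypothetical. The PYTHON3_* entries from Step 6) h) are left out for brevity.

```python
import subprocess


def configure_command(source_dir, build_dir, contrib_modules, arch_bin,
                      generator="Visual Studio 15 2017", platform="x64"):
    """Assemble a cmake configure command mirroring the GUI choices in Step 6."""
    options = {
        "WITH_CUDA": "ON",
        "OPENCV_DNN_CUDA": "ON",
        "ENABLE_FAST_MATH": "ON",
        "CUDA_FAST_MATH": "ON",
        "BUILD_opencv_world": "ON",
        "BUILD_opencv_dnn": "ON",
        "BUILD_opencv_python3": "ON",
        "OPENCV_EXTRA_MODULES_PATH": contrib_modules,
        "CUDA_ARCH_BIN": arch_bin,
        "CMAKE_CONFIGURATION_TYPES": "Release",
    }
    command = ["cmake", "-S", source_dir, "-B", build_dir,
               "-G", generator, "-A", platform]
    command += [f"-D{name}={value}" for name, value in options.items()]
    return command


def build_command(build_dir):
    """Assemble the build-and-install command used in Step 6) o)."""
    return ["cmake", "--build", build_dir, "--target", "INSTALL",
            "--config", "Release"]


configure = configure_command(
    r"C:\opencv\opencv-4.5.5",
    r"C:\opencv\build",
    r"C:\opencv\opencv_contrib-4.5.5\modules",
    "6.1",
)
# list2cmdline quotes arguments containing spaces the way cmd.exe expects
print(subprocess.list2cmdline(configure))
print(subprocess.list2cmdline(build_command(r"C:\opencv\build")))
```

Passing either list to subprocess.run would execute it, but going through the GUI first is the safer way to spot a missing CUDA or cuDNN path in the configure output.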
● 6) p) Lastly, set the following paths in your environment variables and reboot your system.
Set a new variable OpenCV_DIR in System Variables to point to the following three. →
- ‣ C:\opencv\build\install\x64\vc15\bin
- ‣ C:\opencv\build\install\x64\vc15\lib
- ‣ C:\opencv\build
Also, add the following path to the Path variable in System Variables. →
- ‣ C:\opencv\build\install\x64\vc15\bin
After adding the paths reboot your system.
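After the reboot, you can confirm that the bin folder actually made it into the Path variable with a small script. This is a sketch under the assumption that Path entries are separated by semicolons, as on Windows; is_on_path is a made-up helper name.

```python
import os
from pathlib import PureWindowsPath


def is_on_path(directory, path_value):
    """Check whether `directory` is one of the entries of a Windows Path value.

    PureWindowsPath comparison ignores letter case and trailing slashes.
    """
    wanted = PureWindowsPath(directory)
    for entry in path_value.split(";"):
        entry = entry.strip().strip('"')
        if entry and PureWindowsPath(entry) == wanted:
            return True
    return False


bin_dir = r"C:\opencv\build\install\x64\vc15\bin"
print("OpenCV bin on Path:", is_on_path(bin_dir, os.environ.get("PATH", "")))
print("OpenCV_DIR:", os.environ.get("OpenCV_DIR", "not set"))
```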
That’s it! We have successfully built the OpenCV-DNN module with CUDA backend support.
Check if cv2 is installed on your system. Run the following commands in Anaconda prompt (opencv_dnn_cuda is the name of my environment; use your own env_NAME instead):
conda activate opencv_dnn_cuda
python
Python 3.9.12 (main, Apr 4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'4.5.5'
>>>
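The version number alone does not tell you whether CUDA made it into the build. One way to check is to look at the text returned by cv2.getBuildInformation(). The sketch below pulls out the CUDA-related lines; the exact labels it searches for ("NVIDIA CUDA", "NVIDIA GPU arch", "cuDNN") are my assumption about that report's layout, and cuda_build_summary is a hypothetical helper.

```python
def cuda_build_summary(build_info):
    """Pick the CUDA-related entries out of OpenCV's build information text.

    Assumption: the report contains lines of the form "  <label>:  <value>"
    with the labels listed below.
    """
    wanted = ("NVIDIA CUDA", "NVIDIA GPU arch", "cuDNN")
    summary = {}
    for line in build_info.splitlines():
        label, separator, value = line.partition(":")
        if separator and label.strip() in wanted:
            summary[label.strip()] = value.strip()
    return summary


try:
    import cv2
except ImportError:
    print("cv2 is not importable in this environment")
else:
    print(cuda_build_summary(cv2.getBuildInformation()))
    # Number of CUDA devices OpenCV can use (0 when built without CUDA)
    print("CUDA devices:", cv2.cuda.getCudaEnabledDeviceCount())
```

An empty dictionary or a device count of 0 would suggest that the cv2 being imported is not the CUDA build from Step 6.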
Check that OpenCV was built correctly with CUDA backend support using the test_DNN_CV.py script below. The GPU output should be much faster than the CPU output.
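import numpy as np
import cv2 as cv
import time

# 1024x1024 float32 input with two channels, which gemm treats as a complex matrix
npTmp = np.random.random((1024, 1024)).astype(np.float32)
npMat1 = np.stack([npTmp, npTmp], axis=2)
npMat2 = npMat1

# Upload both operands to GPU memory
cuMat1 = cv.cuda_GpuMat()
cuMat2 = cv.cuda_GpuMat()
cuMat1.upload(npMat1)
cuMat2.upload(npMat2)

# Time the matrix multiplication on the GPU (the last argument, 1, is the flag cv.GEMM_1_T)
start_time = time.time()
cv.cuda.gemm(cuMat1, cuMat2, 1, None, 0, None, 1)
print("CUDA using GPU --- %s seconds ---" % (time.time() - start_time))

# Time the same multiplication on the CPU
start_time = time.time()
cv.gemm(npMat1, npMat2, 1, None, 0, None, 1)
print("CPU --- %s seconds ---" % (time.time() - start_time))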
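One caveat when reading these timings: the first CUDA call in a process also pays for GPU initialization, so a single measurement can make the GPU look slower than it is. A common remedy is to run the operation once as a warm-up and then average several runs. The helper below is a minimal sketch of that idea; time_call is a made-up name, and the workload is a stand-in so the snippet runs without a GPU.

```python
import time


def time_call(fn, warmup=1, repeats=5):
    """Return the average duration of fn() in seconds after warm-up runs."""
    for _ in range(warmup):
        fn()  # not timed: absorbs one-time setup costs
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


# Stand-in workload so the sketch runs anywhere. With the script above you
# would pass: lambda: cv.cuda.gemm(cuMat1, cuMat2, 1, None, 0, None, 1)
average = time_call(lambda: sum(range(100_000)))
print("average seconds per call:", average)
```

Timing both the cv.cuda.gemm and cv.gemm calls this way gives a fairer GPU versus CPU comparison than a single run of each.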