Monday, June 27, 2016

openCV 3.1.0 optimized for Raspberry Pi, with libjpeg-turbo 1.5.0 and NEON SIMD support

This is a small log for myself on building openCV 3.1.0 on a Raspberry Pi 2. It should work on a Raspberry Pi 3 too (but not on an RPi 1, whose CPU does not support NEON).
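
Before starting, it may be worth confirming that the board's CPU actually reports NEON. The following is a minimal sketch, assuming a Linux kernel that lists CPU features on a "Features" line in /proc/cpuinfo (as 32-bit ARM kernels do). The has_neon helper is a name made up for this illustration, not part of any tool:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a cpuinfo-style file lists "neon"
# on its "Features" line.
# $1 = file to inspect (defaults to /proc/cpuinfo).
has_neon() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # -w matches "neon" as a whole word, so other feature names are not confused with it.
    grep '^Features' "$cpuinfo" 2>/dev/null | grep -qw 'neon'
}

if has_neon; then
    echo "NEON reported by the CPU"
else
    echo "NEON not reported (or not an ARM Linux system)"
fi
```

If the second message appears on a Raspberry Pi 2 or 3, the kernel or OS image is probably the thing to look at before spending hours on the build.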

  1. Getting required libraries and stuff:
    • I won't go into detail here; just install the usual dependencies required for building openCV 3.1.0
  2. Getting our core packages:
    • openCV 3.1.0:
      wget -O opencv.tar.gz
    • openCV 3.1.0 Extras:
      wget -O opencv_contrib.tar.gz
    • libjpeg-turbo 1.5.0:
      wget -O libjpeg-turbo.tar.gz
  3. Decompress everything:
    • tar xvf opencv.tar.gz
      tar xvf opencv_contrib.tar.gz
      tar xvf libjpeg-turbo.tar.gz
  4. Compiling all packages:
    • libjpeg-turbo 1.5.0
      cd libjpeg-turbo-1.5.0/
      mkdir build
      autoreconf -fiv
      cd build
      export CFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard -fPIC -O3"
      export CXXFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard -fPIC -O3"
      sh <path_to_the_source_code>/configure
      The key is the CFLAGS/CXXFLAGS settings, which tune the generated binaries for our CPU (Cortex-A7) and make them use the NEON hardware FPU.
      Moreover, libjpeg-turbo itself contains NEON SIMD code, which speeds up JPEG encoding and decoding.
      Now, compile libjpeg-turbo with:
      make -j4
      "-j4" runs the build with 4 parallel jobs. If you hit errors (for example, gcc or one of its child processes being killed due to insufficient memory), reduce the number or omit the flag.
      Make sure you also run "make clean" if something went wrong and you wish to recompile.
      After the build is done, install libjpeg-turbo with:
      sudo make install
      The resulting binaries will reside in /opt/libjpeg-turbo/
    • openCV 3.1.0 with extras
      I have modified the default CMake scripts to change the default NEON C/CXX flags:
      cd opencv-3.1.0/cmake/
      nano OpenCVCompilerOptions.cmake
      Go to lines 27 and 28 and add:
      -mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard
      Also, go to line 150, under the section of "if (ENABLE_NEON)", modify the parameter of "add_extra_compiler_option" from
      Now, save the file (Ctrl+O), and:
      cd ../
      mkdir build
      cd build
      export CFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard" # Note: no -fPIC or -O3 here
      export CXXFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard" # Note: no -fPIC or -O3 here
      cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=/usr/local -DINSTALL_C_EXAMPLES=OFF -DINSTALL_PYTHON_EXAMPLES=OFF -DOPENCV_EXTRA_MODULES_PATH=<path_to_opencv_contrib-3.1.0>/modules -DBUILD_EXAMPLES=ON -DWITH_FFMPEG=OFF -DWITH_V4L=OFF -DWITH_LIBV4L=OFF -DENABLE_NEON=ON -DEXTRA_C_FLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard" -DEXTRA_CXX_FLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard" -DWITH_JPEG=ON -DBUILD_JPEG=OFF -DJPEG_INCLUDE_DIR=/opt/libjpeg-turbo/include/ -DJPEG_LIBRARY=/opt/libjpeg-turbo/lib32/libjpeg.a ..
      make -j2 # You can try -j4, but my gcc crashed with -j4; test it on your own setup
      I am building openCV for C++ only (no Python wrappers), with no FFMPEG or V4L included in the build. You may adjust the build flags according to your needs.
      After 3 to 4 hours the build should be done, and you can install openCV with:
      sudo make install
      sudo ldconfig
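
The advice in step 4 about lowering the "-j" value when gcc gets killed can be turned into a small helper. This is only a rough sketch: pick_jobs is a hypothetical function, and the figure of about 400 MB of RAM per compiler job is an assumption for illustration, not a measured value, so tune it for your own board:

```shell
#!/bin/sh
# Hypothetical helper: choose a "make -j" value limited by both
# core count and available memory.
# $1 = CPU cores, $2 = available memory in MB,
# $3 = assumed memory per job in MB (default 400, an illustrative guess).
pick_jobs() {
    cores="$1"
    mem_mb="$2"
    per_job_mb="${3:-400}"
    by_mem=$(( mem_mb / per_job_mb ))
    # Always allow at least one job, even on very low memory.
    if [ "$by_mem" -lt 1 ]; then by_mem=1; fi
    # Use whichever limit is smaller.
    if [ "$by_mem" -lt "$cores" ]; then echo "$by_mem"; else echo "$cores"; fi
}

cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
# MemAvailable is reported in kB; fall back to 512 MB if it cannot be read.
mem_mb=$(awk '/^MemAvailable:/ {print int($2/1024)}' /proc/meminfo 2>/dev/null)
jobs=$(pick_jobs "$cores" "${mem_mb:-512}")
echo "Suggested command: make -j$jobs"
```

On a board with 4 cores and roughly 900 MB available, this would suggest "make -j2", which matches what worked for the openCV build above.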

And that's it! You have successfully built and installed openCV 3.1.0, optimized for the Raspberry Pi with NEON FPU, together with libjpeg-turbo 1.5.0 with NEON SIMD instructions and NEON FPU support!
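
To double-check that the NEON and libjpeg-turbo options really made it into the configuration, one option is to read them back from the CMakeCache.txt that cmake writes into the build directory. The sketch below assumes it is run from opencv-3.1.0/build (or given the cache path as its first argument); cache_value is a helper name invented for this example:

```shell
#!/bin/sh
# Hypothetical helper: print the value of one entry from a CMakeCache.txt file.
# $1 = variable name, $2 = path to CMakeCache.txt
cache_value() {
    # Cache lines look like NAME:TYPE=value; strip the "NAME:TYPE=" prefix.
    sed -n "s/^$1:[A-Z]*=//p" "$2" 2>/dev/null
}

cache="${1:-CMakeCache.txt}"
for var in ENABLE_NEON WITH_JPEG BUILD_JPEG JPEG_LIBRARY; do
    echo "$var = $(cache_value "$var" "$cache")"
done
```

With the cmake command from this post, ENABLE_NEON should come back as ON, BUILD_JPEG as OFF, and JPEG_LIBRARY should point into /opt/libjpeg-turbo/. Empty values suggest the script was run outside the build directory.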

1 comment:

  1. amazing tutorial , could you replicate this in opencv 4.0.0? i have no idea how to achieve the same since they changed cmake files and i dont understand them very well