Sunday, October 23, 2016

"Failed to set PTK to the driver"

I want to net-boot my machine over WiFi (Intel AC 7260, using the iwlwifi driver), and I have created a custom initramfs to do it. However, I had trouble getting it to connect to an encrypted network (WPA2 with AES-CCMP) at the early stage of boot (the initramfs stage).

I found that wpa_supplicant tries to authenticate with my AP. However, at the association stage, it fails with:
Failed to set PTK to the driver

After some research, I found that I had missed some critical kernel crypto modules, and I believe this is the reason the connection failed. Here is the list of modules:
ccm
ctr
lib80211 (I am not sure about this module)

Adding those modules to my /etc/initramfs-tools/modules (as I am creating a custom initramfs) solved my problem.
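For reference, the lines to append to /etc/initramfs-tools/modules are simply the module names, one per line (lib80211 included just in case, since I was not sure about it):

```
# kernel crypto modules wpa_supplicant needs for WPA2 (AES-CCMP) in the initramfs
ccm
ctr
lib80211
```

Remember to rebuild the initramfs afterwards (update-initramfs -u on Debian/Ubuntu).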

Hope this helps if you encounter a similar issue.

Monday, October 10, 2016

Compiling DIR-818LW (RTL8881AB) stock firmware

After getting all the required packages and following the readme file of the provided GPL source files, I still hit some compilation errors. That's predictable, as vendors really don't give a f*** about open source.

However, with some bug fixing, I got the firmware image to compile:
  1. Remember to use bash as the main sh (unlink /bin/sh && ln -sf /bin/bash /bin/sh)
  2. Set up the env vars, and run "make" two times (see the readme)
  3. Modify the main Makefile: find "read answer;" and replace the NEXT line with:
    if [ $answer == "yes" ]; then \
  4. Modify templates/aries/progs/Makefile, comment out line 54
  5. Modify progs.gpl/busybox-1.14.1/Makefile: comment out line 833 (the touch command) and line 500 (as there is no silentoldconfig target in the makefile)
  6. Copy progs.template/busybox-1.14.1.config to progs.gpl/busybox-1.14.1 and rename it to .config, then run the following make command in progs.gpl/busybox-1.14.1:
    make oldconfig
  7. Apply the following patch on kernel/kernel/timeconst.pl:
    diff --git a/kernel/timeconst.pl b/kernel/timeconst.pl
    index eb51d76..0461239 100644
    --- a/kernel/timeconst.pl
    +++ b/kernel/timeconst.pl
    @@ -365,14 +365,14 @@ if ($hz eq '--can') {
      print "\n);\n";
     } else {
      $hz += 0;   # Force to number
      if ($hz < 1) {
       die "Usage: $0 HZ\n";
      }
     
      @val = @{$canned_values{$hz}};
    - if (!defined(@val)) {
    + if (!@val) {
       @val = compute_values($hz);
      }
      output($hz, @val);
     }
     exit 0;
    
  8. Run the make command a third time (see the readme)

After 2 to 3 hours (it varies), the firmware image should be available in images/.

To burn the firmware into the router:

  1. Connect the computer to the LAN port of the router
  2. Set computer's IP to static, with the following configuration:
    • IP: 192.168.0.2
    • Sub-Netmask: 255.255.255.0
    • GW: 192.168.0.1
  3. Connect the router via serial, and turn it on
  4. Hold ESC to enter Realtek bootloader
  5. Enable the stop flag by issuing (IMPORTANT, or else the firmware will not burn):
    sig f 1
  6. Enable autoburn (the bootloader will burn your FW when it receives it from TFTP):
    autoburn 1
  7. Reboot the router; this time it should not boot into the system, as the stop flag is enabled
  8. Issue:
    ipconfig
  9. Make sure "Target Address" is "192.168.0.1"; this should match your GW config above
  10. Start a TFTP client and upload the firmware (.bin file). I use this on Windows and it works well.
  11. After the process is complete, it will reboot into the bootloader; disable the stop flag with:
    sig f 0
  12. Reboot the router, and you will be using your new firmware.


Friday, September 23, 2016

DIR-818LW Hardware

After cracking open the casing by unscrewing the 3 screws at the bottom (under the rubber anti-skid pads) and 2 screws at the top (under the adhesive plastic ring), and removing the antennas by cutting through the hot glue, the whole PCB comes out.

There is only one spot with an unpopulated header (JP1), so I soldered a header onto it:

There are two pull-up resistors on the unknown lines, measuring 4.7K and 9.9K (probably 10K). My first thought was that they are for I2C, as 4.7K is always a good value for I2C pull-ups. But why the hell would they put I2C on a header along with the power line? Then again, IF those were UART, why the pull-up resistors?

To really find out what those lines are, I hooked them up to my logic analyzer:

I started the capture, turned the router on, and got this:
At this point, I am pretty sure it's UART, as CH1 has had no activity since power-on; it's impossible for an I2C bus to have only SDA activity with no SCL activity, or only SCL activity with no SDA activity. So it's definitely CH0 as Tx and CH1 as Rx for the RTL8881AB.

Next, I am going to guess the baud rate. Fortunately, Saleae Logic has a UART analyzer:

 
Now, I just have to go through the common bit rates (hoping they use a standard one...).

Finally, at 38400bps...: PROFIT!
I got some readable text, so I am sure I have the right bit rate!
CLK_SEL=0x00000004,DIV=0x00000000
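Instead of brute-forcing the list of common rates, you can also estimate the rate from the capture itself: measure the width of the shortest pulse (one bit time) and take its reciprocal. A quick sketch of the arithmetic (the 26 µs pulse width is a made-up example, not taken from my capture):

```python
# Estimate a UART baud rate from the shortest pulse width seen in a capture,
# then snap it to the nearest standard rate.
shortest_pulse_us = 26.0  # example measurement: one bit time, in microseconds
estimated = 1e6 / shortest_pulse_us  # ~38461 baud

standard_rates = [9600, 19200, 38400, 57600, 115200]
best = min(standard_rates, key=lambda r: abs(r - estimated))
print(best)  # 38400
```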

Now, I can explore the stock firmware of the router :D

Last but not least, let's go through some of the hardware on the router:
The PCB is a 4-layer construction, with the top and bottom layers filled with ground plane.

Front:


Back:


Power section (left-hand side of the front):

5GHz RF section:

Note that the marking of the RF Frontend IC is:

SKY11
85703
526YV
The last line should be the date code, and "SKY" on the first line should stand for Skyworks Solutions, Inc.

And that's it! I am gonna start playing with the UART!

Thursday, September 22, 2016

Hacking DIR-818LW (with RTL8881AB) with openwrt

[Click to see all posts about my openwrt hacking on DIR-818LW]

Today, I got myself a D-Link DIR-818LW router (it looks like a rubbish bin), with hardware revision B1. According to the internal photos found on the FCC site and information found on wikidev, it uses an RTL8881AB as its SoC (with a Lexra RLX5281 CPU, which is also found here), which includes 802.11ac. Another chip onboard, the RTL8192ER, provides 802.11bgn 2T2R (there are two antennas for 11n and one antenna for ac), and an RTL8367RB provides the Gigabit switch.

I was hoping to install openwrt on this router, as the original firmware does not provide wifi bridge/repeater functionality. A bit of Google searching found this, which has an openwrt image for RTL8881AB and RTL8367RB.

Also, there is source code for the D-Link stock firmware too! (HUGE thanks to the GPL!)
Original Firmware (v2.05b1):
ftp://ftp.dlink.eu/Products/dir/dir-818lw/driver_software/DIR-818LW_REVB_FIRMWARE_PATCH_2.05.B01.ZIP

GPL source code (v2.03b1, a bit older than the released firmware):
https://dlink-gpl.s3.amazonaws.com/GPL1400446/DIR818LW_GPL203_Readme.txt (md5: 9535d6e47e9c955f97a8cc967d41f30f)
https://dlink-gpl.s3.amazonaws.com/GPL1400446/DIR818LW_GPL203b01.tar.gz (md5: 0c9714c9da99c9c535274423de8de678)

Note that the Lexra toolchains are included in the GPL source code package!

Now, I am gonna explore the hardware and I will update more :)



Wednesday, September 7, 2016

Weird "AT%IPSYS?" and "AT+CGMR" when playing with ESP8266

For the last few days I have been playing with my ESP8266 on Ubuntu, using a PIC32 MIPS configured as a USB-UART pass-through (USB UART Tx => ESP8266 Rx, and vice versa).

The problem appeared when I reset my development board: some weird AT commands were sent out to the ESP8266 that were never programmed into my PIC32. I had also not typed any AT commands in the minicom (or screen) terminal; these AT commands just came out of nowhere.

Here are some of the weird AT commands that were sent out to my serial port (and hence to the ESP8266):

AT%IPSYS?
AT+CGMR
AT

After some searching on the internet, I found a bug report (the only related search result on Google) regarding "AT%IPSYS?" in "ModemManager". At this point, I knew those weird commands were coming from my Ubuntu machine!

I think the reason for the problem is that "ModemManager" thought my serial port, which is literally connected directly to the ESP8266, was an AT modem. The ESP8266 even responds to the "AT" command with "OK\r\n", which reinforces that impression. After removing "ModemManager" from my Ubuntu machine by issuing:
sudo apt-get purge modemmanager
the issue was gone!
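If you want to keep ModemManager around for a real modem, an alternative (untested by me) is to tell it to ignore just this serial adapter with a udev rule that sets the ID_MM_DEVICE_IGNORE property. The rule file name and the USB vendor/product IDs below are placeholders; substitute your adapter's IDs from lsusb:

```
# /etc/udev/rules.d/99-ignore-my-uart.rules  (file name is arbitrary)
# Replace 1234/abcd with your USB-UART adapter's IDs from `lsusb`
ATTRS{idVendor}=="1234", ATTRS{idProduct}=="abcd", ENV{ID_MM_DEVICE_IGNORE}="1"
```

Then reload the rules with "sudo udevadm control --reload-rules" and re-plug the adapter.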

Sunday, September 4, 2016

TL431 Constant Current LED Driver with control

Just want to share a constant-current LED driver using the TL431. It uses a 12V power supply, and you can assert LOW to turn the driver on, HIGH to turn it off.
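The sizing math for this kind of driver is simple, assuming the usual TL431 constant-current topology where the TL431 regulates the voltage across a sense resistor in the LED path to its ~2.495 V internal reference (the resistor value below is an example, not taken from my schematic):

```python
# TL431 constant-current sink: I_led = Vref / R_sense, because the TL431
# servos its REF pin (the drop across R_sense) to the internal reference.
VREF = 2.495      # volts, TL431 internal reference
r_sense = 124.0   # ohms, example value
i_led_ma = VREF / r_sense * 1000  # LED current in mA
print(round(i_led_ma, 1))  # 20.1
```

So picking R_sense directly picks the LED current, independent of the 12V supply.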


Sunday, July 3, 2016

Raspberry Pi OpenMAX JPEG Encoder

I have utilized OpenMAX IL on the Raspberry Pi to encode BGR888 (24-bit) RAW images to JPEG images.
Compared to libjpeg-turbo (which I use via the openCV imencode method), the speed-up is HUGE!

I used openCV imdecode to decode the JPEG image to BGR888 raw data, then encoded it back to JPEG with the OMX hardware encoder.
The program is on my GitHub now in the form of benchmark.
https://github.com/hopkinskong/rpi-omx-jpeg-encode

Wednesday, June 29, 2016

Raspberry Pi Camera openCV rendering with low latency streaming with gstreamer via RTP

This is similar to streaming the video from the camera to the client using gstreamer, except that I have added openCV in the middle for image (video) processing.

You may read this post first for streaming the video to the client with gstreamer (no openCV).
I am using openCV 3.1.0, with RPi-specific and libjpeg-turbo optimizations.
You can read here to see how to build it for yourself.

Although you may use "apt-get" to get an unoptimized openCV 2.4.9 if you don't want to spend the time building it, I highly recommend building it yourself, as it gives a considerable performance improvement.

I use a command similar to the "no openCV" version:
raspivid -t 0 -cd MJPEG -w 1280 -h 720 -fps 40 -b 8000000 -o - | gst-launch-1.0 fdsrc ! "image/jpeg,framerate=40/1" ! jpegparse ! rtpjpegpay ! udpsink host=<client_ip> port=<client_port>

However, I have added one more custom program in the middle of it (plus one more gst-launch and a few other gstreamer plugins):
raspivid -t 0 -cd MJPEG -w 1280 -h 720 -fps 40 -b 8000000 -o - | gst-launch-1.0 fdsrc ! "image/jpeg,framerate=40/1" ! jpegparse ! multipartmux boundary="FRAME_START" ! fdsink | ./opencv_worker | gst-launch-1.0 fdsrc ! "image/jpeg,framerate=40/1" ! jpegparse ! rtpjpegpay ! udpsink host=<client_ip> port=<client_port>
I know it's long, but it's actually quite simple. Here is how it goes:
  1. raspivid outputs JPEG data to the FIRST gst-launch
  2. I use "multipartmux", which adds a "Content-Length" header to the output data, making it easy to extract JPEG frames with my custom program ("opencv_worker")
  3. The boundary is "FRAME_START"; you can modify it or just leave it blank (it is quite useless, actually)
  4. The multipart JPEG data is fed into opencv_worker
  5. opencv_worker extracts the raw JPEG frames
  6. opencv_worker does the openCV work
  7. opencv_worker outputs the processed frame to stdout
  8. The SECOND gst-launch gets data from stdin (stdout of opencv_worker)
  9. The SECOND gst-launch streams the JPEG video to the client, the same as we do without openCV
  10. Profit!
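To make the framing concrete, here is a minimal Python sketch of how a reader can pull one JPEG frame out of such a multipart stream (the exact headers multipartmux emits may differ slightly; only the Content-Length header and the blank line that ends the headers matter here, and the sample data is illustrative, not a real capture):

```python
# Pull one frame out of a multipart stream: scan the part headers for
# Content-Length, skip the blank line ending the headers, then read
# exactly that many payload bytes.
import io

def read_frame(stream):
    length = None
    for raw in iter(stream.readline, b""):
        line = raw.strip()
        if line.startswith(b"Content-Length:"):
            length = int(line.split(b":")[1])
        elif line == b"" and length is not None:
            return stream.read(length)  # the JPEG payload itself
    return None  # EOF before a complete part

sample = (b"--FRAME_START\r\n"
          b"Content-Type: image/jpeg\r\n"
          b"Content-Length: 4\r\n"
          b"\r\n"
          b"\xff\xd8\xff\xd9")  # a fake 4-byte "JPEG"
print(read_frame(io.BytesIO(sample)))  # b'\xff\xd8\xff\xd9'
```

The C++ worker below does the same thing byte-by-byte with a sliding window over stdin.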
It is notable that we can even stream two video feeds at the same time, using tee; this makes the command much longer, but gives you more flexibility:
raspivid -t 0 -cd MJPEG -w 1280 -h 720 -fps 40 -b 8000000 -o - | tee >(gst-launch-1.0 fdsrc ! "image/jpeg,framerate=40/1" ! jpegparse ! rtpjpegpay ! udpsink host=<client_ip> port=<client_first_port>) | gst-launch-1.0 fdsrc ! "image/jpeg,framerate=40/1" ! jpegparse ! multipartmux boundary="FRAME_START" ! fdsink | ./opencv_worker | gst-launch-1.0 fdsrc ! "image/jpeg,framerate=40/1" ! jpegparse ! rtpjpegpay ! udpsink host=<client_ip> port=<client_second_port>
We inserted a "tee" command right after "raspivid"; the part "tee" is responsible for streams the RAW ("no openCV") video feed to the client, while the non-"tee" part does the openCV processing and streams the openCV video feed to the client.

Now, here is the magic of the openCV worker:
// Required defines
#define MULTIPART_STREAM_BOUNDARY "--FRAME_START" // Custom boundary defined in the multipart stream from stdin
#define MULTIPART_STREAM_BOUNDARY_SIZE 13

// Includes
#include <iostream>
#include <string>
#include <stdio.h>
#include <stdlib.h> // malloc/free/atoi/exit
#include <string.h> // memcpy/strcmp
#include <unistd.h>
#include <pthread.h>
#include <time.h>
#include <opencv2/opencv.hpp> // openCV C++ header

using namespace std;
using namespace cv;

// Global variables
// Input JPEG frame
char* inputJPEGFrame=NULL;
size_t inputJPEGFrameSize=0;
pthread_mutex_t inputJPEGFrameMutex=PTHREAD_MUTEX_INITIALIZER;
// Output JPEG frame
char* outputJPEGFrame=NULL;
size_t outputJPEGFrameSize=0;
bool outputJPEGFrameUpdated=false;
pthread_mutex_t outputJPEGFrameMutex=PTHREAD_MUTEX_INITIALIZER;

// Fill zero to array
void arrayFillZeros(char* array, size_t arraySize) {
 for(size_t i=0; i<arraySize; i++) { // i<arraySize avoids size_t underflow when arraySize==0
  array[i]='\0';
 }
}

// Push back character into array
void arrayPushCharAtBack(char array[], size_t arraySize, char c) {
 size_t i;
 for(i=0; i<=arraySize-3; i++) { // -3: 1 for "<=", 1 for c, 1 for \0
  array[i]=array[i+1];
 }
 array[i]=c;
 array[++i]='\0';
}

// Frame processing thread
void *frameProcessingThread(void* arg) {
 // Solve the issue when inputJPEGFrameSize=0, openCV fails
 while(true) {
  pthread_mutex_lock(&inputJPEGFrameMutex);
  if(inputJPEGFrameSize > 0) {
   pthread_mutex_unlock(&inputJPEGFrameMutex);
   break;
  }
  pthread_mutex_unlock(&inputJPEGFrameMutex);
 }
 
 // Main processing loop
 while(true) {
  // Obtain a local copy of input frame first
  pthread_mutex_lock(&inputJPEGFrameMutex);
  unsigned char* processingFrame=(unsigned char*)malloc(sizeof(unsigned char)*inputJPEGFrameSize);
  memcpy(processingFrame, inputJPEGFrame, inputJPEGFrameSize);
  size_t processingFrameSize=inputJPEGFrameSize;
  pthread_mutex_unlock(&inputJPEGFrameMutex); // Release to main thread while we process this frame here
  
  // Do our processing to processingFrame here, remember to update processingFrameSize
  
  // JPEG to Mat
  Mat imgBuf = Mat(1, processingFrameSize, CV_8UC1, processingFrame); // CV_8UC1: treat the buffer as raw bytes for imdecode
  Mat imgMat = imdecode(imgBuf, IMREAD_COLOR);
  free(processingFrame);
  if(imgMat.data == NULL) {
   cout << "Error when decoding JPEG frame for openCV." << endl;
   exit(-1);
  }
    
  // Process imgMat here
  
  // Mat to JPEG
  vector<uchar> buf;
  imencode(".jpg", imgMat, buf, std::vector<int>());
  processingFrame=(unsigned char*)malloc(buf.size());
  memcpy(processingFrame, &buf[0], buf.size());
  processingFrameSize=buf.size();
  
  // Output the processed frame for output
  pthread_mutex_lock(&outputJPEGFrameMutex);
  free(outputJPEGFrame);
  outputJPEGFrame=(char*)malloc(sizeof(char)*processingFrameSize);
  memcpy(outputJPEGFrame, processingFrame, processingFrameSize);
  outputJPEGFrameSize=processingFrameSize;
  outputJPEGFrameUpdated=true;
  pthread_mutex_unlock(&outputJPEGFrameMutex);
  
  // Clean up, avoid evil memory leaks plz
  free(processingFrame);
 }
 return NULL;
}

void *frameOutputThread(void* arg) {
 while(true) {
  pthread_mutex_lock(&outputJPEGFrameMutex);
  if(outputJPEGFrameUpdated) {
   write(STDOUT_FILENO, outputJPEGFrame, outputJPEGFrameSize);
   outputJPEGFrameUpdated=false;
  }
  pthread_mutex_unlock(&outputJPEGFrameMutex);
  usleep(2000);
 }
 return NULL;
}

int main(int argc, char** argv) {
 if(argc == 1) {
  // Thread creation
  pthread_t frame_processing_thread, frame_output_thread;
  pthread_create(&frame_processing_thread, NULL, frameProcessingThread, NULL);
  pthread_create(&frame_output_thread, NULL, frameOutputThread, NULL);
  usleep(3000); // Dumb way to wait for the threads to come up; just hope it doesn't add much delay to the stream and that the threads have time to finish initialization
  
  // Read stdin
  size_t bytesRead=0;
  char byteBuffer[1]={0x00};
  char boundaryKeywordWindow[MULTIPART_STREAM_BOUNDARY_SIZE+1]; // +1 for \0
  char contentLengthKeywordWindow[16+1]; // "Content-Length: " is 16 in length, +1 for \0
  char contentLength[8]; // 8 bytes (7 bytes long) should be enough for content length, 1280*720*3=2764800(Not JPEG compressed), only 7 bytes long
  arrayFillZeros(boundaryKeywordWindow, MULTIPART_STREAM_BOUNDARY_SIZE+1);
  arrayFillZeros(contentLengthKeywordWindow, 16+1);
  while(true) { // Main while loop
   // 1: Locate boundary keyword [This one could be removed]
   /*while(true) {
    bytesRead=read(STDIN_FILENO, byteBuffer, 1);
    if(bytesRead == 1) {
     arrayPushCharAtBack(boundaryKeywordWindow, MULTIPART_STREAM_BOUNDARY_SIZE+1, byteBuffer[0]);
     //arrayDump(boundaryKeywordWindow, MULTIPART_STREAM_BOUNDARY_SIZE+1);
    }
    if(bytesRead < 0) { // error
     cout << "Error when reading from stdin." << endl;
     exit(-1);
    }
    if(strcmp(boundaryKeywordWindow, MULTIPART_STREAM_BOUNDARY) == 0) {
     break;
    }
   }*/ // Removed to reduce delay
   // 2: Locate "Content-Length: "
   while(true) {
    bytesRead=read(STDIN_FILENO, byteBuffer, 1);
    if(bytesRead == 1) {
     arrayPushCharAtBack(contentLengthKeywordWindow, 16+1, byteBuffer[0]);
    }
    if(bytesRead < 0) { // error
     cout << "Error when reading from stdin." << endl;
     exit(-1);
    }
    if(strcmp(contentLengthKeywordWindow, "Content-Length: ") == 0) {
     break;
    }
   }
   // 3: Extract content length of the current frame
   size_t i=0;
   while(true) {
    bytesRead=read(STDIN_FILENO, byteBuffer, 1);
    if(bytesRead == 1) {
     if(byteBuffer[0] != 0x0D) {
      contentLength[i]=byteBuffer[0];
      i++;
     }else{
      contentLength[i]=0x00; // \0
      break;
     }
    }
    if(bytesRead < 0) { // error
     cout << "Error when reading from stdin." << endl;
     exit(-1);
    }
   }
   // 4: Skip the following 3 bytes (0x0A, 0x0D, 0x0A)
   for(i=0; i<=2; i++) {
    bytesRead=read(STDIN_FILENO, byteBuffer, 1);
    if(bytesRead < 0) { // error
     cout << "Error when reading from stdin." << endl;
     exit(-1);
    }
   }
   // 5: Extract JPEG frame
   ssize_t jpegBytes=atoi(contentLength);
   pthread_mutex_lock(&inputJPEGFrameMutex);
   free(inputJPEGFrame);
   inputJPEGFrame=(char*)malloc(sizeof(char)*jpegBytes);
   inputJPEGFrameSize=jpegBytes;
   size_t got=0;
   while(got < (size_t)jpegBytes) { // read() on a pipe may return fewer bytes than requested
    ssize_t n=read(STDIN_FILENO, inputJPEGFrame+got, jpegBytes-got);
    if(n <= 0) { // EOF or read error: frame incomplete
     cout << "Error, jpeg frame incomplete" << endl;
     free(inputJPEGFrame);
     pthread_mutex_unlock(&inputJPEGFrameMutex);
     exit(-1);
    }
    got+=(size_t)n;
   }
   pthread_mutex_unlock(&inputJPEGFrameMutex);
  }
 }else{
  // argc != 1
  // Error handling?
 }
}


I have used POSIX threads here for maximum performance.
The main while loop reads data from stdin into one of the global variables.
The processing thread processes the frame with openCV, then the output thread writes the JPEG frame to stdout.
I think the comments in the code are quite self-explanatory.

That's it, you can now process your RPi camera video with openCV and stream it to the client.

Raspberry Pi Camera low latency streaming with gstreamer via RTP

I found a way to stream video from the Raspberry Pi camera to a client with gstreamer with low latency (<300 ms).

I am using MJPEG here; you may use H.264, but MJPEG will be easier for me to interface with openCV later (see this post).

Update the firmware first:
sudo rpi-update
This will get the latest RPi firmware, with the latest raspivid binary for streaming.

Then, we will install gstreamer:
sudo apt-get install gstreamer1.0 gstreamer1.0-plugins-bad
The "gstreamer1.0-plugins-bad" package provides the "jpegparse" plugin needed for streaming MJPEG to the network.

After everything is set, you can start the streaming by executing:
raspivid -t 0 -cd MJPEG -w 1280 -h 720 -fps 40 -b 8000000 -o - | gst-launch-1.0 fdsrc ! "image/jpeg,framerate=40/1" ! jpegparse ! rtpjpegpay ! udpsink host=<client_ip> port=<client_port>

Here is an explanation of the supplied flags/plugins:
  • raspivid
    • -t 0: Run raspivid forever; the program will not stop after a certain time
    • -cd MJPEG: The default output is H.264; we specify this flag to force MJPEG output
    • -w 1280: Set the output video width to 1280(px)
    • -h 720: Set the output video height to 720(px)
    • -fps 40: Set the frame rate to 40
    • -b 8000000: Set the target bit rate to 8000000bps (8Mbps)
    • -o - : Pipe the data to stdout
  • gst-launch-1.0
    • fdsrc: Get data from stdin (stdout of raspivid)
    • "image/jpeg,framerate=40/1": Caps for jpegparse; we tell jpegparse the frame data type is JPEG and the frame rate is 40fps (matching the one specified in the raspivid -fps flag)
    • jpegparse: Parse JPEG frames. As we are not sure raspivid emits exactly one frame at a time, we need jpegparse to combine incoming data fragments into complete frames
    • rtpjpegpay: Wrap the JPEG frames into RTP payloads
    • udpsink: The RTP payload is transmitted to the specified host and port via UDP
Now on the client (I am using Windows), you can launch gstreamer with the following command to view the video:
cd <gstreamer_binaries_directory>
gst-launch-1.0.exe udpsrc port=<client_port> ! "application/x-rtp,media=(string)video,clock-rate=(int)90000,encoding-name=(string)JPEG,a-framerate=(string)40.000000,a-framesize=(string)1280-720,payload=(int)26" ! rtpjpegdepay ! decodebin ! autovideosink
Note that you may need to modify "clock-rate", "a-framerate", "a-framesize" and "payload" according to the server (RPi). You can find these parameters by running gst-launch-1.0 in verbose mode on the Raspberry Pi:
raspivid -t 0 -cd MJPEG -w 1280 -h 720 -fps 40 -b 8000000 -o - | gst-launch-1.0 -v fdsrc ! "image/jpeg,framerate=40/1" ! jpegparse ! rtpjpegpay ! udpsink host=<client_ip> port=<client_port>
Note that the supplied "-v" flag turns on verbose mode. Then, you will see something similar to this:

Make sure the client runs with the same caps ("clock-rate", "a-framerate", "a-framesize" and "payload") as the server, or else you may not see the video properly.

When everything is done, a window will pop up, showing the video from the camera.

Monday, June 27, 2016

openCV 3.1.0 optimized for Raspberry Pi, with libjpeg-turbo 1.5.0 and NEON SIMD support

This is a small log for myself on building openCV 3.1.0 on a Raspberry Pi 2. This should work on a Raspberry Pi 3 too (but not on an RPi 1, as it does not support NEON).


  1. Getting required libraries and stuff:
    • I am not mentioning much here, just get the normal required dependencies for building openCV 3.1.0
  2. Getting our core packages:
    • openCV 3.1.0:
      wget https://github.com/Itseez/opencv/archive/3.1.0.tar.gz -O opencv.tar.gz
    • openCV 3.1.0 Extras:
      wget https://github.com/Itseez/opencv_contrib/archive/3.1.0.tar.gz -O opencv_contrib.tar.gz
    • libjpeg-turbo 1.5.0:
      wget https://github.com/libjpeg-turbo/libjpeg-turbo/archive/1.5.0.tar.gz -O libjpeg-turbo.tar.gz
  3. Decompress everything:
    • tar xvf opencv.tar.gz
      tar xvf opencv_contrib.tar.gz
      tar xvf libjpeg-turbo.tar.gz
  4. Compiling all packages:
    • libjpeg-turbo 1.5.0
      cd libjpeg-turbo-1.5.0/
      mkdir build
      autoreconf -fiv
      cd build
      export CFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard -fPIC -O3"
      export CXXFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard -fPIC -O3"
      sh <path_to_the_source_code>/configure
      The magic is in the C/CXX FLAGS, which tune the generated binaries for our CPU and use the NEON hardware FPU.
      Moreover, libjpeg-turbo itself also contains NEON SIMD instructions, which speed up the JPEG en/decode process.
      Now, compile libjpeg-turbo with:
      make -j4
      "-j4" means running the build with 4 parallel jobs. In case of any errors (such as gcc or its child processes getting killed due to insufficient memory), reduce the number, or omit this flag.
      Also make sure to do a "make clean" after something went wrong, before you recompile.
      After the process is done, install libjpeg-turbo with:
      sudo make install
      The resulting binaries will reside in /opt/libjpeg-turbo/
    • openCV 3.1.0 with extras
      I have modified the default CMake scripts to change the default NEON C/CXX FLAGS:
      cd opencv-3.1.0/cmake/
      nano OpenCVCompilerOptions.cmake
      Go to line 27 and 28, add:
      -mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard
      to "OPENCV_EXTRA_C_FLAGS" and "OPENCV_EXTRA_CXX_FLAGS"
      Also, go to line 150, under the section of "if (ENABLE_NEON)", modify the parameter of "add_extra_compiler_option" from
      -mfpu=neon
      To:
      -mfpu=neon-vfpv4
      Now, save the file (Ctrl+O), and:
      cd ../
      mkdir build
      cd build
      export CFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard" # Notice here does not have -fPIC and -O3
      export CXXFLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard" # Notice here does not have -fPIC and -O3
      cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=/usr/local -DINSTALL_C_EXAMPLES=OFF -DINSTALL_PYTHON_EXAMPLES=OFF -DOPENCV_EXTRA_MODULES_PATH=<path_to_opencv_contrib-3.1.0>/modules -DBUILD_EXAMPLES=ON -DWITH_FFMPEG=OFF -DWITH_V4L=OFF -DWITH_LIBV4L=OFF -DENABLE_NEON=ON -DEXTRA_C_FLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard" -DEXTRA_CXX_FLAGS="-mcpu=cortex-a7 -mfpu=neon-vfpv4 -ftree-vectorize -mfloat-abi=hard" -DWITH_JPEG=ON -DBUILD_JPEG=OFF -DJPEG_INCLUDE_DIR=/opt/libjpeg-turbo/include/ -DJPEG_LIBRARY=/opt/libjpeg-turbo/lib32/libjpeg.a ..
      make -j2 # You can use -j4, but my gcc crashed with -j4, you can test by yourself
      
      I am building openCV for C++ (no Python wrappers), with no FFMPEG or V4L included in the build. You may adjust the build flags according to your needs.
      After 3 to 4 hours, the build should be done, and you can install openCV with:
      sudo make install
      sudo ldconfig

And that's it! You have successfully built and installed openCV 3.1.0, optimized for the Raspberry Pi with the NEON FPU, along with libjpeg-turbo 1.5.0 with NEON SIMD instructions and NEON FPU support!

Sunday, June 26, 2016

Cross-compiling x86_64 linux code on Raspberry Pi (ARM linux)

I know it's quite useless, but there is a way to cross-compile x86_64 linux code on ARM linux (i.e. a Raspberry Pi); I built one a few months ago with crosstool-ng 1.22.0:

GCC 4.9.3, built with GCC 4.9.2-10 on Raspbian Jessie

I spent 27 hours compiling it on my RPi 2. It may not be that useful, but it is still quite fun to play with.

Saturday, June 25, 2016

ESP8266 Google Form Firmware

Here is an ESP8266 AT firmware which includes an AT command for submitting data to a Google Form:
https://github.com/hopkinskong/esp8266-at-firmware-googleform

There are a few bugs though; hopefully I will have time to fix them in the future.

It fits into a 4M flash (which is used in older versions of ESP8266 modules); all other AT commands are still the same.