I have used OpenMAX IL on the Raspberry Pi to encode raw BGR888 (24-bit) images to JPEG images.
Compared to libjpeg-turbo (which I use through OpenCV's imencode method), the speedup is huge!
I used OpenCV's imdecode to decode a JPEG image to raw BGR888 data, then encoded it back to JPEG with the OMX hardware encoder.
The program is on my GitHub now in the form of a benchmark:
https://github.com/hopkinskong/rpi-omx-jpeg-encode
Hi. First, thanks for posting this code.
I'm only interested in the "Benchmark 4" code, since that's the code that uses the accelerators.
I'm having a few problems with it (please note that I have zero experience with any of the Pi's accelerators)...
1. For some reason the nBufferSize member value (which is recalculated a few times in the code) is incorrect and causes a crash if I use an 800x600 image; other resolutions seem to work.
2. It seems like it only processes the first frame, remaining input data is not processed. This is only noticed if you feed the loop unique buffers instead of the same buffer (as is the case with the example code).
3. Assuming the first two issues are trivial (or perhaps I just made a mistake), how would the resources be properly released and re-initialized if I wanted to run this with different image sizes without restarting the process?
Thanks much for any thoughts you might have.
Regards,
Issue confirmed. It happens because I was too lazy to calculate the slice size and instead forced the encoder to use a specific buffer size (which obviously does not work for 800x600, but works for 1280x720 for some reason). I will try to solve this issue and update the code ASAP.
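A minimal sketch of how the buffer size could be derived instead of hard-coded. It assumes (this is not confirmed anywhere in this thread) that the encoder's input port expects the row width padded to a multiple of 32 pixels and the slice height padded to a multiple of 16 rows, which is a commonly reported requirement for the Pi's OMX components. The names `align_up`, `InputLayout` and `bgr888_layout` are hypothetical helpers, not part of the benchmark.

```cpp
#include <cstddef>
#include <cstdint>

// Round value up to the next multiple of alignment (alignment must be > 0).
inline std::uint32_t align_up(std::uint32_t value, std::uint32_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical description of a padded BGR888 input buffer.
struct InputLayout {
    std::uint32_t stride;        // bytes per padded row
    std::uint32_t slice_height;  // padded number of rows
    std::size_t   buffer_size;   // stride * slice_height, in bytes
};

inline InputLayout bgr888_layout(std::uint32_t width, std::uint32_t height) {
    // ASSUMPTION: 32-pixel width alignment and 16-row height alignment.
    const std::uint32_t stride = align_up(width, 32) * 3;  // 3 bytes per BGR pixel
    const std::uint32_t slice_height = align_up(height, 16);
    return InputLayout{stride, slice_height,
                       static_cast<std::size_t>(stride) * slice_height};
}
```

Under these assumptions, 1280x720 needs no padding because 720 is already a multiple of 16, while 800x600 pads to 608 rows (1,459,200 bytes instead of 1,440,000). That would be consistent with the 800x600 crash, but it has not been verified against the encoder here.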
I am not sure what you mean by "it only processes the first frame". The output is always delayed by one iteration: when you first feed BGR888 data into the encoder, it won't output a JPEG frame right away. You need to wait until the next iteration to get the encoded frame.
i.e.:
Iteration 1; Iteration 2; Iteration 3; Iteration 4; ...
BGR Frame 1; BGR Frame 2; BGR Frame 3; BGR Frame 4; ...
Empty Data.; JPG Frame 1; JPG Frame 2; JPG Frame 3; ...
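The timeline above can be mimicked with a toy model. This is only a sketch of the one-iteration delay, not the real encoder: `DelayedEncoder`, `submit` and `flush` are hypothetical names, and the "JPEG" is just a tagged string.

```cpp
#include <string>

// Toy stand-in for an encoder whose output lags its input by one iteration.
class DelayedEncoder {
public:
    // Queue raw_frame. If a previous frame is pending, write its result
    // to out and return true; otherwise leave out untouched and return false.
    bool submit(const std::string& raw_frame, std::string& out) {
        const bool had_pending = has_pending_;
        if (had_pending) {
            out = pending_;
        }
        pending_ = "JPG(" + raw_frame + ")";
        has_pending_ = true;
        return had_pending;
    }

    // Retrieve the last pending result once there is no more input.
    bool flush(std::string& out) {
        if (!has_pending_) {
            return false;
        }
        out = pending_;
        has_pending_ = false;
        return true;
    }

private:
    std::string pending_;
    bool has_pending_ = false;
};
```

In this model the first `submit` returns nothing, each later `submit` returns the previous frame, and one extra `flush` step is needed to collect the final frame. Whether the real encoder needs such a final step is an assumption.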
For different image sizes, you should re-initialize the whole encoder. You should de-allocate the buffer, change the encoder state to idle, reconfigure the port format (as your frame size changed), and perform the initialization again.
Thanks for responding!
I've learned more since my first post, but first let me explain what I mean...
So, I took your code and adapted it to a small program I wrote to pull pictures off the Pi camera and convert them to JPEG. I basically split your main() into two functions:
encode_init() which is essentially line 385 thru 544 of your jpeg_bench.cpp
and
encode_frame() which is essentially 633 thru 672 of jpeg_bench.cpp.
The only difference is the data is written to a buffer instead of STDOUT (line 660).
I call encode_init() once, then for each incoming RGB buffer from the camera I push it to encode_frame(). What I found was it only encoded the very first frame. All following frames were just not processed. Then after digging a bit, I added
ctx.encoder_ppBuffer_out->nFlags = 0;
to the top of encode_frame() and then it processed each incoming buffer. That's the good news. The bad news is after a few hundred frames the call to OMX_FillThisBuffer() (line 667) returns an error number that says there's a hardware error (when omx_die() is called).
Any thoughts on that?
If you are playing with the camera, why don't you just use raspistill to capture a still image? You can pipe the output of raspistill to your custom program. If you want to do multiple continuous images, you might try to use raspivid with MJPEG output.
You can read about how to use it here:
http://hopkinsdev.blogspot.hk/2016/06/raspberry-pi-camera-low-latency.html
The application I'm writing isn't unique to a Pi, I'm just attempting to pull in this accelerated encoder to speed up things when Pi is the host.
This accelerated encoder is unique to the Pi, so there is no difference between using the OMX approach and the raspivid/raspistill approach.
I was just looking for a Pi-specific alternative to using libjpeg in some established code I have.
Reworking the code base to use a gstreamer pipeline isn't really an option here.
No worries... nothing critical, I just thought this would be a quick-n-easy performance enhancement.
Thanks,
OK, it seems you still want to stick with OMX JPEG encoding. Regarding your issue, what is your configured port format? Are you passing frames with the same width and height? Are all the frames the same? (i.e. maybe the problem is with that few-hundredth frame itself.)
I have previously encoded 1280x720 frames continuously with the same method used in the benchmark with no issue. Have you confirmed that all the encoded frames are correct, decodable JPEG files?
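One cheap way to screen encoded frames, sketched under the assumption that each output buffer should hold exactly one complete JPEG: check that the data starts with the SOI marker (FF D8) and ends with the EOI marker (FF D9). The function name is hypothetical, and this is only a sanity check. It does not prove the image decodes, so a full decode (for example with OpenCV's imdecode) is still the real test.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if data begins with the JPEG SOI marker (FF D8)
// and ends with the JPEG EOI marker (FF D9).
inline bool looks_like_complete_jpeg(const std::vector<std::uint8_t>& data) {
    const std::size_t n = data.size();
    if (n < 4) {
        return false;  // too short to hold both markers
    }
    const bool has_soi = data[0] == 0xFF && data[1] == 0xD8;
    const bool has_eoi = data[n - 2] == 0xFF && data[n - 1] == 0xD9;
    return has_soi && has_eoi;
}
```

A frame that passes the SOI check but fails the EOI check was probably truncated or overwritten partway through.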
Yep, no change to the format...BGR24 640x480.
Any idea why clearing the nFlags field allowed things to work for me (for a while at least)?
Are you feeding *different* frames into the encoder?
Assuming there is a problem (and it's not a mistake on my side), then the jpeg_bench.cpp code as is won't show the problem because it's always the same input frame.
Also, please note, if you're busy, just tell me to go away... I'd like to get this working, but it's not critical.
Regardless, I appreciate your feedback...
Don't worry, if I am too busy, I will just leave you here yelling and no one will know :P
As for why clearing nFlags helps: in the last iteration, the nFlags of the output buffer was set to "OMX_BUFFERFLAG_ENDOFFRAME". If you don't clear it, the next iteration will think it has already reached the end of the frame and bail out of the encoding loop, when in fact the flag is stale state left over from the previous frame.
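The stale-flag effect can be reproduced with a small simulation. This is a sketch, not the benchmark's code: `FakeBufferHeader` and `drain_one_frame` are hypothetical stand-ins, and the loop shape is an assumption based on the description above. The flag value 0x10 mirrors OMX_BUFFERFLAG_ENDOFFRAME from the OpenMAX IL headers.

```cpp
#include <cstdint>

// Mirrors OMX_BUFFERFLAG_ENDOFFRAME (0x00000010 in OMX_Core.h).
const std::uint32_t kEndOfFrame = 0x00000010;

// Hypothetical stand-in for the single OMX_BUFFERHEADERTYPE field used here.
struct FakeBufferHeader {
    std::uint32_t nFlags = 0;
};

// Simulates draining one frame: collect output chunks until the
// end-of-frame flag is seen. Returns the number of chunks collected.
inline int drain_one_frame(FakeBufferHeader& out, int chunks_in_frame,
                           bool clear_flags_first) {
    if (chunks_in_frame <= 0) {
        return 0;
    }
    if (clear_flags_first) {
        out.nFlags = 0;  // forget the previous frame's end-of-frame flag
    }
    int collected = 0;
    while ((out.nFlags & kEndOfFrame) == 0) {
        ++collected;
        if (collected == chunks_in_frame) {
            out.nFlags |= kEndOfFrame;  // the "component" marks the last chunk
        }
    }
    return collected;
}
```

Draining a second frame without clearing the flag collects zero chunks, because the loop exits immediately on the flag left over from the first frame. That matches the "only the first frame is encoded" symptom.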
And yes, I am supplying different frames to the encoder. I once used this to encode raw BGR888 frames from OpenCV to JPEGs, pass them to gstreamer, and stream them to my Android phone over RTP.
Try increasing your GPU memory a bit (run sudo raspi-config and adjust the GPU/memory split) and see if that solves your issue.
BTW, can you post your input port format definition here?
Did anybody get any further with this? I too see a number of frames come out OK, but then it stops working. I'm not getting any errors as such, but the frames returned just seem to be corrupted: they start off with the JFIF SOI marker and then after some time just seem to be any old random values.
I've tried setting the memory split to 128 MB (this seems like loads! I have no GUI running, it's a headless system.)