how to read video file and split it into frames for android - opencv

My goal is as follows: I have to read in a video that is stored on the sd card, process it frame for frame and then store it in a new file on the SD card again,In each image to do image processing.
At first I wanted to use opencv for android but I did not seem to be able to read the video
here.

I am guessing you already know that doing this on a mobile device or any compute limited devices is not ideal, simply because video manipulation is very computer intensive which translates to slow execution and heavy battery usage on many devices. If you do have the option to do the processing on the server side it is definitely worth considering.
Assuming that for your use case you need to do it on the mobile device, then OpenCV on Android will now allow you to read in a video and access each frame - #StephenG mentions this in his answer to the question you refer to above.
In the past, functionality like this did not get ported to the Android OpenCv as the guidance was to use ffmpeg for frame grabbing on Android devices.
According to more recent documentation, however, this should be available for Android now using the VideoCapture class (note I have not used this myself...):
http://docs.opencv.org/java/2.4.11/org/opencv/highgui/VideoCapture.html
It is worth noting that OpenCV Android examples are all currently based around Eclipse and if you want to use Studio, getting things up an running initially can be quite tricky. The following worked for me recently, but as both studio and OpenCV can change over time you may find you have to do some forum hunting if it does not work for you:
https://stackoverflow.com/a/35135495/334402
Taking a different approach, you can use ffmpeg itself, in a wrapper in Android, for tasks like this.
The advantage of the wrapper approach is that you can use all the usual command line syntax and there is a lot of info on the web to help you get the right parameters.
The disadvantage is that ffmpeg was not really designed to be wrapped in this way so you do sometimes see issues. Having said that it is a common approach now and so long as you choose a well used wrapper library you should at least have a good community to discuss any issues you come across with. I have used this approach in a hand crafted way in the past but if I was doing it again I would use one of the popular examples such as:
https://github.com/WritingMinds/ffmpeg-android-java

Related

A-Frame: FOSS Options for widely supported, markerless AR?

A-Frame's immersive-ar functionality will work on some Android devices I've tested with, but I haven't had success with iOS.
It is possible to use an A-Frame scene for markerless AR on iOS using a commercial external library. Example: this demo from Zapworks using their A-Frame SDK. https://zappar-xr.github.io/aframe-example-instant-tracking-3d-model/
The tracking seems to be no where near as good as A-Frame's hit test demo (https://github.com/stspanho/aframe-hit-test), but it does seem to work on virtually any device and browser I've tried, and it is good enough for the intended purpose.
I would be more than happy to fallback to lower quality AR mode in order to have AR at all in devices that don't support immersive-ar in browser. I have not been able to find an A-Frame compatible solution for using only free/open source components for doing this, only commercial products like Zapworks and 8th Wall.
Is there a free / open source plugin for A-Frame that allows a scene to be rendered with markerless AR across a very broad range of devices, similar to Zapworks?
I ended up rolling my own solution which wasn't complete, but good enough for the project. Strictly speaking, there's three problems to overcome with getting a markerless AR experience on mobile without relying on WebXR:
Webcam display
Orientation
Position
Webcam display is fairly trivial to implement in HTML5 without any libraries.
Orientation is already handled nicely by A-FRAME's "magic window" functionality, including on iOS.
Position was tricky and I wasn't able to solve it. I attempted to use the FULLTILT library's accelerometer functions, and even using the readings with gravity filtered out I wasn't able to get a high enough level of accuracy. (It happened that this particular project did not need it)

Unity3D - OCR Number Recognition

Our initial use case called for writing an application in Unity3D (write solely in C# and deploy to both iOS and Android simultaneously) that allowed a mobile phone user to hold their camera up to the title of a magazine article, use OCR to read the title, and then we would process that title on the backend to get related stories. Vuforia was far and away the best for this use case because of its fast native character recognition.
After the initial application was demoed a bit, more potential uses came up. Any use case that needed solely A-z characters recognized was easy in Vuforia, but the second it called for number recognition we had to look elsewhere because Vuforia does not support number recognition (now or anywhere in the near future).
Attempted Workarounds:
Google Cloud Vision - works great, but not native and camera images are sometime quite large, so not nearly as fast as we require. Even thought about using the OpenCV Unity asset to identify the numbers and then send multiple much smaller API calls, but still not native and one extra step.
Following instructions from SO to use a .Net wrapper for Tesseract - would probably work great, but after building and trying to bring the external dlls into Unity I receive this error .Net Assembly Not Found (most likely an issue with the version of .Net the dlls were compiled in).
Install Tesseract from source on a server and then create our own API - honestly unclear why we tried this when Google's works so well and is actively maintained.
Has anyone run into this same problem in Unity and ultimately found a good solution?
Vuforia on itself doesn't provide any system to detect numbers, just letters. To solve this problem I followed the next strategy (just for numbers near of a common image):
Recognize the image.
Capture a Screenshot just after the target image is recognized (this screenshot must contain the numbers).
Send the Screenshot to an OCR web-service and get the response.
Extract the numbers from the response.
Use these numbers to do whatever you need and show AR info.
This approach solves this problem, but it doesn't work like a charm. Their success depends on the quality of the screenshot and the OCR service.

Direct2D versus Direct3D for digital video rendering

I need to render video from multiple IP cameras into several controls within the client application.
On top of the video, I should be able to add some OSD such as timestamp and camera name.
What I'm trying to do has nothing to do with 3D since we're talking about digital video with some text on it.
Which API is more suitable for this purpose? Direct3D or Direct2D?
Performance should also be a consideration here.
It used to be that Direct2D was a poor choice for Windows Phone (if you care about that system) because it wasn't supported, but Win Phone 8.1 has it now, so less of an issue.
My experience with D2D was that it offered fast, high quality 2D rendering, and I would say it is a good choice.
You might want to take a look at this article on Code Project. That looks appropriate for your purposes.
If you are certain you only need MS system support, then you're all set.
Another way to go would be a cross platform system like nanovg, which offers nice 2D rendering and would work on a Mac. Of course, you'd need to figure out how to do the video part on non windows systems.
Regarding D3D, you could certainly do it that way, but my guess would be it would make some things trickier to do. Don't forget you can combine the two as well...

Programming screen recorder - output issues

I want record screen (by capturing 15 screenshots per second). This part I know how to do. But I don't know how to write this to some popular video format. Best option which I found is write frames to separated PNG files and use commandline Mencoder which can convert them to many output formats. But maybe someone have another idea?
Requirements:
Must be multi-platform solutions (I'm using Free Pascal / Lazarus). Windows, Linux, MacOS
Exists some librarys for that?
Could be complex commandline application which record screen for me too, but I must have possibility to edit frames before converting whole raw data to popular video format
All materials which could give me some idea are appreciated. API, librarys, anything even in other languages than FPC (I would try rewrite it or find some equivalent)
I considered also writting frames to video RAW format and then use Mencoder (he can handle it) or other solution, but can't find any API/doc for video RAW data
Regards
Argalatyr mentioned ffmpeg already.
There are two ways that you can get that to work:
By spawning an new process. All you have to do is prepare the right input (could be a series of jpeg images for example), and the right commandline parameters. After that you just call ffmpeg.exe and wait for it to finish.
ffmpeg makes use of some dll's that do the actual work. You can use those dll's directly from within your Delphi application. It's a bit more work, because it's more low-level, but in the end it'll give you a finer control over what happens, and what you show the user while you're processing.
Here are some solutions to check out:
FFVCL Commercial. Actually looks quite good, but I was too greedy to spend money on this.
Open Source Delphi headers for FFMpeg. I've tried it, but I never managed to get it to work.
I ended up pulling the DLL wrappers from an open source karaoke program (UltraStar Deluxe). I had to remove some dependencies, but in the end it worked like a charm. The relevant (pascal) code can be found here:
http://ultrastardx.svn.sourceforge.net/viewvc/ultrastardx/trunk/src/lib/ffmpeg-0.10/
There was some earlier discussion with a Delphi component here. It's a very simple component that sometimes generates some weird movies. Maybe a start.

Automated Webcam Application / Hardware Problems

I am starting to develop an automated webcam application. The goal is to automatically take pictures, do some image processing and then upload the results to a FTP site. All of these tasks seem simple.
However, I am having a hard time to find a decent camera. I don't want to use a simple webcam or hd-webcam because the image quality of still frames isn't very good.
I'm also having a hard time finding an affordable digital camera supporting USB snapshot or control.
My second concern is the development itself. I'm not quite sure which programming language to use. I have experience with AS3, Processing, Java and some simple C++ and Open CV.
Do you have a clue?
Regarding the camera, There are pretty good webcams that you can find, some with HD quality. look at the cameras on Logitech (I tested their API and it is quite good), A HD camera has a retail of $99 which is very cheap. If you are looking for something better I would go with Nikon as they also have a pretty good API for C#/C++. You can get a basic SLR with simple 28mm lens for $500. Don't use a PowerShot as Nikon stopped supporting their API. Whatever camera you decide to buy make sure a proper API is available, is being maintained and free.
Regarding development, I would go with C#/Java as they are easier than C++. There are quite allot of libraries for image processing for C#/Java, just make sure that the Camera comes with an API the fits your chosen language.
Good luck.
Generally (from experience) most USB cameras that show up as an imaging device through Windows can be used with JAI [Java Advanced Imaging]. Additionally [on the .net/c++ side], the same cameras can be used through DirectShow as a capture device. Java/C# will make development easier but expect to loose some performance [even with the best of optimizations]. Additionally you can only perform upto the speed of the camera and the data line running from the camera to the computer [USB1.0 will seriously limit a decent framerate]
first get the image in RAM:
If you are using CHDK, I suggest you get the image copied from camera memory to RAM by using supported scripting languages by CHDK - you can take help from the CHDK forum http://chdk.setepontos.com/index.php for this.
or if thats difficult you can continuously copy the image to hard disk and load in RAM from there. (you need to take care (delete) of massive images accumulated on hard disk in a short period of time !)
This sounds like a 'brute force' approach, but will get your work going while you are researching correct approach.
perform image processing:
once the image is in RAM, you can apply your image processing algorithms as usual e.g. using opencv library.
hope this helps you

Resources