Currently I have a system of ML models that run in their own processes in Python. It works perfectly with a single video camera feed as input, but now I need to feed video from multiple sources, and I only have the resources to run one instance of the model. I tried batch-processing the video streams, but it does not really scale beyond 5 cameras. Is there a Python framework or pipeline that could be helpful? Please suggest.
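For concreteness, here is a minimal sketch of the producer/consumer layout this describes (several capture processes feeding one batching inference process), assuming OpenCV captures; the RTSP URLs are placeholders and run_model is a stub standing in for the real batched model call:

    import multiprocessing as mp

    import cv2

    def grab_frames(source, queue):
        """One producer per camera: read frames and tag them with their source."""
        cap = cv2.VideoCapture(source)
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            queue.put((source, frame))
        cap.release()

    def run_model(frames):
        """Stand-in for the real batched inference call; returns frame shapes here."""
        return [f.shape for f in frames]

    def infer_batches(queue, batch_size=8):
        """Single consumer: drain the queue into batches for one model instance."""
        while True:
            batch = [queue.get() for _ in range(batch_size)]
            sources, frames = zip(*batch)
            for src, result in zip(sources, run_model(frames)):
                print(src, result)  # replace with your post-processing

    if __name__ == "__main__":
        cameras = ["rtsp://cam1/stream", "rtsp://cam2/stream"]  # placeholders
        q = mp.Queue(maxsize=64)
        for cam in cameras:
            mp.Process(target=grab_frames, args=(cam, q), daemon=True).start()
        infer_batches(q)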
Related
What is the best way to mix data and video for machine learning on a single-board computer like a Raspberry Pi?
This should be a very common problem. I found that there are GPMF and KLV extensions for video formats, but I have failed to find an implementation of them for the Raspberry Pi, OpenCV, etc.
It seems I can solve the problem by mixing the two data streams into a single file and then reading that file with my own code, but I wonder what the canonical solution for this kind of problem is.
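A minimal sketch of that single-file workaround (a custom pickle stream, not GPMF/KLV): JPEG-encoded frames are interleaved with their sensor records and read back together in order:

    import pickle

    import cv2
    import numpy as np

    def write_mixed(path, frames_and_records):
        """Append (JPEG frame, sensor record) pairs into one pickle stream."""
        with open(path, "wb") as f:
            for frame, record in frames_and_records:
                ok, jpeg = cv2.imencode(".jpg", frame)
                pickle.dump((jpeg.tobytes(), record), f)

    def read_mixed(path):
        """Read the pairs back in order, decoding frames on the fly."""
        with open(path, "rb") as f:
            while True:
                try:
                    jpeg, record = pickle.load(f)
                except EOFError:
                    return
                frame = cv2.imdecode(np.frombuffer(jpeg, np.uint8),
                                     cv2.IMREAD_COLOR)
                yield frame, record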
I am looking to host 5 deep learning models where data preprocessing/postprocessing is required.
It seems straightforward to host each model using TF Serving (and Kubernetes to manage the containers), but if that is the case, where should the data pre- and post-processing take place?
I'm not sure there's a single definitive answer to this question, but I've had good luck deploying models at scale by bundling the data pre- and post-processing code into fairly vanilla Go or Python (e.g., Flask) applications that are connected to my persistent storage for other operations.
For instance, to take the movie recommendation example, on the predict route it's pretty performant to pull the 100 films a user has watched from the database, dump them into a NumPy array of the appropriate size and encoding, dispatch that to the TensorFlow Serving container, and then do minimal post-processing (like pulling the movie name, description, and cast from a different part of the persistent storage layer) before returning.
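A minimal Flask sketch of such a predict route, assuming a TF Serving container at a placeholder URL; fetch_watched_ids and fetch_movie_metadata are hypothetical stand-ins for the real persistent-storage lookups:

    import numpy as np
    import requests
    from flask import Flask, jsonify

    app = Flask(__name__)
    SERVING_URL = "http://tf-serving:8501/v1/models/recommender:predict"  # placeholder

    def fetch_watched_ids(user_id, limit=100):
        # Hypothetical lookup against the persistent storage layer.
        return list(range(limit))

    def fetch_movie_metadata(movie_ids):
        # Hypothetical metadata lookup (name, description, cast).
        return [{"id": m} for m in movie_ids]

    @app.route("/predict/<int:user_id>")
    def predict(user_id):
        watched = fetch_watched_ids(user_id)
        # Encode the history as the model expects and dispatch to TF Serving.
        instances = np.asarray(watched, dtype=np.int64).reshape(1, -1).tolist()
        response = requests.post(SERVING_URL, json={"instances": instances})
        scores = response.json()["predictions"][0]
        top = np.argsort(scores)[::-1][:10]
        return jsonify(fetch_movie_metadata([int(i) for i in top]))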
In addition to josephkibe's answer, you can:
Implement the processing in the model itself (see signatures for Keras models and input receivers for Estimators in the SavedModel guide); a minimal sketch follows this list.
Install Seldon Core. It is a whole framework for serving that handles building images and networking. It builds the service as a graph of pods with different APIs, one of which is a transformer that pre-/post-processes data.
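A minimal sketch of the first option, assuming TF 2.x: a custom serving signature bakes JPEG decoding and resizing into the exported SavedModel, so the serving container receives raw image bytes. The toy model and export path are placeholders:

    import tensorflow as tf

    # Toy stand-in for the real model.
    model = tf.keras.Sequential([
        tf.keras.layers.GlobalAveragePooling2D(input_shape=(224, 224, 3)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    @tf.function(input_signature=[tf.TensorSpec([None], tf.string)])
    def serve_raw_jpeg(jpeg_batch):
        def decode(raw):
            img = tf.io.decode_jpeg(raw, channels=3)
            return tf.image.resize(img, (224, 224)) / 255.0  # preprocessing
        images = tf.map_fn(decode, jpeg_batch, fn_output_signature=tf.float32)
        return {"probabilities": model(images)}

    # The exported model now accepts raw JPEG bytes, so preprocessing
    # happens inside the serving container rather than in a client app.
    tf.saved_model.save(model, "export/1",
                        signatures={"serving_default": serve_raw_jpeg})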
There are certain machine learning algorithms in use that take video files as input. If I have to pull all the videos from YouTube that are associated with a certain tag and provide them as input to such an algorithm, what should my input format be?
There is no format in which you can pass a raw video to a machine learning algorithm, since the algorithm won't understand the contents of the video.
You need to preprocess the video first, and how depends on what you have to use it for. In general, you can do something like converting each frame of the video to CSV (the same as preprocessing an image), which you can then pass to your machine learning algorithm. If you want to process the frames sequentially, you may want to use a recurrent neural network. Also, if the video has audio, extract its audio time series and combine each part of the time series with its corresponding video frame.
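A minimal sketch of that per-frame preprocessing, assuming OpenCV: each frame is grayscaled, resized, and flattened into one row of a matrix, which could then be written out as CSV. The 64x64 size is an arbitrary assumption:

    import cv2
    import numpy as np

    def video_to_frame_matrix(path, size=(64, 64)):
        """Decode a video into a matrix with one flattened grayscale frame per row."""
        cap = cv2.VideoCapture(path)
        rows = []
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            rows.append(cv2.resize(gray, size).flatten())
        cap.release()
        return np.asarray(rows)  # shape: (num_frames, 64 * 64)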
I have retrained the TensorFlow Inception image classification model on my own collected dataset, and it is working fine. Now I want to make a continuous image classifier on a live camera video. I have a Raspberry Pi camera for input.
Here's the Google I/O 2017 link (https://www.youtube.com/watch?v=ZvccLwsMIWg&index=18&list=PLOU2XLYxmsIJqntMn36kS3y_lxKmQiaAS). I want to do the same as what is shown at 3:20 in the video.
Is there any tutorial to achieve this?
Step one
Put your TensorFlow model aside for this first step. Follow one of the many tutorials online, like this one, that show how to capture an image from your Raspberry Pi.
You should be able to prove to yourself that your code works by displaying the images on a device or FTPing them to another computer that has a screen.
You should also benchmark the rate at which you can capture images; it should be about 5 per second or faster.
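A minimal sketch of this step, assuming the camera appears as /dev/video0 (e.g. through the V4L2 driver) and OpenCV is installed; it captures frames and reports the rate:

    import time

    import cv2

    cap = cv2.VideoCapture(0)  # Pi camera via V4L2, or a USB webcam
    start, count = time.time(), 0
    while count < 50:
        ok, frame = cap.read()
        if not ok:
            break
        count += 1
    cap.release()

    elapsed = time.time() - start
    print("captured {} frames in {:.1f}s ({:.1f} fps)".format(
        count, elapsed, count / elapsed))  # aim for ~5 fps or faster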
Step two
Look up and integrate image resizing as needed; Google and Stack Overflow are great places to search for how to do that. Again, verify that you can resize the image to exactly the input size your TensorFlow model needs.
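A minimal resizing sketch with OpenCV; 299x299 is an assumption based on Inception v3, so check what your retrained model actually expects:

    import cv2

    frame = cv2.imread("frame.jpg")  # one of the captured images
    resized = cv2.resize(frame, (299, 299), interpolation=cv2.INTER_AREA)
    cv2.imwrite("frame_resized.jpg", resized)
    print(resized.shape)  # should match the model's expected input exactly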
Step three
Copy some of the images over to your dev environment and verify that the model handles them as is.
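A minimal verification sketch, assuming TF 1.x and the retrained_graph.pb / retrained_labels.txt files produced by the TensorFlow image-retraining tutorial (the tensor names final_result:0 and DecodeJpeg/contents:0 are that tutorial's defaults):

    import numpy as np
    import tensorflow as tf

    # Load the frozen retrained graph.
    graph_def = tf.GraphDef()
    with tf.gfile.GFile("retrained_graph.pb", "rb") as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

    # Classify one of the images copied over from the Pi.
    image_data = tf.gfile.GFile("frame_resized.jpg", "rb").read()
    with tf.Session() as sess:
        probs = sess.run("final_result:0",
                         {"DecodeJpeg/contents:0": image_data})[0]

    labels = [line.strip() for line in open("retrained_labels.txt")]
    print(labels[int(np.argmax(probs))])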
Step four
FTP your trained TensorFlow model to the Pi and install the supporting libraries. Integrate the pieces into one codebase and turn it on.
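A minimal sketch of the integrated loop, reusing the same assumed graph files and tensor names as in step three; frames are JPEG-encoded so the graph's own decode path can consume them:

    import cv2
    import numpy as np
    import tensorflow as tf

    # Load the retrained graph once at startup (TF 1.x, same files as step three).
    graph_def = tf.GraphDef()
    with tf.gfile.GFile("retrained_graph.pb", "rb") as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")
    labels = [line.strip() for line in open("retrained_labels.txt")]

    cap = cv2.VideoCapture(0)
    with tf.Session() as sess:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # Resize first (step two) to keep the JPEG small; the graph
            # resizes again internally, so this is only an optimization.
            small = cv2.resize(frame, (299, 299))
            ok, jpeg = cv2.imencode(".jpg", small)
            probs = sess.run("final_result:0",
                             {"DecodeJpeg/contents:0": jpeg.tobytes()})[0]
            print(labels[int(np.argmax(probs))])
    cap.release()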
I am trying to capture online streamed content and process it image by image. I have APIs written for images in OpenCV in Python 2.7. I am just trying to extend this and explore different possibilities (and of course choose the best method) for capturing and processing these online video streams. Can this be done in OpenCV? If not (or if there is something simpler), is there any other alternative (a Python alternative is highly preferred)?
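A minimal sketch of how this can be done in OpenCV: cv2.VideoCapture accepts network stream URLs directly (via its FFmpeg backend); the URL below is a placeholder, and process_frame stands in for the existing per-image APIs:

    import cv2

    def process_frame(frame):
        # Stand-in for the existing per-image OpenCV APIs from the question.
        print(frame.shape)

    # Placeholder URL; rtsp:// sources work the same way.
    cap = cv2.VideoCapture("http://example.com/live/stream.m3u8")
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        process_frame(frame)
    cap.release()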