Overlay image on a moving object in video (Augmented Reality / OpenCV) - opencv

I am using FFmpeg to overlay an image/emoji on a video with this command:
"-i "+inputfilePath+" -filter_complex "+"[0][1]overlay=enable='between(t,"+startTime+","+endTime+")'[v1]"+" -map [v1] -map 0:a "+OutputfilePath;
But the above command only overlays the image on the video, and the image stays still.
Instagram and Snapchat have a new "pin" feature. I want exactly the same thing, e.g. a blur that follows moving faces, or as in the videos below -
Here is the link.
Is it possible via FFmpeg?
I think someone with OpenCV or Augmented Reality knowledge can help with this. It is quite similar to AR, as we need to move/zoom the emoji exactly where we want it on the video/live cam.

Based on the overlay filter specification:
https://ffmpeg.org/ffmpeg-filters.html#overlay-1
when you specify a time interval, the filter is applied only during that interval:
For example, to enable a blur filter (smartblur) from 10 seconds to 3 minutes:
smartblur = enable='between(t,10,3*60)'
What you need to do is overlay an image at specific coordinates; for example, the following overlays at a fixed x and y:
ffmpeg -i rtsp://[host]:[port] -i x.png -filter_complex 'overlay=10:main_h-overlay_h-10' http://[host]:[port]/output.ogg
Now the idea is to calculate those coordinates based on the current frame of the video and force the filter to use the updated coordinates on every frame.
For example based on time:
FFmpeg move overlay from one pixel coordinate to another
ffmpeg -i bg.mp4 -i fg.mkv -filter_complex \
"[0:v][1:v]overlay=enable='between=(t,10,20)':x=720+t*28:y=t*10[out]" \
-map "[out]" output.mkv
Or using some other expressions:
http://ffmpeg.org/ffmpeg-utils.html#Expression-Evaluation
Unfortunately, with these limited expressions you first have to find a formula for x and y that describes the motion (the cat moving its head, the pen drawing, etc.). It can be a linear, trigonometric or other dependency on time:
x=sin(t)
With free-form movement this is not always possible.
To locate an object's coordinates more precisely and overlay something on it, it should be possible to write your own filter (FFmpeg is open source) similar to overlay:
https://github.com/FFmpeg/FFmpeg/blob/master/libavfilter/vf_overlay.c
Calculate x and y either from an external file (where you can dump x and y for every timestamp if it is a pre-recorded video) or by doing some image processing to find the specific region.
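As a rough illustration of the image-processing route, here is a minimal Python/OpenCV sketch (assuming opencv-contrib-python is installed for the tracker, and using hypothetical file names) that tracks a user-selected region and dumps per-frame x/y coordinates; that dump can then be turned into an overlay x/y expression or an FFmpeg sendcmd script:
import csv
import cv2

cap = cv2.VideoCapture("input.mp4")          # hypothetical input video
ok, frame = cap.read()

# Draw a box around the face/object that the emoji should follow.
box = cv2.selectROI("select object", frame, showCrosshair=False)
tracker = cv2.TrackerCSRT_create()           # cv2.legacy.TrackerCSRT_create() on some versions
tracker.init(frame, box)

fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
positions = []                               # (time in seconds, x, y) per frame
frame_idx = 1
while True:
    ok, frame = cap.read()
    if not ok:
        break
    ok, (x, y, w, h) = tracker.update(frame)
    if ok:
        positions.append((frame_idx / fps, int(x), int(y)))
    frame_idx += 1

# Dump the coordinates; post-process this into an overlay expression
# or an FFmpeg sendcmd file that updates the overlay's x/y over time.
with open("positions.csv", "w", newline="") as f:
    csv.writer(f).writerows(positions)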
Hopefully it will give you an idea and direction to move to.
It's very interesting feature.

Related

Combing effect while reading interlaced video

Hello all. I have a very strange issue reading video with VideoCapture in OpenCV.
I have .MTS videos (MPEG2), and I read them in OpenCV using the following code:
import cv2

cv2.namedWindow("frame", cv2.WINDOW_NORMAL)
cap = cv2.VideoCapture("00030.MTS")
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow("frame", frame)
    cv2.waitKey(0)
This shows me corrupted frames (with bands on them). The quality is the same if I save a frame as an image and look at it outside OpenCV.
But this is how it should look:
I've never seen this before while working with .avi or .mp4
How can I get the same not-corrupted frames like in the media player?
(I edited the title of the question. Some of this information wasn't apparent originally.)
Your file names suggest that this video material is a video camera's own recording, with no alterations to it.
Your video seems to be "interlaced", i.e. not "progressive". Interlaced video consists of "fields" instead of complete frames. A field contains all the even or odd lines of an image. Even and odd fields follow each other. With that method, you can have "50i" video that updates at 50 Hz, yet requires only half the data of a full "50p" video (half the data means reduced vertical resolution).
OpenCV (probably) uses ffmpeg to read your video file.
Both ffmpeg and VLC know when a video file contains interlaced video data. VLC automatically applies a suitable filter to make this data nicer to look at. ffmpeg does not, because filtering costs processing time and changes the data.
You should use ffmpeg to "de-interlace" your video files. I would suggest the yadif filter.
Example:
ffmpeg -i 00001.MTS -c copy -vf yadif -c:v libx264 -b:v 24M 00001_deinterlaced.MTS
You should look at the settings/options of yadif. Maybe the defaults aren't to your liking. Try yadif=1 to get field-rate progressive video (50/60p).
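If you would rather drive this from the same Python script that does the OpenCV processing, a minimal sketch (assuming ffmpeg is on your PATH; the output name is hypothetical) could shell out to ffmpeg with yadif=1 and then open the deinterlaced result:
import subprocess
import cv2

src = "00030.MTS"
dst = "00030_deinterlaced.mp4"   # hypothetical output name

# yadif=1 produces one progressive frame per field (e.g. 50i -> 50p).
subprocess.run([
    "ffmpeg", "-y", "-i", src,
    "-vf", "yadif=1",
    "-c:v", "libx264", "-crf", "18",
    "-c:a", "copy",
    dst,
], check=True)

cap = cv2.VideoCapture(dst)      # frames read from here should be free of combing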

how to add beat and bass effect in video using ffmpeg command?

I want a beat effect on a video and I am using an FFmpeg command for it. I used the command below, expecting black-and-white and original colour alternating every 2 seconds in a loop, but it does not work; it only creates a black-and-white video: ffmpeg -i addition.mp4 -vf hue=s=0 output.mp4
So please suggest a solution.
I want to make a video like youtube.com/watch?v=7fG7TVKGcqI. Please advise.
Thanks in advance
ffmpeg -i addition.mp4 -vf hue=s=0 output.mp4 will, as you said, just create a black-and-white video. -vf applies video filters, and hue=s=0 sets the saturation to 0.
As far as I know, this kind of effect is too advanced for a command-line application unless you already have a lot of knowledge of it. I'd recommend using a graphical video editor. I use Shotcut and I like it, but I'm not sure whether you can do this in it.
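That said, if all you need is the simple periodic toggle from the question (black-and-white for 2 seconds, then original colour for 2 seconds, rather than a true beat-synced effect), FFmpeg's timeline editing can switch the hue filter on and off on a schedule. A minimal sketch, assuming ffmpeg is on your PATH and using the question's file names:
import subprocess

# Desaturate only while mod(t,4) < 2: 2 s black-and-white, 2 s colour, repeating.
subprocess.run([
    "ffmpeg", "-y", "-i", "addition.mp4",
    "-vf", "hue=s=0:enable='lt(mod(t,4),2)'",
    "-c:a", "copy",
    "output.mp4",
], check=True)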

FFmpeg convert video to images with complex logic

I'm trying to use FFmpeg to implement some complex logic on my videos.
The business logic is the following:
I get videos in the formats avi, mp4 and mov.
I don't know in advance what the content of a video is. It can be anywhere from 1 MB to 5 GB.
I want to output a list of images from the video at the highest quality I can get, capturing only frames that change significantly from the previous frame (a new person, a new angle, big movement, etc.).
In addition, I want to limit the number of frames per second, so that even if the video is dramatically fast and changes all the time, it will not produce more frames per second than this parameter.
I'm using now the following command:
./ffmpeg -i "/tmp/input/fast_movies/3_1.mp4" -vf fps=3,mpdecimate=hi=14720:lo=7040:frac=0.5 -vsync 0 -s hd720 "/tmp/output/fast_movies/(#%04d).png"
As I understand it, this does the following:
fps=3 - first reduce the video to 3 frames per second (that is the limit I talked about)
mpdecimate - drop frames whose change from the previous frame does not exceed the thresholds I set.
-vsync 0 - pass frame timestamps through unchanged. I'm not sure why, but without it hundreds of duplicate frames are produced, ignoring the fps and mpdecimate filters. Can someone explain?
-s hd720 - set the output size to 1280x720.
It works pretty well, but I'm not so happy with the quality. Do you think I'm missing something? Is there any FFmpeg parameter I should be using instead of these?
You can set the frame quality by appending -qscale:v 1 to your command.
qscale stands for quality scale, v stands for video, and the range is 1 to 31, with 1 being the highest quality and 31 the lowest.
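A sketch of the full command with the flag added (note that -qscale:v only affects lossy encoders such as JPEG, so this example writes .jpg frames; PNG output is already lossless):
import subprocess

subprocess.run([
    "ffmpeg", "-i", "/tmp/input/fast_movies/3_1.mp4",
    "-vf", "fps=3,mpdecimate=hi=14720:lo=7040:frac=0.5",
    "-vsync", "0",
    "-qscale:v", "1",          # 1 = best quality for the JPEG encoder
    "-s", "hd720",
    "/tmp/output/fast_movies/(#%04d).jpg",
], check=True)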

OpenCV + Linux + badly supported camera = wrong pixel format

I'm trying to grab frames from a web cam using OpenCV. I also tried 'cheese'. Both give me a pretty weird picture: distorted, wrong colors. Using mplayer I was able to figure out the correct codec "yuy2". Even mplayer sometimes would select the wrong codec ("yuv"), which makes it look just like using OpenCV / cheese to capture an image.
Can I somehow tell OpenCV which codec to use?
Thanks!
In the latest version of OpenCV you can set the capture format from the camera with the same FourCC-style code you would use for video. See http://docs.opencv.org/modules/highgui/doc/reading_and_writing_images_and_video.html#videocapture
It may still take a bit of trial and error; terms like YUV, YUYV and YUY2 are used a bit loosely by the camera maker, the driver maker, the operating system, the DirectShow layer and OpenCV!
OpenCV automatically selects the first available capture backend (see here). It can be that it is not using V4L2 automatically.
Also set both -D WITH_V4L=ON and -D WITH_LIBV4L=ON if building from source.
In order to set the pixel format to be used, set the CAP_PROP_FOURCC property of the capture:
import cv2

cam_id = 0  # index of your camera
capture = cv2.VideoCapture(cam_id, cv2.CAP_V4L2)
capture.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'))
width = 1920
height = 1080
capture.set(cv2.CAP_PROP_FRAME_WIDTH, width)
capture.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
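To confirm which pixel format the backend actually accepted, you can read the property back and decode the four-character code (continuing from the snippet above):
# CAP_PROP_FOURCC comes back as a float; unpack it into its four characters.
fourcc = int(capture.get(cv2.CAP_PROP_FOURCC))
code = "".join(chr((fourcc >> (8 * i)) & 0xFF) for i in range(4))
print("active pixel format:", code)   # e.g. 'MJPG' or 'YUYV'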

How to recognize Text-Presence pattern in a scanned image and crop it?

Smart Cropping for Scanned Docs
Recently I took over a preservation project of old books/manuscripts. They are huge in quantity, almost 10,000 pages. I had to scan them manually with a portable scanner as they were not in a condition to be scanned in an automated book scanner.
The real problem shows up when I start editing them in Photoshop. Note that all of them are basically documents (in JPG format) and there are absolutely no pictures in those documents. They are in a language (Oriya) for which I am sure there won't be any OCR software available in the near future. (If there is, please let me know.)
To make those images (docs) look clean and elegant I have to crop them, position them, increase the contrast a bit, clean unnecessary spots with the eraser, et cetera. I was able to automate most of these steps in Photoshop, but cropping is the point where I am getting stuck. I can't automate cropping because the software can't recognize the presence of text or content in a certain area of the image (doc); it just applies the fixed values given to it for cropping.
I want a solution to automate this cropping process. I have an idea for it, but I don't know if it's practical to implement, and as far as I know there's no software on the market that does this kind of thing.
The possible solution: a tool that recognizes the presence of text in an image (that shouldn't be hard, as all of them are plain document images with no pictures or patterns, just text) and crops right up to the border of that text on each side, so it outputs a document image without any margin. After this the rest of the tasks can be automated with Photoshop, such as adding white space for margins and tweaking the contrast and colour to make it more readable, etc.
Here is an album link to the gallery. I can post more sample images if it would be useful - just let me know.
http://imageshack.us/g/1/9800204/
Here is one example from the bigger sample of images available through above link:
Using the sample from tinypic,
with ImageMagick I'd construct an algorithm along the following lines:
Contrast-stretch the original image
Values of 1% for the black-point and 10% for the white-point seem about right.
Command:
convert \
http://i46.tinypic.com/21lppac.jpg \
-contrast-stretch 1%x10% \
contrast-stretched.jpg
Result:
Shave off some border pixels to get rid of the dark scanning artefacts there
A value of 30 pixels on each edge seems about right.
Command:
convert \
contrast-stretched.jpg \
-shave 30x30 \
shaved.jpg
Result:
De-speckle the image
No further parameters here. Repeat the process three times for better results.
Command:
convert \
shaved.jpg \
-despeckle \
-despeckle \
-despeckle \
despeckled.jpg
Result:
Apply a threshold to make all pixels either black or white
A value of roughly 50% seems about right.
Command:
convert \
despeckled.jpg \
-threshold 50% \
b+w.jpg
Result:
Re-add the shaved-off pixels
Running identify -format '%Wx%H' 21lppac.jpg established that the original image had dimensions of 1536x835 pixels.
Command:
convert \
b+w.jpg \
-gravity center \
-extent 1536x835 \
big-b+w.jpg
Result:
(Note, this step is only optional. Its purpose is to get back to the original image dimensions, which you may want in case you go on from here and overlay the result with the original, or whatever...)
De-Skew the image
A threshold of 40% (the default) seems to work here too.
Command:
convert \
big-b+w.jpg \
-deskew 40% \
deskewed.jpg
Result:
Remove from each edge all rows and columns of pixels which are purely white
This can be achieved by simply using the -trim operator.
Command:
convert \
deskewed.jpg \
-trim \
trimmed.jpg
Result:
As you can see, the result is not yet perfect:
there remain some random artefacts on the bottom edge of the image, and
the final trimming didn't remove all white-space from the edges because of other minimal artifacts;
also, I didn't (yet) attempt to apply a distortion correction to the image in order to fix (some of) the distortion. (You can get an idea about what it could achieve by looking at this answer to "Understanding Perspective Projection Distortion ImageMagick".)
Of course, you can easily achieve even better results by playing with a few of the parameters used in each step.
And of course, you can easily automate this process by putting each command into a shell or batch script.
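For example, here is a minimal Python sketch that chains the steps (assuming ImageMagick's convert is on your PATH and a local copy of the sample image; the optional -extent step is omitted):
import subprocess

def convert(*args):
    # Thin wrapper around ImageMagick's convert command.
    subprocess.run(["convert", *args], check=True)

convert("21lppac.jpg", "-contrast-stretch", "1%x10%", "contrast-stretched.jpg")
convert("contrast-stretched.jpg", "-shave", "30x30", "shaved.jpg")
convert("shaved.jpg", "-despeckle", "-despeckle", "-despeckle", "despeckled.jpg")
convert("despeckled.jpg", "-threshold", "50%", "b+w.jpg")
convert("b+w.jpg", "-deskew", "40%", "deskewed.jpg")
convert("deskewed.jpg", "-trim", "trimmed.jpg")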
Update
Ok, so here is a distortion to roughly rectify the deformation.
Command:
convert \
trimmed.jpg \
-distort perspective '0,0 0,0 1300,0 1300,0 0,720 0,720 1300,720 1300,770' \
distort.jpg
Result: (once more with the original underneath, to make direct visual comparison easier)
There is still some barrel-like distortion in the image, which can probably be removed by applying the -distort BarrelInverse operator -- we'd just need to find the fitting parameters.
We addressed many "smart cropping" issues in our open-source DjVu->PDF converter. The converter also allows you to load a set of scanned images instead of DjVu (just press SHIFT with Open command) and output a resulting set of images instead of PDF.
It is a free cross-platform GUI tool, written in Java.
One technique to segment text from the background is the Stroke Width Transform. You'll find several posts here on Stack Overflow about it, including this one:
Stroke Width Transform (SWT) implementation (Java, C#...)
If the text shown in the Wikipedia page is representative of written Oriya, then I'm confident that the SWT (or your customized version of it) will perform well. You may still have to do some manual tweaking after you review an image, but an SWT-based method should do a lot of the work for you.
Although the SWT may not identify every single stroke, it should give you a good estimate of the dimensions of the space occupied by strokes (and characters). The simplest method would then be to take the bounding box of all detected strokes (plus a small margin) and crop to it, as sketched below.
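Here is a minimal sketch of that bounding-box idea (not an SWT implementation; just a crude crop around the dark "ink" pixels using OpenCV, with hypothetical file names):
import cv2

img = cv2.imread("scan.jpg", cv2.IMREAD_GRAYSCALE)

# Treat sufficiently dark pixels as text/ink; Otsu picks the threshold automatically.
_, ink = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Remove small speckles so stray dots don't inflate the bounding box.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
ink = cv2.morphologyEx(ink, cv2.MORPH_OPEN, kernel)

coords = cv2.findNonZero(ink)            # coordinates of all remaining ink pixels
x, y, w, h = cv2.boundingRect(coords)    # tight box around the text block

margin = 10                              # small safety margin in pixels
x0, y0 = max(x - margin, 0), max(y - margin, 0)
cropped = img[y0:y + h + margin, x0:x + w + margin]
cv2.imwrite("scan_cropped.jpg", cropped)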
A newish class of algorithms that might work for you is "content-aware resizing", such as "seam carving", which automatically removes paths of pixels of low information content (e.g. background pixels). Here's a video about seam carving:
http://www.youtube.com/watch?v=qadw0BRKeMk
There's a seam carving plugin ("liquid resizing") for GIMP:
http://liquidrescale.wikidot.com/
This blog post reports a plugin for Photoshop:
http://wordpress.brainfight.com/195/photoshop-cs5-content-aware-aka-seam-carving-aka-liquid-resize-fun-marketing/
For an overview of OCR techniques, I recommend the book Character Recognition Systems by Cheriet, Kharma, Liu, and Suen. The references in that book could keep you busy for quite some time.
http://www.amazon.com/Character-Recognition-Systems-Students-Practitioners/dp/0471415707
Finally, consider joining the Optical Character Recognition group on LinkedIn to post more specific questions. There are academics, researchers, and engineers in the industry who can answer questions in great detail, and you might also be able to make contact via email with researchers in India who are developing OCR for languages similar to Oriya, though they may not have published the software yet.
