Scene Boundary Detection - machine-learning

I'm working on scene boundary detection; here is the work I have done so far.
Conventional approach: I use SSIM and the Pearson correlation coefficient to compare the similarity of consecutive frames (a minimal sketch follows the list below).
Advantages
Works well on abrupt cut transitions
Fast to execute
Disadvantages
Fails to detect gradual transitions (fade-in, fade-out, wipe, dissolve, fast blur)
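A minimal sketch of this frame-comparison step, assuming scikit-image and SciPy are available (the library choice and the threshold values are illustrative, not the original pipeline's parameters):

    import cv2
    import numpy as np
    from scipy.stats import pearsonr
    from skimage.metrics import structural_similarity as ssim

    def frame_similarity(frame_a, frame_b):
        """Return (SSIM, Pearson r) between two BGR frames."""
        gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
        gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
        s = ssim(gray_a, gray_b)
        r, _ = pearsonr(gray_a.ravel().astype(np.float64),
                        gray_b.ravel().astype(np.float64))
        return s, r

    def is_abrupt_cut(frame_a, frame_b, ssim_thr=0.4, pearson_thr=0.5):
        # Hypothetical thresholds: a hard cut usually drops both scores sharply,
        # while a gradual transition changes them only a little per frame.
        s, r = frame_similarity(frame_a, frame_b)
        return s < ssim_thr and r < pearson_thr

This per-pair comparison is why the approach is fast on hard cuts but misses gradual transitions: across a dissolve, adjacent frames stay highly similar.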
DeepSBD: Next, I implemented deepSBD. The code and paper are linked at https://github.com/abramjos/Scene-boundary-detection
Advantages
Works well on abrupt cuts and gradual transitions (fade-in, fade-out, wipe, dissolve)
Disadvantages
Flashes were added manually to the training data, so false detections increase when an intense flash appears between frames, e.g. in wrestling videos or when sunlight suddenly changes within a shot
Fast motion blur
Long dissolve transitions
TransNet: After that we implemented TransNet. The code and paper are linked at https://github.com/soCzech/TransNet
It replaces some of the wrong detections with correct detections of gradual transitions, but introduces new problems:
Overlaps between frames were wrongly detected as shot changes
Zoom transitions between frames were wrongly detected as shot changes
NOTE: Finally, we combined deepSBD and TransNet, so missed detections are now very rare, but false detections increased because of the following:
Intense flashes
Fast motion blur
Zoom transitions
Overlaps between frames
For a 2-3 hour video, overall accuracy is expected to be about 95% when the above problems do not appear, and about 70% when they do.
Is there any way to resolve these false detections?

Related

Improving an algorithm for detecting fish in a canal

I have many hours of video captured by an infrared camera placed by marine biologists in a canal. Their research goal is to count herring that swim past the camera. It is too time consuming to watch each video, so they'd like to employ some computer vision to help them filter out the frames that do not contain fish. They can tolerate some false positives and false negatives, and we do not have sufficient tagged training data yet, so we cannot use a more sophisticated machine learning approach at this point.
I am using a process that looks like this for each frame:
Load the frame from the video
Apply a Gaussian (or median) blur
Subtract the background using the BackgroundSubtractorMOG2 class
Apply a brightness threshold — the fish tend to reflect the sunlight, or an infrared light that is turned on at night — and dilate
Compute the total area of all of the contours in the image
If this area is greater than a certain percentage of the frame, the frame may contain fish. Extract the frame.
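A rough sketch of this per-frame pipeline, assuming OpenCV's Python API; the threshold values, area fraction, and the exact way the masks are combined are placeholders, not the tuned parameters from the question:

    import cv2

    # Placeholder parameters; in the question these are tuned by an evolutionary algorithm.
    BLUR_KSIZE = 5
    BRIGHTNESS_THRESHOLD = 200
    AREA_FRACTION = 0.01

    cap = cv2.VideoCapture("canal.mp4")                 # hypothetical input file
    subtractor = cv2.createBackgroundSubtractorMOG2()
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        blurred = cv2.GaussianBlur(frame, (BLUR_KSIZE, BLUR_KSIZE), 0)
        fg_mask = subtractor.apply(blurred)             # background subtraction
        gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)
        _, bright = cv2.threshold(gray, BRIGHTNESS_THRESHOLD, 255, cv2.THRESH_BINARY)
        mask = cv2.dilate(cv2.bitwise_and(fg_mask, bright), kernel)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        area = sum(cv2.contourArea(c) for c in contours)
        if area > AREA_FRACTION * frame.shape[0] * frame.shape[1]:
            pass                                        # frame may contain fish; extract/save it here
    cap.release()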
To find optimal parameters for these operations, such as the blur algorithm and its kernel size, the brightness threshold, etc., I've taken a manually tagged video and run many versions of the detector algorithm using an evolutionary algorithm to guide me to optimal parameters.
However, even the best parameter set I can find still creates many false negatives (about 2/3rds of the fish are not detected) and false positives (about 80% of the detected frames in fact contain no fish).
I'm looking for ways that I might be able to improve the algorithm. I don't know specifically what direction to look in, but here are two ideas:
Can I identify the fish by the ellipse of their contour and the angle (they tend to be horizontal, or at an upward or downward angle, but not vertical or head-on)?
Should I do something to normalize the lighting conditions so that the same brightness threshold works whether day or night?
(I'm a novice when it comes to OpenCV, so examples are very appreciated.)
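For the first idea, here is a minimal sketch of ellipse-based filtering using OpenCV's fitEllipse; the elongation and tilt limits are illustrative guesses, and the angle normalisation is one way to interpret fitEllipse's rotation convention:

    import cv2

    def looks_like_fish(contour, min_elongation=1.5, max_tilt_deg=45.0):
        # Hypothetical filter: keep elongated contours whose long axis is closer
        # to horizontal than vertical. Thresholds are guesses to be tuned.
        if len(contour) < 5:                      # fitEllipse needs at least 5 points
            return False
        (cx, cy), (ax1, ax2), angle = cv2.fitEllipse(contour)
        major, minor = max(ax1, ax2), min(ax1, ax2)
        if minor == 0 or major / minor < min_elongation:
            return False
        # fitEllipse reports the rotation of the first axis in degrees;
        # normalise so that 0 degrees means the *major* axis is horizontal.
        if ax1 < ax2:
            angle = (angle + 90.0) % 180.0
        tilt = min(angle, 180.0 - angle)          # deviation from horizontal
        return tilt <= max_tilt_deg

You could call this on each contour found after the threshold/dilate step and count only the contours that pass the filter toward the area test.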
I think you're headed in the right direction. Your camera is fixed, so it will be easy to extract the fish image.
But you're lacking a good tool to accelerate the process; believe me, coding everything by hand will cost you a lot of time.
Personally, in the past I chose a few data samples first, then used bgslibrary to check which background subtraction method works for my data, and only then coded the program by hand to run on the entire dataset. The GUI is very easy to use and the library is awesome.
GUI video
Hope this will help you.

OpenCV MOG2 Background Subtraction strange result

When using the MOG2 background subtractor in OpenCV, I am getting results that display parts of the background in foreground-segmented regions of the output. I am processing a live camera feed.
I am initialising with default parameters and then changing the number of Gaussians to 5. For the first 100 frames, when the scene is assumed to be static, the GMM is updated with a learning rate of -1.0. After 100 frames, when movement is expected, a learning rate of 0.0 is used.
An example of the mask given with no movement is as follows:-
It can be seen that the mask is rather noisy, though that may be down to luminance changes in the scene.
When placing my arm in front of the camera, it can be seen that the outline of my downstairs bannister is marked as background in the region that should be solidly my arm, as follows:-
Is this the expected behaviour for the MOG2 background subtractor? Is the update and then use for prediction model that I have employed sufficient to segment moving components in a static scene? Would one expect remnants of the occluded scene to appear in the segmentation mask?
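For reference, a minimal sketch of the update scheme described above, assuming OpenCV's Python API rather than whatever language the original code uses:

    import cv2

    mog2 = cv2.createBackgroundSubtractorMOG2()
    mog2.setNMixtures(5)                      # five Gaussians per pixel, as described

    cap = cv2.VideoCapture(0)                 # live camera feed
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # First 100 frames: scene assumed static; -1.0 lets MOG2 pick its own learning rate.
        # Afterwards: 0.0 freezes the model so moving objects stay in the foreground mask.
        learning_rate = -1.0 if frame_idx < 100 else 0.0
        fg_mask = mog2.apply(frame, learningRate=learning_rate)
        frame_idx += 1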

Video image analysis - Detect fast movement / Ignore slow movement

I am looking to capture video on an iPhone, initiating capture once fast motion is identified and stopping when only slow motion or no motion is detected.
Here is a use case to illustrate:
If someone is holding the iPhone camera and there is no background movement, but his hands are not steady and moving left/right/up/down slowly, this movement should be considered slow.
If someone runs into the camera field of view quickly, this would be considered fast movement for recording.
If someone slowly walks into the camera field of view, this would be considered slow and shouldn't be picked up.
I was considering OpenCV but thought it may be overkill to use its motion detection and optical flow algorithms. I am thinking of a lightweight method that accesses the image pixels directly, perhaps examining changes in luminosity/brightness levels.
I only need to process 30-40% of the video frame area for motion (e.g. the top half of the screen), and can perhaps sample every other pixel. The reason for a lightweight algorithm is that it will need to be very fast (< 4 ms per frame), as it will be processing incoming video buffer frames at a high frame rate.
Appreciate any thoughts into alternative image processing / fast motion detection routines by examining image pixels directly.
1. Dense optical flow, e.g. calcOpticalFlowFarneback
2. Motion history:
2.1 updateMotionHistory(silh, mhi, timestamp, MHI_DURATION);
2.2 calcMotionGradient(mhi, mask, orient, MAX_TIME_DELTA, MIN_TIME_DELTA...
2.3 segmentMotion(mhi, segmask, regions, timestamp, MAX_TIME_DELTA);
2.4 calcGlobalOrientation(orient_roi, mask_roi, mhi_roi, ...
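A minimal sketch of option 1 (dense optical flow), assuming OpenCV's Python API; the magnitude thresholds are illustrative guesses that would need tuning on-device, and the top-half crop follows the question:

    import cv2
    import numpy as np

    FAST_THRESHOLD = 8.0   # mean flow magnitude (pixels/frame) treated as "fast"; a guess
    SLOW_THRESHOLD = 2.0   # below this the motion is considered slow/steady hands; a guess

    cap = cv2.VideoCapture(0)
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    prev_gray = prev_gray[: prev_gray.shape[0] // 2, :]   # only the top half of the frame

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)[: frame.shape[0] // 2, :]
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=2).mean()          # average motion per pixel
        if mag > FAST_THRESHOLD:
            print("fast motion -> start recording")
        elif mag < SLOW_THRESHOLD:
            print("slow/no motion -> stop recording")
        prev_gray = gray

A simpler frame-differencing variant (absolute pixel difference on the subsampled region) would be cheaper than Farneback if the 4 ms budget proves too tight.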

Reduce false detection in Pedestrian Detection

I am using OpenCV sample code “peopledetect.cpp” to detect pedestrians.
The code uses HoG for feature extraction and SVM for classification. Please find the reference paper used here.
The camera is mounted on the wall at a height of 10 feet, angled 45° down. There is no restriction on pedestrian movement within the frame.
I am satisfied with the true positive rate (correctly detecting pedestrians), but the false positive rate is very high.
Some of the false detections I observed are on moving cars, trees, and walls, among others.
Can anyone suggest how to improve the existing code to reduce the false detection rate?
Any reference to blogs or code is very helpful.
You could apply a background subtraction algorithm on your video stream. I had some success on a similar project using BackgroundSubtractorMOG2.
Another trick I used is to eliminate all "moving" regions that are too small or have the wrong aspect ratio. I did this with a blob/contour analysis of the background subtraction output image. Be careful with the aspect ratio constraint to make sure you still support overlapping pedestrians.
Note that the model you're using (not sure which) is probably trained on front-facing pedestrians, not on a 45-degree downward view. This will obviously affect your accuracy.
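A minimal sketch of combining the HOG detections with a background-subtraction mask, assuming OpenCV's Python API; the foreground-overlap and aspect-ratio thresholds are illustrative, not values from the answer:

    import cv2
    import numpy as np

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
    subtractor = cv2.createBackgroundSubtractorMOG2()

    MIN_FG_RATIO = 0.2                   # fraction of the box that must be "moving" (a guess)
    MIN_ASPECT, MAX_ASPECT = 1.0, 4.0    # height/width range for a standing person (a guess)

    cap = cv2.VideoCapture("pedestrians.mp4")   # hypothetical input
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        fg_mask = subtractor.apply(frame)
        rects, _ = hog.detectMultiScale(frame, winStride=(8, 8))
        for (x, y, w, h) in rects:
            aspect = h / float(w)
            fg_ratio = np.count_nonzero(fg_mask[y:y + h, x:x + w]) / float(w * h)
            # Keep a detection only if it is moving and roughly person-shaped.
            if fg_ratio >= MIN_FG_RATIO and MIN_ASPECT <= aspect <= MAX_ASPECT:
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)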

Rapid motion and object detection in opencv

How can we detect rapid motion and objects simultaneously? Let me give an example.
Suppose there is a soccer match video, and I want to detect the position of each and every player with maximum accuracy. I was thinking about human detection, but in a soccer match video we might as well treat humans as generic objects. Maybe we can do this with blob detection, but there are several problems with blobs:
1) I want to separate each and every player, so if players collide, blob detection will not help to identify them separately.
2) The stadium lights will also be a problem.
So is there any particular algorithm, method, or library to do this?
I've seen some research papers but wasn't satisfied, so please suggest anything related: articles, algorithms, libraries, methods, research papers, etc. Please share your views on this.
For fast and reliable human detection, Dalal and Triggs' Histograms of Oriented Gradients (HOG) detector is generally accepted as very good. Have you tried playing with that?
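If you haven't, here is a minimal sketch of OpenCV's built-in HOG people detector as a starting point; the input filename is hypothetical and the parameters are not tuned for broadcast soccer footage:

    import cv2

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    frame = cv2.imread("soccer_frame.jpg")       # hypothetical input frame
    # A smaller winStride is slower but finds small or fast-moving players more reliably.
    rects, weights = hog.detectMultiScale(frame, winStride=(4, 4), padding=(8, 8), scale=1.05)
    for (x, y, w, h) in rects:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite("soccer_frame_detections.jpg", frame)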
Since you mentioned rapid motion changes, are you worried about fast camera motion or fast player/ball motion?
You can do 2D or 3D video stabilization to fix camera motion (try the excellent Deshaker plugin for VirtualDub).
For fast player motion, background subtraction or other blob detection will definitely help. You can use that to get a rough kinematic estimate and use that as an estimate of your blur kernel. This can then be used to deblur the image chip containing the player.
You can do additional processing to establish identity based upon OCRing jersey numbers, etc.
You mentioned concern about the stadium lights. Is the main issue that they will cast shadows? That can be dealt with by the HOG detector. Blob detection to get the blur kernel should still work fine with shadows.
If you have control over the camera, you may want to reduce exposure times to reduce blur. Denoising techniques can reduce the CCD noise that occurs in extremely low light, and dense optical flow approaches can align the frames and boost the signal back up to something reasonable by adding the aligned, denoised frames together.
