How to detect only/specifically human voice? - ios

I am developing an application that plots a real-time pitch/frequency graph of the sound produced by the speaker.
Example: the user says "hmmmmmmmmmmmmmm..." and a graph is plotted simultaneously, showing the frequency the user reaches every 1/10th of a second.
I have developed everything from top to bottom, but the one remaining problem is that background noise is also captured while the user speaks. Even if the user talks with the phone close to his lips, noise is still captured and plotted.
I want to remove that noise.
I have tried the Shout toolkit and Sphinx, but neither is effective enough, and both slow down the plotting of the graph.
I am building this app with PhoneGap.
Are there any better noise-cancellation APIs available (preferably open source)?
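A common first step before pitch detection, regardless of the API chosen, is to band-pass the microphone signal to the rough range of the human voice, which suppresses low-frequency rumble and high-frequency hiss. Below is a minimal sketch of that idea in Python with SciPy, purely as an illustration rather than a full noise-cancellation solution; the cutoff frequencies, sample rate, and chunk size are assumptions to tune, and in a PhoneGap app the same filter would have to be implemented in JavaScript or in a native plugin.

import numpy as np
from scipy.signal import butter, sosfilt

def voice_bandpass(samples, sample_rate, low_hz=80.0, high_hz=1000.0):
    """Keep only the band where the voiced fundamental and low harmonics live."""
    # 4th-order Butterworth band-pass, expressed as second-order sections.
    sos = butter(4, [low_hz, high_hz], btype="bandpass", fs=sample_rate, output="sos")
    return sosfilt(sos, samples)

# Example: filter one 1/10 s chunk before running pitch detection on it.
sample_rate = 44100
chunk = np.random.randn(sample_rate // 10)   # placeholder for real microphone data
filtered = voice_bandpass(chunk, sample_rate)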

Related

Methods to track marked points in a stationary video?

Not sure where to ask this. Please redirect me if SO is not the place.
I want to make a web app that accurately tracks pose in a stationary video of someone pedaling a stationary bike. The joints can be marked with stickers to make the process easier and more accurate. Basically, I want to do what this app does.
First I tried markerless tracking using pose-estimation models such as MediaPipe's BlazePose and Google's MoveNet. However, these are not accurate enough, and I would also like to track some additional landmarks (ball of the foot, ...).
Then I tried OpenCV.js's Lucas-Kanade optical flow method, but the algorithm lost the tracked point quickly, even when I placed colored tape on the part of the body that I wanted to track.
I also tried template matching a single marked point in OpenCV, but it was not very robust, and it would probably not work well with more markers.
What other methods can I try? Since the app I send the video to requires stickers to be placed, I thought it was using something like Lucas-Kanade. But as I said, when I tried it, it wasn't able to track the marked point. Because the app is only on iOS, I thought it may be using this API. However, this is only my speculation.
Edit: added example video: https://www.youtube.com/watch?v=eCNyyABfWSE
I tried shooting in slow motion to have more fps, but the quality suffered because of this. Also, I didn't have blue or green tape, so I had to use yellow, which is not very visible on the sweater or on my wrist. But the markers on the pants should be trackable, right?
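Since the markers are pieces of colored tape, one alternative worth trying (not confirmed to be what the referenced app uses) is to segment the marker color independently in every frame instead of tracking it, so there is no track to lose: threshold in HSV, clean up the mask, and take contour centroids. A minimal sketch with OpenCV in Python follows; the HSV range for yellow, the video file name, and the minimum contour area are assumptions to tune, and the same calls exist in OpenCV.js for a web app.

import cv2
import numpy as np

LOWER = np.array([20, 100, 100])   # assumed lower HSV bound for yellow tape
UPPER = np.array([35, 255, 255])   # assumed upper HSV bound for yellow tape

cap = cv2.VideoCapture("pedaling.mp4")  # hypothetical input file
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER, UPPER)
    # Remove small specks of noise from the mask.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) < 30:   # ignore tiny blobs
            continue
        M = cv2.moments(c)
        cx, cy = int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"])
        # (cx, cy) is the marker position in this frame; log it per marker.
        cv2.circle(frame, (cx, cy), 5, (0, 0, 255), -1)
cap.release()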

How to detect scrolling speed of a video/How to detect differences in images

I have some screen-recording videos from which I want to extract some information. My plan is to use cv2.VideoCapture() to grab screenshots and then use OCR to extract the information. But there is a limit to how many times I can call the OCR service (a commercial service), so I want to use only the critical screenshots that don't have much information overlap. For example, I got 300 screenshots from cv2, but I can already get all the information I need from 20 of them, since the scrolling speed is slow and most of the screenshots overlap.
See a real example: I want to get all the app names in a screen recording video of AppStore.
The question is:
How can I find out the scrolling speed of the video so that I can adjust how often I capture a screenshot? Or, to put it another way: how can I find out how much consecutive screenshots change, which effectively gives the scrolling speed?
You can use optical flow to detect scrolling. Since the motion is essentially one-dimensional (along Y), you can estimate the average scrolling speed by averaging the vertical components (or the norms) of the flow vectors.
You can find a Python example here that should be easy to adapt to your case:
https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_video/py_lucas_kanade/py_lucas_kanade.html
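The linked tutorial covers sparse Lucas-Kanade flow; for scrolling, dense flow averaged over the whole frame is arguably simpler. Here is a minimal sketch of that idea, assuming cv2 and NumPy; the input file name and the pixels-per-capture threshold are made-up values you would tune.

import cv2
import numpy as np

cap = cv2.VideoCapture("screen_recording.mp4")  # hypothetical input file
ret, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

accumulated = 0.0          # vertical scroll accumulated since the last capture
CAPTURE_EVERY_PX = 600.0   # assumed threshold: grab a frame every ~600 px of scrolling

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Dense optical flow between consecutive frames (Farneback).
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # Scrolling is vertical, so the mean of the Y component estimates
    # the scroll speed in pixels per frame.
    dy = float(np.mean(flow[..., 1]))
    accumulated += abs(dy)

    if accumulated >= CAPTURE_EVERY_PX:
        # Enough new content has scrolled in; keep this frame for OCR.
        cv2.imwrite("capture_{}.png".format(int(cap.get(cv2.CAP_PROP_POS_FRAMES))), frame)
        accumulated = 0.0

    prev_gray = gray

cap.release()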

How to distinguish between different license plates using OpenCV

I am currently working on a license plate detection system and need some guidance on how to proceed.
I can capture (via video playback) and, with the help of an open-source library called OpenALPR, display the license plates directly in the terminal. The issue is that it works on a frame-by-frame basis, so it captures the same license plate multiple times. I added a frame-skip variable, and now it skips however many frames I want, but the issue is still there.
Furthermore, I'd like to distinguish between different license plates if possible, but I don't know how to go about that; I've attempted basic object detection but failed miserably.
Below is an image of the program running. As seen, it detects a single license plate and displays multiple instances of it. I expect it to move on to the next car and display Plate #1, but unfortunately it does not and keeps feeding into Plate #0.
Program Running
The function that actually displays the license plate text is below; really, the first line does all the work. OpenALPR is pretty powerful.
results = alpr.recognize_ndarray(frame)  # OpenALPR does the actual recognition here
for i, plate in enumerate(results['results']):
    # Candidates are sorted by confidence, so the first one is the best guess.
    best_candidate = plate['candidates'][0]
    print('Plate #{}: {} ({}%)'.format(i,
                                       best_candidate['plate'].upper(),
                                       best_candidate['confidence']))
I'd like some guidance on how I can solve this problem, which is basically distinguishing between different license plates.
This is a general problem without a general solution, because it depends heavily on context. Some thoughts:
If it is a video feed, you can track the plate's movement; the track will "jump" when another plate is detected. Say the maximum optical-flow velocity is 100 px/frame: if it jumps more than this threshold, you can assume it is a new plate.
Depending on your video quality and detector, there may be spurious jumps, so I would add a Kalman filter or some other simple filter.
Perhaps there is a minimum time lapse between one plate leaving the image and the next arriving. You can use a time threshold to trigger a "changed plate" event.
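To make the last two ideas concrete, here is a minimal sketch of a de-duplication wrapper around the loop in the question, reusing the result structure returned by alpr.recognize_ndarray(). The text-similarity check, the MIN_GAP_FRAMES value, and the helper names are assumptions for illustration, not OpenALPR features.

def plates_differ(a, b, max_matching_chars=3):
    # Very rough similarity check: count positions where characters match.
    matching = sum(1 for x, y in zip(a, b) if x == y)
    return matching <= max_matching_chars

MIN_GAP_FRAMES = 15   # assumed: ~0.5 s at 30 fps with no detection => new car
plate_index = 0
last_plate_text = None
frames_since_last_detection = 0

def handle_frame(results):
    """Call once per frame with the output of alpr.recognize_ndarray(frame)."""
    global plate_index, last_plate_text, frames_since_last_detection
    if not results['results']:
        frames_since_last_detection += 1
        return
    best = results['results'][0]['candidates'][0]
    text = best['plate'].upper()
    # Treat it as a new plate if the text changed a lot or there was a long gap.
    if (last_plate_text is None
            or plates_differ(text, last_plate_text)
            or frames_since_last_detection > MIN_GAP_FRAMES):
        plate_index += 1
        print('Plate #{}: {} ({}%)'.format(plate_index, text, best['confidence']))
    last_plate_text = text
    frames_since_last_detection = 0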

Interactive video in iOS : Is it possible to trigger specific actions in code by tapping discrete parts in the video?

I am asking this because I couldn't find the answer anywhere, at least with the keywords I could think of.
The most relevant question/answer I've found is (Create interactive videos in iPad - An app for product demo). The user Jano replied:
The easiest way to create interactive videos for iOS is to use Apple's HTTP Live Streaming technology. You have to create a video, embed metadata, play it using MPMoviePlayerController or AVPlayerItem, and then display clickable areas in response to metadata notifications.
Metadata should contain coordinates for the element you are tracking, e.g. a dress, and an identifier for the product. You overlay this info with a clickable subview that reveals more information about the product. There are several applications of this kind in iTunes; here is one.
Once you get a working product and weeks' worth of video, the most difficult part is to perform motion tracking with the least possible human interaction. One approach is to use Adobe After Effects; another is to code your own solution based on OpenCV.
The example I've found concerning this technology (http://vimeo.com/16455248) showed NSButtons being added automatically when the video reaches the embedded meta-tags. My client wants an interactive human-body video that pauses at a specific time (maybe using the meta-tags) and reacts to the user tapping an element in the video (e.g., imagine a pill inside the stomach; tapping this pill triggers another pre-rendered video, in a way not transparent to the user). I have thought about animations using Cocos2D or OpenGL ES, but I lack people who master those technologies.
I didn't quite understand the "motion tracking" reference in the quote above. Jano mentions Adobe After Effects and OpenCV. Is this motion tracking something like a UIGestureRecognizer? Does it track parts of the video itself, or motions initiated by the user, such as taps?
I hope I've stated the question as clearly as possible. Thank you in advance.
This question is a year old, but I can give you insight into the After Effects question. AE has a feature where you can define an area in a video frame and the software will track that area across the timeline, logging the coordinates at specific intervals. For example, in a video of a person riding a mountain bike, you could select an area around their helmet and AE will log coordinates of the helmet throughout the timeline.
Since Flash was the most likely target for interactive video, the typical workflow would encode this coordinate data into a Flash video as cue point events (this is the only method I have personally experienced). According to some googling, the data is stored in key frames and can be extracted using scripts.
More info: http://helpx.adobe.com/after-effects/using/tracking-stabilizing-motion-cs5.html
Here's a manual method for extracting the data:
In the timeline panel, select the footage and press the U key; all track point keyframes will show up. Here's the magic: select the Feature Center property of each track point and copy it (Cmd+C on Mac or Ctrl+C on PC).
Now open any text editor, such as TextMate or Notepad, and paste the data (Cmd+V on Mac or Ctrl+V on PC).
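If you want to reuse the pasted tracking data programmatically, a small parser is enough. The sketch below makes an assumption about the pasted layout: it skips any non-numeric header lines and treats each remaining line as whitespace-separated frame, x, y columns (the exact format depends on the After Effects version, so check the pasted text first).

def parse_track_points(pasted_text):
    """Parse pasted Feature Center keyframe data into (frame, x, y) tuples.

    Assumes each keyframe line is whitespace-separated numbers in the order
    frame, x, y; header lines and anything non-numeric are skipped.
    """
    points = []
    for line in pasted_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            frame, x, y = float(parts[0]), float(parts[1]), float(parts[2])
        except ValueError:
            continue  # header or other non-numeric line
        points.append((int(frame), x, y))
    return points

# Example with a made-up snippet of pasted data:
sample = "Frame\tX pixels\tY pixels\n0\t412.5\t233.0\n1\t414.2\t235.6\n"
print(parse_track_points(sample))   # [(0, 412.5, 233.0), (1, 414.2, 235.6)]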

ios detect heart rate [duplicate]

Possible Duplicate:
Detecting heart rate using the camera
I need the same functionality as the application Instant Heart Rate.
The basic process requires the user to:
Place the tip of the index finger gently on the camera lens.
Apply even pressure and cover the entire lens.
Hold it steady for 10 seconds and get the heart rate.
This can be accomplished by turning the flash on and watching the light change as the blood moves through the index finger.
How can I start?
I'd start by using AVFoundation to turn the light on. The answers in the linked post below include examples of how to do this:
How to turn the iPhone camera flash on/off?
Then, as far as detecting the light change goes, you can probably use Brad Larson's GPUImage framework. This framework includes a few helpful filters that you may be able to use to achieve this, including:
GPUImageAverageLuminanceThresholdFilter
GPUImageAverageColor
GPUImageLuminosity
Using the filters listed above you should be able to measure color variations in your finger, and monitor the time intervals between the occurrences of these changes. Using this framework you may even be able to specify an arbitrary variance requirement for the color/luminosity change.
From there all you have to do is convert the time interval in between the color changes to a pulse. Here's an example of how to calculate pulse.
http://www.wikihow.com/Calculate-Your-Target-Heart-Rate
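As a rough illustration of that last step (not tied to GPUImage or the wikiHow page), here is a minimal sketch of turning a series of per-frame average brightness values into beats per minute. The simple neighbour-comparison peak detector and the synthetic test signal are assumptions; a real camera signal would need smoothing before peak detection.

import numpy as np

def estimate_bpm(brightness, sample_rate_hz):
    """brightness: one average-color/luminosity value per camera frame."""
    signal = brightness - np.mean(brightness)          # remove the DC offset
    # A peak is a sample above zero that is larger than both neighbours.
    peaks = [i for i in range(1, len(signal) - 1)
             if signal[i] > 0 and signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]]
    if len(peaks) < 2:
        return None
    intervals = np.diff(peaks) / sample_rate_hz        # seconds between beats
    return 60.0 / float(np.mean(intervals))

# Example: 10 s of samples at 30 frames/s with a ~72 bpm oscillation.
t = np.arange(0, 10, 1.0 / 30)
fake = 0.5 * np.sin(2 * np.pi * 1.2 * t)               # 1.2 Hz = 72 bpm
print(estimate_bpm(fake, 30))                          # roughly 72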
