I want to compare two sounds and measure how similar they are.
Example: the pronunciation of "A" is already stored in the app, the user says/records "A" in his or her own voice, and then we compare both and report what percentage they match.
I searched GitHub and Stack Overflow for an answer but didn't find any authentic, complete solution for this.
Can anybody share a library or code snippet that would help?
Thanks in Advance
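A common approach is to extract spectral features from each recording and compare them (libraries like librosa offer MFCCs for this). Below is a minimal, dependency-light sketch of the idea using only NumPy: each signal is reduced to an average magnitude spectrum ("fingerprint"), and the two fingerprints are compared with cosine similarity. The frame/hop sizes and the similarity scale are illustrative assumptions, not a production voice-matching algorithm.

```python
import numpy as np

def similarity_percent(a, b, frame=1024, hop=512):
    """Rough spectral similarity between two mono signals, in [0, 100].

    Frames each signal, takes windowed magnitude spectra, averages them
    into a spectral "fingerprint", and compares the two fingerprints
    with cosine similarity.
    """
    def fingerprint(x):
        x = np.asarray(x, dtype=float)
        frames = [x[i:i + frame] for i in range(0, len(x) - frame + 1, hop)]
        if not frames:  # signal shorter than one frame: pad it
            frames = [np.pad(x, (0, frame - len(x)))]
        win = np.hanning(frame)
        mags = [np.abs(np.fft.rfft(f * win)) for f in frames]
        return np.mean(mags, axis=0)

    fa, fb = fingerprint(a), fingerprint(b)
    cos = np.dot(fa, fb) / (np.linalg.norm(fa) * np.linalg.norm(fb) + 1e-12)
    return 100.0 * max(0.0, cos)

# Example: a tone matches itself far better than a tone an octave away.
t = np.linspace(0, 1, 16000, endpoint=False)
tone_a = np.sin(2 * np.pi * 440 * t)
tone_b = np.sin(2 * np.pi * 880 * t)
print(similarity_percent(tone_a, tone_a))  # close to 100
print(similarity_percent(tone_a, tone_b))  # much lower
```

For real voice comparison you would want MFCC features plus dynamic time warping to tolerate differences in timing and pitch, but the fingerprint-and-compare structure stays the same.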
I understand that AKSampler was recently rewritten, and this GitHub project seems to be the de facto guide to the new AKSampler. What I gather is a move toward the SFZ format. I am new to the sampling world, but in my application I only need a handful of samples recorded from my piano for it to work. Looking at existing SFZ files and samples, I don't need all of the complexity and features that SFZ provides.
I am currently using AKSampler with a single piano sample, which works perfectly; however, it gets a bit weird once I play anything too far from the original sample, so I just want to fill in the gaps with a few other samples (my current app only needs about an octave and a half of range).
I do see a couple of methods in the docs, buildSimpleKeyMap() and buildKeyMap(), but they currently have no implementation.
Do I have any additional options? I know the EXS format has been deprecated, as has SoundFont. Is SFZ currently the only way to map multiple samples to AKSampler?
Thanks for all your help <3
Edit: This readme on the AKSampler GitHub page provides the breakdown for samples, and I still only see SFZ being considered. If anyone else is lost with my question or needs a reference, this seems to be the best resource. If SFZ is the only way the current AKSampler maps multiple samples, so be it, but it does look very challenging. I'm really hoping there is some simple middle ground between using a single sample with AKSampler and a full-bore SFZ file.
Edit 2: Getting a solution to this, will update as soon as possible, thanks for your patience!
I have provided a simple explainer and sample file in the AudioKit docs. Hope this helps new users of AudioKit!
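For anyone looking for that middle ground: SFZ is just a plain-text format, and a minimal file mapping a handful of samples only needs a few `<region>` headers with `sample`, `lokey`, `hikey`, and `pitch_keycenter` opcodes. The sketch below generates such a file; the sample filenames and key splits are hypothetical placeholders, not files from the AudioKit docs.

```python
# Minimal SFZ generator: maps a few samples across MIDI key ranges.
# Filenames and key splits below are illustrative placeholders.
def build_sfz(regions):
    """regions: list of (sample_path, lokey, hikey, pitch_keycenter)."""
    lines = []
    for sample, lo, hi, center in regions:
        lines.append("<region>")
        lines.append(f"sample={sample}")
        lines.append(f"lokey={lo} hikey={hi} pitch_keycenter={center}")
        lines.append("")
    return "\n".join(lines)

sfz = build_sfz([
    ("piano_C4.wav", 57, 63, 60),   # C4 sample covers A3..D#4
    ("piano_G4.wav", 64, 70, 67),   # G4 sample covers E4..A#4
    ("piano_D5.wav", 71, 77, 74),   # D5 sample covers B4..F5
])
print(sfz)
```

Each region tells the sampler which sample to play for a span of keys and which key the sample was recorded at, so an octave and a half can be covered with three or four regions and no other SFZ features.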
I've been researching this problem in various forums. I believe I'm getting close to a fix, so I decided to ask here for help, and also to help anyone else who needs this topic.
The problem involves the language in SKRouteAdvices. When retrieved through
SKRoutingService.sharedInstance().routeAdviceListWithDistanceFormat(.Metric)
an array of SKRouteAdvices is returned, but all of the advices are written in English: the voice is in Portuguese, but .adviceInstruction is in English. I tried setting the advisorSettings (as I should anyway), and it didn't work; but, for some unknown reason, when I set the advisor to TTS instead of pre-recorded audio, the advices were written in Portuguese, with a weird TTS voice instead of the pre-recorded one (as expected, actually). Tired of trying to find an obvious fix, I decided to do it in two passes: first retrieve the Portuguese advices and save them in an array, then run it again as before to get the pre-recorded voice.
It turns out the framework has some hidden problem here. I tried a couple of different approaches, and the best I got was the result I wanted but with roughly a 50% chance of crashing; I really don't know why, but sometimes it just crashed. I then tried TTS again while getting the pre-recorded voices via the adviceInstruction property: the text comes in Portuguese, but all the audio files are named in English, so that doesn't work either.
To summarize: I need the SKRouteAdvices to come with Portuguese instructions and a pre-recorded voice. Any clue?
I gave up trying to find a native way to do it. I followed Sylvia's suggestion, but I had already tried that before; I managed to get the result I wanted by starting navigation twice. On the first attempt I set the advisorType (in SKAdvisorConfiguration on SKRoutingService.sharedInstance()) to .TextToSpeech, grab the Portuguese instructions, and save them in an array. Then, in the second step, I repeat the route configuration and navigation with advisorType set to .AudioFiles.
With this strange combination I got what I wanted.
The text instructions are generated based on the config files (for full details see http://sdkblog.skobbler.com/advisor-support-text-to-speech-scout-audio/ and http://sdkblog.skobbler.com/advisor-support-text-to-speech-faq/)
The bottom line is that, due to how the audio files (.mp3) are linked together, the text advices generated when using the "audio" option will not be "human readable".
For TTS support the advices are meant to be read by a voice, hence they are "human readable".
Right now you cannot have both "mp3" advices and human understandable text instructions at the same time.
I am trying to capture the on-screen activity of my app as a video (one that I can save/upload to Youtube).
Many others want to do this, but the answers are generally sparse, and there's no in-depth explanation of how to do it or why it can't be done.
There's a paid (and possibly sketchy?) option here.
There's this related, but again, not totally clear SO answer about taking lots of screenshots: link.
There's a Smule app called MadPad HD that "records" the user's actions and stitches them together (but it doesn't actually capture the screen, it just stitches actions together). Here's the output of a stitching: link.
My questions are as follows:
1. Is capturing the video output of the screen and turning it into a video actually possible?
2. If not, is taking lots of screenshots and turning them into a video feasible (performance-wise)?
3. If 1 and 2 are not possible, is that because of device constraints or because Apple doesn't want it?
Thanks!
Perhaps you've already found an answer for this, but I thought I'd answer in case anyone else is interested: with iOS 9 this will, of course, be possible through Apple's new ReplayKit. But if you need it sooner (and with backwards compatibility) there are a couple of alternatives that I know of: Kamcord and Everyplay. Both let your users record video and share it through multiple channels, YouTube included. Both should be SpriteKit-compatible and easy to integrate (at least according to their websites!). Hope this helps!
I was assigned a project (in school) for automated multiple choice test scoring and I do not know where to start.
I think this is a fairly popular kind of program, and you may already know of it: it takes a scanned image of the answer sheet as input and returns the results.
Everything I know about computer vision is a few photo-editing examples with OpenCV. I hope you can give me a few keywords related to the problem, or maybe a couple of blog articles, documents, and related libraries.
Are there any free, open-source programs I can refer to?
Thanks!
Edit: added two examples of the answer sheet (sorry that I could not find a sheet in English):
I think there are basically two steps to the problem:
1. Bring the form into a normalized position.
2. Now that you know where the boxes are, look at each one by thresholding the gray values in that region.
Which methods to use for step 1 depends on your actual images and how much they vary. Do you have some example images you can upload?
Also I think it is a good idea, especially if you are a beginner, to start with some simple examples and work your way up from there by adding more and more variation.
I am a bit confused about this topic. I have done a bit of object detection on the video.
Should I summarize based on the objects detected in the video, or should I extract key frames that give a good idea of the content?
I did search for this on the internet and I found this, but I still want to know how I should proceed.
Thanks!
There is no single answer to that question; this is an open research topic! Prof. Bernard Merialdo from France has been studying it for several years. You can have a look at his research group's page and publications.
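That said, a common baseline for the key-frame route is shot-change detection: keep a frame whenever it differs enough from the last kept frame. Here is a minimal sketch of that idea on grayscale frames; the difference threshold is an assumption you would tune per video, and real systems use histogram or feature differences rather than raw pixels.

```python
import numpy as np

def key_frames(frames, threshold=20.0):
    """frames: list of 2-D grayscale arrays, all the same shape.
    Returns the indices of selected key frames."""
    keep = [0]                              # always keep the first frame
    last = np.asarray(frames[0], dtype=float)
    for i, f in enumerate(frames[1:], start=1):
        f = np.asarray(f, dtype=float)
        # Mean absolute pixel difference against the last kept frame.
        if np.abs(f - last).mean() > threshold:
            keep.append(i)
            last = f
    return keep

# Synthetic "video": two static shots with a hard cut in the middle.
shot_a = [np.full((8, 8), 50) for _ in range(5)]
shot_b = [np.full((8, 8), 200) for _ in range(5)]
print(key_frames(shot_a + shot_b))          # [0, 5]
```

An object-based summary and key-frame extraction are complementary: you could first pick key frames this way, then describe each one using your object detections.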