I'm totally new to the audio frameworks. I would like to add a feature to my app.
While playing a clip/song, I want to record that song at the same time. I think there are two cases here. Ideally I would record exactly what is playing (an identical copy of the audio).
Otherwise, if that's impossible, can I record everything through the microphone (including noise from the outside world) at the same time?
In the end I went with the second approach, which records everything through the microphone, including noise from the outside world. It works, but I'm not sure it will be approved by Apple.
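In case it helps anyone else, here is roughly what that mic-recording approach looks like with AVFoundation. This is only a sketch under my own assumptions: the function name, file path and recorder settings are placeholders, and it presumes the record-permission prompt has been handled elsewhere.

```swift
import AVFoundation

// Sketch of the "record through the microphone while audio keeps playing" approach.
// The .playAndRecord category is what allows playback and mic capture to run together.
// Assumes the record-permission prompt has already been handled elsewhere.
func startMicRecording() throws -> AVAudioRecorder {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .default,
                            options: [.defaultToSpeaker, .mixWithOthers])
    try session.setActive(true)

    // Placeholder output path and settings; pick whatever format/rate you need.
    let url = FileManager.default.temporaryDirectory.appendingPathComponent("take.m4a")
    let settings: [String: Any] = [
        AVFormatIDKey: Int(kAudioFormatMPEG4AAC),
        AVSampleRateKey: 44_100,
        AVNumberOfChannelsKey: 1
    ]

    let recorder = try AVAudioRecorder(url: url, settings: settings)
    recorder.record()   // captures everything the mic hears, including room noise
    return recorder
}
```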
Suppose I have a video on YouTube that gets the URL https://www.youtube.com/watch?v=vWSyMuKkXXX (not a real video/ID, fwiw). If I delete that video, what are the chances that "vWSyMuKkXXX" will get reassigned to another video that somebody else puts up? 62^11 (is that right?) is a pretty large space from which to be assigning symbols, but YouTube must be doing some uniqueness test to avoid duplicates. The question, I guess, would then be whether they're including deleted IDs in that test (at least, given the way I'm guessing what they're doing internally).
This question is all about how much work I have to do to figure out whether the video corresponding to an ID exists and that it is the video that I think it is -- whether I can get away with using a simple call to http://www.youtube.com/oembed?..., or whether I need to get authentication and the APIs involved (which might still not resolve the question). Any thoughts? Thanks!
If YouTube were to ever reuse IDs, it would cause problems such as old links now pointing to new (possibly unlisted) videos. There is no advantage in reusing IDs, only problems, including privacy problems. It would be an ugly bug.
To support unlisted videos, the IDs cannot be sequential. They must come from a large space of possible values.
how much work I have to do to figure out whether the video corresponding to an ID exists
You must send a query to YouTube.
and that it is the video that I think it is
How do you define "the video that I think it is"? By ID? By watching the transcoded video at your current resolution -- where the individual pixels might not match the uploaded pixels?
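As for the cheap existence check: a rough sketch of querying the oEmbed endpoint is below. It assumes (plausibly, though I have not verified every edge case) that a non-200 status means the video is gone or not publicly available; the helper name is mine, and the ID is the fake one from your question.

```swift
import Foundation

// Sketch: ask YouTube's oEmbed endpoint whether a video ID still resolves.
// Assumption: a 200 response means the video exists and is publicly embeddable;
// anything else (404 etc.) means it is gone, private, or was never there.
func checkVideoExists(id: String, completion: @escaping (Bool) -> Void) {
    var components = URLComponents(string: "https://www.youtube.com/oembed")!
    components.queryItems = [
        URLQueryItem(name: "url", value: "https://www.youtube.com/watch?v=\(id)"),
        URLQueryItem(name: "format", value: "json")
    ]
    URLSession.shared.dataTask(with: components.url!) { _, response, _ in
        let status = (response as? HTTPURLResponse)?.statusCode ?? 0
        completion(status == 200)
    }.resume()
}

// Usage with the (made-up) ID from the question:
checkVideoExists(id: "vWSyMuKkXXX") { exists in
    print(exists ? "still up" : "missing or not public")
}
```

The oEmbed JSON also carries a title field, so if "the video I think it is" can be reduced to "the title I stored earlier", you could compare against that without touching the authenticated APIs.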
I'm trying to create an 'auto DJ' application that would let smartphone users select a playlist of songs, and it would create a seamless mix for playback. There are a couple of factors involved in this: read a playlist of audio files, calculate their waveforms/spectra, determine the BPMs, and organize the compatible songs into a new playlist in the order they will be played (based on compatible tempos and keys).
The app would have to be able to scan the waveform of a song and recognize the beginning of the 'main' part of the song (skipping slow intros/outros). I also imagine having some effects: filtering, so it can filter the bass out of the new track being mixed in and swap the bass lines at an appropriate time. Perhaps reverb that the user could control as well.
I am just trying to gauge how feasible a project this is for 3-4 busy college students in the span of ~4 months. Not sure if it would be an Android or iOS app, or perhaps even a Windows app. Not sure what language we would use (likely Python or Java); whichever has the most useful audio-analysis libraries. Obviously it would work better for certain genres of music (house, trance), but I'd still really like to try to create this.
Thanks for any feedback
As much as I would like to hear a more experienced person's opinion on this, I would say that in your situation it would be a very big undertaking. Since it sounds like you don't have experience with audio-analysis libraries/programs, you might want to start experimenting with those; most of them are likely to be in C/C++, not Java/Python. Here are some I know of, but I would recommend doing your own research:
http://www.underbit.com/products/mad/
http://audacity.sourceforge.net/
It doesn't sound that feasible in your situation, but that really depends on your programming/project experience and your motivation to create it.
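Just to give a feel for one of the pieces (tempo detection), here is a very naive sketch of the kind of analysis involved: an energy envelope plus autocorrelation. Real libraries do far more (onset detection, filtering, tempo tracking), and the window size, BPM range and function name below are arbitrary choices of mine.

```swift
// Very naive tempo (BPM) estimate from mono PCM samples, just to show the flavour
// of the analysis; real libraries use onset detection, filtering and tempo tracking.
func estimateBPM(samples: [Float], sampleRate: Double) -> Double? {
    // 1. Energy envelope over ~10 ms windows.
    let window = Int(sampleRate * 0.01)
    var envelope: [Float] = []
    var i = 0
    while i + window <= samples.count {
        var energy: Float = 0
        for j in i..<(i + window) { energy += samples[j] * samples[j] }
        envelope.append(energy)
        i += window
    }

    // 2. Autocorrelate the envelope at lags corresponding to 60-180 BPM.
    let envRate = sampleRate / Double(window)      // envelope frames per second
    let minLag = Int(envRate * 60.0 / 180.0)       // fastest tempo considered
    let maxLag = Int(envRate * 60.0 / 60.0)        // slowest tempo considered
    guard minLag > 0, envelope.count > maxLag else { return nil }

    var bestLag = minLag
    var bestScore = -Float.greatestFiniteMagnitude
    for lag in minLag...maxLag {
        var score: Float = 0
        for k in 0..<(envelope.count - lag) { score += envelope[k] * envelope[k + lag] }
        score /= Float(envelope.count - lag)       // normalise so longer lags aren't penalised
        if score > bestScore { bestScore = score; bestLag = lag }
    }

    // 3. Convert the winning lag back to beats per minute.
    return 60.0 * envRate / Double(bestLag)
}
```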
Good luck
I am trying to build an app that allows the user to record individual people speaking, and then save the recordings on the device, tagging each recording with the name of the person who spoke. Then there is a detection mode, in which I record someone and the app tells me their name if they are in the local database.
First of all - is this possible at all? I am very new to iOS development and not so familiar with the available APIs.
More importantly, which API should I use (ideally free) to correlate the incoming voice with the recordings I have in the local DB? This should behave something like Shazam, but much simpler, since the database I am looking for a match against is much smaller.
If you're new to iOS development, I'd start with the core app to record the audio and let people manually choose a profile/name to attach it to and worry about the speaker recognition part later.
You obviously have two options for the recognition side of things: You can either tie in someone else's speech authentication/speaker recognition library (which will probably be in C or C++), or you can try to write your own.
How many people are going to use your app? You might be able to create something basic yourself: if it's the difference between a man and a woman, you could probably figure that out by doing an FFT spectral analysis of the audio and seeing where the frequency peaks are. Obviously the frequencies used to enunciate different phonemes are going to vary somewhat, so solving the general case for two people who sound fairly similar is probably hard. You'll need to train the system with a bunch of text and build some kind of model of frequency distributions. You could try to do clustering or something, but you're going to run into a fair bit of maths fairly quickly (Gaussian mixture models, et al.). There are libraries/projects that'll do this. You might be able to port this from MATLAB, for example: https://github.com/codyaray/speaker-recognition
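As a very rough illustration of the "where are the frequency peaks" idea (nowhere near real speaker recognition), here is a brute-force sketch that estimates the dominant low frequency in a chunk of mono samples. The 60-400 Hz range, the 5 Hz step and the function name are arbitrary choices of mine.

```swift
import Foundation

// Crude "where is the pitch" check: scan candidate frequencies between roughly
// 60 Hz and 400 Hz (typical speech fundamentals) and return the one whose
// sinusoid correlates most strongly with the signal. Brute force, not an FFT,
// but it shows the idea of looking for the dominant low-frequency peak.
func dominantFrequency(samples: [Float], sampleRate: Double) -> Double {
    var bestFreq = 0.0
    var bestPower = 0.0
    var freq = 60.0
    while freq <= 400.0 {
        var real = 0.0
        var imag = 0.0
        for (n, s) in samples.enumerated() {
            let phase = 2.0 * Double.pi * freq * Double(n) / sampleRate
            real += Double(s) * cos(phase)
            imag += Double(s) * sin(phase)
        }
        let power = real * real + imag * imag
        if power > bestPower { bestPower = power; bestFreq = freq }
        freq += 5.0                     // 5 Hz steps: coarse but cheap
    }
    return bestFreq
}

// Very rough rule of thumb only: male fundamentals tend to sit lower
// (~85-180 Hz) than female fundamentals (~165-255 Hz), with plenty of overlap.
```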
If you want to take something off-the-shelf, I'd go with a straight C library like mistral, as it should be relatively easy to call into from Objective-C.
The SpeakHere sample code should get you started for audio recording and playback.
Also, it may well take longer for the user to train your app to recognise them than they would ever save compared to just picking their name from a list. Unless you're intending their voice to be some kind of security passport, it might just not be worth bothering with.
I'm developing a virtual instrument app for iOS and am trying to implement a recording function so that the app can record and playback the music the user makes with the instrument. I'm currently using the CocosDenshion sound engine (with a few of my own hacks involving fades etc) which is based on OpenAL. From my research on the net it seems I have two options:
Keep a record of the user's inputs (ie. which notes were played at what volume) so that the app can recreate the sound (but this cannot be shared/emailed).
Hack together my own low-level sound engine using Audio Units, specifically RemoteIO, so that I mix all the sounds manually and populate the final output buffer by hand, and can therefore save that buffer to a file. This could then be shared by email etc.
I have implemented a RemoteIO callback for rendering the output buffer in the hope that it would give me the previously played data in the buffer, but alas the buffer is always all zeros.
So my question is: is there an easier way to sniff/listen to what my app is sending to the speakers than my option 2 above?
Thanks in advance for your help!
I think you should use RemoteIO. I had a similar project several months ago and wanted to avoid RemoteIO and Audio Units as much as possible, but in the end, after I had written tons of code and read lots of documentation for third-party libraries (including CocosDenshion), I ended up using Audio Units anyway. What's more, they're not that hard to set up and work with. If you do look for a library to do most of the work for you, look for one written on top of Core Audio, not OpenAL.
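To illustrate what tapping your own output can look like when you stay on top of Core Audio: if (and only if) your playback is routed through AVAudioEngine rather than OpenAL, a sketch like the one below writes whatever the engine mixes to a file. It is not a drop-in answer for a CocosDenshion/OpenAL setup, just a picture of the approach; the function name and file path are placeholders.

```swift
import AVFoundation

// Sketch: record whatever an AVAudioEngine instance is mixing to its output.
// Assumes your sounds are already being played through `engine` (players,
// samplers, etc. attached to it) -- this does NOT capture OpenAL output.
func startCapturing(engine: AVAudioEngine) throws -> AVAudioFile {
    let mixer = engine.mainMixerNode
    let format = mixer.outputFormat(forBus: 0)

    // Placeholder destination file; CAF keeps things simple for PCM.
    let url = FileManager.default.temporaryDirectory.appendingPathComponent("mix.caf")
    let file = try AVAudioFile(forWriting: url, settings: format.settings)

    mixer.installTap(onBus: 0, bufferSize: 4096, format: format) { buffer, _ in
        try? file.write(from: buffer)   // append each rendered buffer to the file
    }
    return file
}

// Stop later with: engine.mainMixerNode.removeTap(onBus: 0)
```

With an OpenAL-based engine you are effectively back to your option 2: render through RemoteIO yourself and keep a copy of the buffers you hand to the output.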
You might want to take a look at the AudioCopy framework. It does a lot of what you seem to be looking for, and will save you from potentially reinventing some wheels.
I'm creating an admin tool for a project where I create an Event, then create multiple Speakers (on one page), then need to create multiple Talks for each Speaker.
Rather than have all the Speakers listed on one page after creation, and then put multiple Talks against each Speaker (which looks crazy due to all the input boxes), I'd like to gradually step through each Speaker, create the Talks for each Speaker, then move on to the next Speaker until all Speakers have been completed.
What's the best way to go about achieving this?
Do I need to create an array of all the created Speakers, then step through it somehow? Or set a flag on each created Speaker, so that once the user has clicked 'save talks' it finds the next speaker (in this event) that hasn't been saved?
I suggest reading this:
http://www.digitalmediaminute.com/article/1816/top-ruby-on-rails-tutorials
and afterwards:
http://www.sapphiresteel.com/How-To-Create-a-Ruby-On-Rails-Blog
After working through those you will be able to solve this problem in a "best-practice" way.
Since your question is a basic one, I would like to point you to those tutorials.
No offense... but I think this reading will help you more.
Further to my comment to bastianneu, I've spent about 10 minutes with AASM (http://github.com/rubyist/aasm) and have got it doing exactly what I needed.
Sometimes I guess you need to type out your question to properly clear it up in your brain :)