I am looking for a karaoke (MPEG) component for Delphi 7.
Do you mean a component that can play MPEG files, or do you want a special karaoke component that filters the voices from the music?
Have a look at Ultrastar deluxe, an open source Singstar clone based on Pascal/Delphi.
It now uses Free Pascal for portability, but afaik it originally used Delphi (and maybe still does for the win32 target).
If you are trying to filter the vocals from an MPEG clip, you are going to have a hard time. The issue is that you are trying to filter a varying set of frequencies out of the audio signal, and you have no idea in advance what they are going to be. The closest you may be able to get relies on the fact that some audio recordings deliberately record the voice track 90 degrees out of phase between the left and right channels, in which case you can 'cancel' the vocal track by combining the audio with the same signal 90 degrees out of phase, but I believe that MPEG compression will negate that anyway due to its spatial compression.
So no, I don't believe this can be done. You would be better off finding the instrumental soundtrack, combining it with the video clip, and playing that.
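A related and more commonly described form of phase cancellation subtracts one stereo channel from the other, which removes anything mixed identically into both channels (often the lead vocal). This is a different variant from the 90-degree scheme mentioned above, shown here only as a minimal sketch on made-up sample lists; the function name and the test signal are hypothetical, and real recordings rarely cancel this cleanly.

```python
def cancel_center(left, right):
    """Subtract right from left, sample by sample.

    Anything that is identical in both channels (a centered vocal, in this
    toy example) cancels out; the result is a single mono channel.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    # Halve the difference so the result stays within the input range.
    return [(l - r) / 2.0 for l, r in zip(left, right)]


# Hypothetical test signal: the vocal is centered, the guitar is left-only.
vocal = [0.5, -0.25, 0.75, 0.0]
guitar = [0.25, 0.25, -0.5, 0.5]
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)

karaoke = cancel_center(left, right)
print(karaoke)  # only the (halved) guitar remains
```

Anything else that sits in the center of the mix, such as bass or kick drum, is removed along with the vocal, which is one reason the result is usually unsatisfying.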
If you are simply trying to display text over a video clip (i.e. overlay) then you may want to look at:
Looking for a OSD component
If you also need to play video files in Delphi, you can use the built-in media player (TMediaPlayer) or another video component (such as TVideograbber http://www.datastead.com); the latter supports overlaying text on the video.
Related
I'm trying to put together an open source library that allows iOS devices to play files with unsupported containers, as long as the track formats/codecs are supported. e.g.: a Matroska video (MKV) file with an H264 video track and an AAC audio track. I'm making an app that surely could use that functionality and I bet there are many more out there that would benefit from it. Any help you can give (by commenting here or—even better— collaborating with me) is much appreciated. This is where I'm at so far:
I did a bit of research trying to find out how players like AVPlayerHD or Infuse can play non-standard containers and still have hardware acceleration. It seems like they transcode small chunks of the whole video file and play those in sequence instead.
It's a good solution. But if you want to throw that video to an Apple TV, things don't work as planned, since the video is actually a bunch of smaller chunks being played as a playlist. This site has way more info, but at its core streaming to Apple TV is essentially a progressive download of the MP4/M4V file being played.
I'm thinking a sort of streaming proxy is the way to go. For the playing side of things, I've been investigating AVSampleBufferDisplayLayer (more info here) as a way of playing the video track. I haven't gotten to audio yet. Things get interesting when you think about the AirPlay side of things: by having a "container proxy", we can make any file look like it has the right container without the file size implications of transcoding.
It seems like GStreamer might be a good starting point for the proxy. I need to read up on it; I've never used it before. Does this approach sound like a good one for a library that could be used for App Store apps?
Thanks!
I finally got some extra time to go over GStreamer, especially this article about how it has already been updated to use the hardware decoding provided by iOS 8. So there is no need to develop this; GStreamer seems to be the answer.
Thanks!
The 'chunked' solution is no longer necessary in iOS 8. You should simply set up a video decode session and pass in NALUs.
https://developer.apple.com/videos/wwdc/2014/#513
Statement of Problem:
I have a collection of sound effects in my app stored as .m4a files (AAC format, 48 kHz, 16-bit) that I want to play at a variety of speeds and pitches, without having to pre-generate all the variants as separate files.
Although the .rate property of an AVAudioPlayer object can alter playback speed, it always maintains the original pitch, which is not what I want. Instead, I simply want to play the sound sample faster or slower and have the pitch go up or down to match, just like speeding up or slowing down an old-fashioned reel-to-reel tape recorder. In other words, I need some way to essentially alter the audio sample rate by amounts like +2 semitones (12% faster), -5 semitones (25% slower), +12 semitones (2x faster), etc.
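For reference, the relationship between a semitone shift and the playback-rate multiplier in equal temperament is rate = 2^(semitones/12). The following is a small sketch of that arithmetic only; the function name is made up for illustration and nothing here touches an iOS API.

```python
def semitones_to_rate(semitones):
    """Playback-rate multiplier for a pitch shift in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


for n in (2, -5, 12):
    rate = semitones_to_rate(n)
    # 1 / rate is how much longer (or shorter) the clip lasts at that rate.
    print(f"{n:+d} semitones -> rate x{rate:.3f}, duration x{1.0 / rate:.3f}")
```

With this formula, +2 semitones is a rate of about 1.122, -5 semitones is about 0.749 (so the clip lasts roughly 33% longer), and +12 semitones is exactly 2.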
Question:
Is there some way to fetch the Linear PCM audio data from an AVAudioPlayer object, apply sample rate conversion using a different iOS framework, and stuff the resulting audio data into a new AVAudioPlayer object, which can then be played normally?
Possible avenues:
I was reading up on AudioConverterConvertComplexBuffer. In particular, kAudioConverterSampleRateConverterComplexity_Mastering, kAudioConverterQuality_Max, and AudioConverterFillComplexBuffer() caught my eye. So it looks possible with this audio conversion framework. Is this an avenue I should explore further?
Requirements:
I actually don't need playback to begin instantly. If sample rate conversion incurs a slight delay, that's fine. All of my samples are 4 seconds or less, so I would imagine that any on-the-fly resampling would occur quickly, on the order of 1/10 second or less. (More than 1/2 second would be too much, though.)
I'd really rather not get into heavyweight stuff like OpenAL or Core Audio if there is a simpler way to do this using a conversion framework provided by iOS. However, if there is a simple solution to this problem using OpenAL or Core Audio, I'd be happy to consider that. By "simple" I mean something that can be implemented in 50–100 lines of code and doesn't require starting up additional threads to feed data to a sound device. I'd rather just have everything taken care of automatically, which is why I'm willing to convert the audio clip prior to playing.
I want to avoid any third-party libraries here, because this isn't rocket science and I know it must be possible with native iOS frameworks somehow.
Again, I need to adjust the pitch and playback rate together, not separately. So if playback is slowed down 2x, a human voice would become very deep and slow-spoken. And if playback is sped up 2–3x, a human voice would sound like a fast-talking chipmunk. In other words, I absolutely do not want to alter the pitch while keeping the audio duration the same, because that operation results in an undesirably "tinny" sound when bending the pitch upward more than a couple semitones. I just want to speed the whole thing up and have the pitch go up as a natural side-effect, just like old-fashioned tape recorders used to do.
Needs to work in iOS 6 and up, although iOS 5 support would be a nice bonus.
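The tape-style behaviour described in these requirements amounts to resampling the PCM data and then playing the result at the original sample rate. The sketch below illustrates that idea on a plain list of samples using linear interpolation; the function name is hypothetical, and linear interpolation is a crude stand-in for the filtered conversion a real sample rate converter performs (it will alias when speeding up).

```python
def varispeed(samples, rate):
    """Resample `samples` so that playing the result at the original sample
    rate sounds `rate` times faster, with pitch rising or falling to match.

    Uses linear interpolation between neighbouring samples.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not samples:
        return []
    out = []
    last = len(samples) - 1
    pos = 0.0  # fractional read position in the source
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += rate  # step further per output sample = faster and higher
    return out


ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
print(varispeed(ramp, 2.0))             # half as many samples: one octave up
print(varispeed([0.0, 2.0, 4.0], 0.5))  # about twice as many: one octave down
```

For clips of 4 seconds or less this kind of offline conversion is a small amount of work, which fits the "convert before playing" approach described above.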
The forum link Jack Wu mentions has one suggestion, which involves overriding the AIFF header data directly. This may work, but you will need to have AIFF files since it relies on a specific range of the AIFF header to write into. This also needs to be done before you create the AVAudioPlayer, which means that you can't modify the pitch once it is running.
If you are willing to go the AudioUnits route, a complete simple solution is probably ~200 lines (note that this assumes the code style in which one function call takes up to 7 lines, with one parameter on each line). There is a Varispeed AudioUnit, which does exactly what you want by locking pitch to rate. You would basically need to look at the API, docs and some sample AudioUnit code to get familiar and then:
create/init the audio graph and stream format (~100 lines)
create and add to the graph a RemoteIO AudioUnit (kAudioUnitSubType_RemoteIO) (this outputs to the speaker)
create and add a varispeed unit, and connect the output of the varispeed unit (kAudioUnitSubType_Varispeed) to the input of the RemoteIO Unit
create and add to the graph an AudioFilePlayer (kAudioUnitSubType_AudioFilePlayer) unit to read the file and connect it to the varispeed unit
start the graph to begin playback
when you want to change the pitch, do it via AudioUnitSetParameter, and the pitch and playback rate change will take effect while playing
Note that there is a TimePitch audio unit which allows independent control of pitch and rate, as well.
For iOS 7, you'd want to look at AVPlayerItem's time-pitch algorithm property (audioTimePitchAlgorithm) and the value AVAudioTimePitchAlgorithmVarispeed. Unfortunately this feature is not available on earlier systems.
After finally finding a way to concatenate multiple voice files into one single audio file on the iPhone, I am now trying to superimpose an audio file over the length of the voice file.
So basically I have two .m4a files:
voice.m4a which is about 10 seconds for example.
music.m4a which is about 5 seconds.
What I require is that the two files be combined in such a manner that the resulting single audio file contains the music in the background of the voice file for its whole length. So basically the resulting output should have the 10 seconds of voice and the 5 seconds of music repeated twice. It is absolutely important to have a single file that contains all of this.
I am trying to get all of this done in an application on the iPhone.
Can anyone please help me out with this?
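Independent of which iOS API ends up doing the work, the mixing described in this question boils down to adding the two decoded PCM streams sample by sample while wrapping around in the shorter one. The following is a minimal sketch on plain lists of float samples in the range -1.0 to 1.0; the function name and the music_gain parameter are assumptions made for illustration, and decoding and re-encoding the .m4a files is not covered.

```python
def mix_with_looped_background(voice, music, music_gain=0.5):
    """Mix `music` under `voice`, repeating the music until the voice ends.

    Both inputs are lists of float samples in [-1.0, 1.0] at the same
    sample rate. The result has the length of `voice`.
    """
    if not music:
        raise ValueError("music must contain at least one sample")
    mixed = []
    for i, v in enumerate(voice):
        m = music[i % len(music)] * music_gain  # wrap around to loop the music
        mixed.append(max(-1.0, min(1.0, v + m)))  # clamp to avoid overflow
    return mixed


voice = [0.5, 0.5, 0.5, 0.5]  # stands in for 10 seconds of voice
music = [0.5, -0.5]           # stands in for 5 seconds of music
print(mix_with_looped_background(voice, music))
```

Here the music is half as long as the voice, so it plays exactly twice, matching the 10-second and 5-second example above.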
If you are looking to do that programmatically, you will need to go deeper down into CoreAudio. For a simpler solution you could use AudioQueues, or for more fine-grained control, AudioUnits and an AUGraph. The MultiChannelMixer is the Audio Unit you are looking for. Unfortunately there is no space for an elaborate tutorial here (it would take a couple of days to write just the tutorial itself), but I hope this points you in the right direction.
If you decide to go down that path and want to do more audio programming than this one-time simple example, then I strongly suggest you buy "Learning Core Audio: A Hands-On Guide to Audio Programming for Mac and iOS" by Chris Adamson and Kevin Avila. You can find it on Amazon, in paperback or Kindle.
I have a DirectShow application written in Delphi 6 using the DSPACK component library. I want to be able to mix together audio coming from the output pins from multiple Capture Filters that are set to the exact same media format. Is there an open source or "sdk sample" filter that does this?
I know that intelligent mixing is a big deal and that I'd most likely have to buy a commercial library to do that. But all I need is a DirectShow filter that can accept wave audio input from multiple output pins and does a straight addition of the samples received. I know there are Tee filters for splitting a single stream into multiple streams (one-to-many), but I need something that does the opposite (many-to-one), preferably with format checking on each input connection attempt so that any attempt to attach an output pin with a different media format than the ones already added is thwarted with an error. Is there anything out there?
I'm not sure there is anything available out of the box; if there is, it would definitely be a third-party component.
Creating this custom filter yourself for a specific need is not very complex (it is not rocket science). You basically need to have all input audio converted to the same PCM format, match the timestamps, add the data, and then deliver it via the output pin.
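The core of such a many-to-one filter, format checking on connect plus straight addition of samples, can be sketched outside DirectShow as follows. This is only an illustration of the logic under stated assumptions: the class and method names are hypothetical, the format is reduced to a (sample rate, channels, bits) tuple, samples are 16-bit signed integers, and timestamp matching is left out by assuming the buffers passed to mix() are already aligned.

```python
class AdditiveMixer:
    """Many-to-one mixer: every input must share the first input's format."""

    INT16_MIN, INT16_MAX = -32768, 32767

    def __init__(self):
        self.format = None  # (sample_rate, channels, bits) of the first input
        self.inputs = []

    def connect(self, name, fmt):
        """Accept an input, or raise if its format differs from earlier ones."""
        if self.format is None:
            self.format = fmt
        elif fmt != self.format:
            raise ValueError(f"{name}: format {fmt} does not match {self.format}")
        self.inputs.append(name)

    def mix(self, buffers):
        """Add aligned, equal-length 16-bit buffers, saturating on overflow."""
        if len({len(b) for b in buffers}) > 1:
            raise ValueError("buffers must have the same length")
        return [
            max(self.INT16_MIN, min(self.INT16_MAX, sum(samples)))
            for samples in zip(*buffers)
        ]


mixer = AdditiveMixer()
mixer.connect("capture A", (44100, 2, 16))
mixer.connect("capture B", (44100, 2, 16))
print(mixer.mix([[1000, 30000], [2000, 10000]]))  # second sample saturates
```

Saturating at the 16-bit limits is one choice for handling overflow; scaling each input down before adding is another, at the cost of overall loudness.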
I'm developing a virtual instrument app for iOS and am trying to implement a recording function so that the app can record and play back the music the user makes with the instrument. I'm currently using the CocosDenshion sound engine (with a few of my own hacks involving fades etc.), which is based on OpenAL. From my research on the net it seems I have two options:
Keep a record of the user's inputs (ie. which notes were played at what volume) so that the app can recreate the sound (but this cannot be shared/emailed).
Hack my own low-level sound engine using AudioUnits & specifically RemoteIO so that I manually mix all the sounds and populate the final output buffer by hand and hence can save said buffer to a file. This will be able to be shared by email etc.
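Option 1 above can be sketched as a simple event log: each user input is stored with a timestamp, and playback walks the log and re-triggers the notes. The names below (NoteEvent, PerformanceLog, play_note) are hypothetical and not part of CocosDenshion or any iOS framework; the sketch only shows the data structure.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class NoteEvent:
    time: float    # seconds since the recording started
    note: int      # which note was played (hypothetical numbering)
    volume: float  # 0.0 to 1.0


class PerformanceLog:
    """Records user inputs so the sound can be recreated later."""

    def __init__(self):
        self.events = []

    def record(self, time, note, volume):
        self.events.append(NoteEvent(time, note, volume))

    def replay(self, play_note):
        """Call play_note(time, note, volume) for each event in time order."""
        for e in sorted(self.events, key=lambda e: e.time):
            play_note(e.time, e.note, e.volume)

    def to_json(self):
        """Serialize the log so it can be stored between sessions."""
        return json.dumps([asdict(e) for e in self.events])


log = PerformanceLog()
log.record(0.5, 62, 0.5)
log.record(0.0, 60, 0.8)

played = []
log.replay(lambda t, n, v: played.append((t, n, v)))
print(played)
```

As the question notes, a log like this only recreates the sound inside the app; it does not produce an audio file that other people could play.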
I have implemented a RemoteIO callback for rendering the output buffer in the hope that it would give me previously played data in the buffer, but alas the buffer is always all zeros.
So my question is: is there an easier way to sniff/listen to what my app is sending to the speakers than my option 2 above?
Thanks in advance for your help!
I think you should use RemoteIO. I had a similar project several months ago and wanted to avoid RemoteIO and audio units as much as possible, but in the end, after I wrote tons of code and read lots of documentation for third-party libraries (including CocosDenshion), I ended up using audio units anyway. More than that, it's not that hard to set up and work with. If, however, you are looking for a library to do most of the work for you, you should look for one written on top of Core Audio, not OpenAL.
You might want to take a look at the AudioCopy framework. It does a lot of what you seem to be looking for, and will save you from potentially reinventing some wheels.