How to equalize all iOS audio streams?

I decided to write an equalizer for iOS that would allow changing the level of audio frequencies to improve the audibility of sound for people with hearing problems. For example, my left ear cannot hear high frequencies, and I would like to be able to boost the high frequencies in all applications (Skype, YouTube, etc.), including voice calls over the cellular connection. How could this be implemented?
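For the in-app part, AVAudioEngine with AVAudioUnitEQ can boost a frequency range; as far as public iOS APIs go, a third-party app cannot process audio produced by other apps or carried over cellular calls, so a sketch like the one below only applies to audio your own app plays. The file name, shelf frequency and gain are placeholder values.

import AVFoundation

// Minimal in-app equalizer sketch: route playback through a high-shelf
// EQ band that boosts the high frequencies.
// "speech.m4a", 4000 Hz and +12 dB are placeholder values.
let engine = AVAudioEngine()
let player = AVAudioPlayerNode()
let eq = AVAudioUnitEQ(numberOfBands: 1)

let band = eq.bands[0]
band.filterType = .highShelf
band.frequency = 4000    // shelf corner frequency in Hz
band.gain = 12           // boost in dB
band.bypass = false

engine.attach(player)
engine.attach(eq)

do {
    let file = try AVAudioFile(forReading: URL(fileURLWithPath: "speech.m4a"))
    engine.connect(player, to: eq, format: file.processingFormat)
    engine.connect(eq, to: engine.mainMixerNode, format: file.processingFormat)
    try engine.start()
    player.scheduleFile(file, at: nil)
    player.play()
} catch {
    print("Audio setup failed: \(error)")
}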

Related

bitrate quality level for specific device

I'm looking at the MediaConvert service from AWS to transcode videos. The value I'm trying to set right now is the quality level (QL) for QVBR; according to this it can depend on the platform, for example for 720p/1080p resolution it proposes QL=8/9 (for TV), QL=7 (for tablet), and QL=6 (for smartphone).
In fact, the app has a version for all three types of devices, so I'm asking: do I need to keep three versions of the same video? I want to save some money on streaming; my app has a similar number of users on each platform, and I want to save bandwidth while still providing good-quality videos.
Higher QVBR quality levels (QL) correspond to higher bitrates in the output.
For a large display such as a TV, a higher QVBR QL is recommended to help improve the viewer experience. But when viewing the same content on a smaller display, such as a phone, you may not need all of those extra bits to still have a good experience.
In general, it's recommended to create an output targeted for each of the various devices or resolutions content will be viewed on. This will help save bandwidth for the smaller devices while still delivering high quality for the larger ones.
This concept is referred to as Adaptive Bitrate (ABR) Streaming, and is a common feature of streaming formats such as HLS and DASH (among others). The MediaConvert documentation has a section on how to create ABR outputs as well: https://docs.aws.amazon.com/mediaconvert/latest/ug/video-abr-streaming-outputs.html
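On the playback side, a single ABR package also keeps the client simple: given one HLS master playlist with several renditions, AVPlayer selects a variant for the current device and network on its own. The sketch below assumes the MediaConvert output is packaged as HLS at a placeholder URL; preferredPeakBitRate and preferredMaximumResolution are optional caps, and the specific numbers are only examples.

import AVFoundation
import CoreGraphics

// One HLS master playlist (a single ABR output) instead of three separate
// videos. AVPlayer picks a rendition automatically; the caps below keep a
// phone from pulling the TV-sized rendition. URL and numbers are placeholders.
let url = URL(string: "https://example.com/video/master.m3u8")!
let item = AVPlayerItem(url: url)
item.preferredPeakBitRate = 3_000_000                                // bits per second
item.preferredMaximumResolution = CGSize(width: 1280, height: 720)   // e.g. for phones

let player = AVPlayer(playerItem: item)
player.play()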

The sound quality of slow playback using AVPlayer is not good enough even when using AVAudioTimePitchAlgorithmSpectral

In iOS, playback rate can be changed by setting AVPlayer.rate.
When AVPlayer.rate is set to 0.5, the playback becomes slow.
By default, the sound quality of the playback at 0.5 playback rate is terrible.
To increase the quality, you need to set AVPlayerItem.audioTimePitchAlgorithm.
According to the API documentation, setting AVPlayerItem.audioTimePitchAlgorithm to AVAudioTimePitchAlgorithmSpectral makes the quality the highest.
The Swift code is:
playerItem.audioTimePitchAlgorithm = AVAudioTimePitchAlgorithm.spectral // AVAudioTimePitchAlgorithmSpectral
AVAudioTimePitchAlgorithmSpectral improves the quality over the default.
But the sound quality with AVAudioTimePitchAlgorithmSpectral is still not good enough.
The sound still echoes, and it is stressful to listen to.
In Apple's Podcasts app, when I set the playback speed to 1/2, the playback becomes slow and the sound quality is very high, with no echo at all.
I want my app to provide the same quality as Apple's Podcasts app.
Are there iOS APIs that increase sound quality beyond AVAudioTimePitchAlgorithmSpectral?
If not, why doesn't Apple provide one, even though they use it in their own Podcasts app?
Or should I use a third-party library?
Are there good libraries, free or inexpensive, that many people use to change playback speed?
For the last 3 weeks I've been searching, trying to learn AudioKit and Audio Units, and even considering purchasing a third-party time-stretch audio processing library to fix the quality issue of slow playback.
Now finally I found a super easy solution.
AVPlayer can slow down audio with very good quality by setting AVPlayerItem.audioTimePitchAlgorithm
to AVAudioTimePitchAlgorithm.timeDomain instead of AVAudioTimePitchAlgorithm.spectral.
The documentation says:
timeDomain is a modest quality pitch algorithm that is less computationally intensive. Suitable for voice.
This means spectral is suitable for music. timeDomain is suitable for voice.
That's why the voice files my app uses were echoed.
And that's why the slowed-down audio in Apple's Podcasts app sounds so good: it must also use this time-domain algorithm.
And that's why AudioKit, which seems to be developed for music use, plays voice audio with poor quality.
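For reference, a minimal sketch of the fix described above (the file path is a placeholder):

import AVFoundation

// Use the time-domain algorithm for speech before slowing playback down.
let item = AVPlayerItem(url: URL(fileURLWithPath: "podcast.m4a"))
item.audioTimePitchAlgorithm = .timeDomain   // .spectral suits music, not voice

let player = AVPlayer(playerItem: item)
player.rate = 0.5   // a non-zero rate starts playback at that speed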
I've encountered the same issues with increasing/decreasing speed while maintaining some level of quality. I couldn't get it to work well using Apple's APIs.
In the end I found that it's worth taking a look at this excellent 3rd party framework:
https://github.com/AudioKit/AudioKit
which allows you to do that and much more, in a straightforward manner.
Hope this helps

How to amplify the voice recorded from far distance

When a person speaks far away from a mobile, the recorded voice is quiet.
When a person speaks near a mobile, the recorded voice is loud. What I want is to play back the human voice at equal volume no matter how far away (not infinitely far) the speaker is from the phone when the voice is recorded.
What I have already tried:
- Adjusting the volume based on the dB level (e.g. via AVAudioPlayer). But the problem is that the dB level includes all of the environmental sound, so it only works when the human voice varies heavily.
- Then I thought I should find a way to sample the intensity of the human voice in the media, which led me to voice recognition. But this is a huge topic, and I cannot narrow down which areas could solve my problem.
The voice recorded from a distance suffers from significant corruption. One problem is noise, another is echo. To amplify it you need to clean the voice of echo and noise. Ideally you would do that with a better microphone, but if only a single microphone is available you have to apply signal processing. The signal processing algorithms you are interested in are:
- Noise cancellation. You can find many samples on Google, from simple to very advanced ones.
- Echo cancellation. Again, you can find many implementations.
There is no ready-made library to do the above; you will have to implement a large part yourself. You can look at the WebRTC code, which has both noise and echo cancellation, as described in this question:
Is it possible to reduce background noise while streaming audio on the iPhone?
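As a very rough illustration of the "equal volume" part only, and not of the noise or echo cancellation described above, here is a sketch that applies a single gain correction to a recorded file based on its RMS level. The file handling uses standard AVAudioFile APIs; the target level, the clamping and the silence threshold are arbitrary placeholder choices.

import AVFoundation

// Naive gain normalization: read the file, estimate its RMS loudness,
// scale toward a target level, and write the result out.
// Illustrative only; real far-field capture needs noise/echo processing.
func normalize(inputURL: URL, outputURL: URL, targetRMS: Float = 0.1) throws {
    let inFile = try AVAudioFile(forReading: inputURL)
    let format = inFile.processingFormat
    let frames = AVAudioFrameCount(inFile.length)
    guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frames) else { return }
    try inFile.read(into: buffer)

    guard let channels = buffer.floatChannelData else { return }
    let n = Int(buffer.frameLength)
    let channelCount = Int(format.channelCount)

    // Root-mean-square over all channels as a rough loudness estimate.
    var sum: Float = 0
    for c in 0..<channelCount {
        for i in 0..<n { sum += channels[c][i] * channels[c][i] }
    }
    let rms = (sum / Float(n * channelCount)).squareRoot()
    guard rms > 0.001 else { return }               // skip near-silent input

    // Scale toward the target, clamped so quiet noise is not boosted wildly.
    let gain = min(targetRMS / rms, 8.0)
    for c in 0..<channelCount {
        for i in 0..<n { channels[c][i] = max(-1.0, min(1.0, channels[c][i] * gain)) }
    }

    let outFile = try AVAudioFile(forWriting: outputURL, settings: inFile.fileFormat.settings)
    try outFile.write(from: buffer)
}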

How can I apply guitar effects such as dive (pitch shift) or wah-wah (compression) to guitar audio samples played on an iOS app?

I am building an iOS app that allows the user to play guitar sounds - e.g. plucking or strumming.
I'd like to allow the user to apply pitch shifting or wah-wah (compression) on the guitar sound being played.
Currently, I am using audio samples of the guitar sound.
I've done some basic reading on DSP and audio synthesis, but I'm no expert in it. I saw libraries such as csound and stk, and it appears that the sounds they produce are synthesized (i.e. not played from audio samples). I am not sure how to apply them, or whether I can use them to apply effects such as pitch shifting or wah-wah to audio samples.
Can someone point me in the right direction for this?
You can use open-source audio processing libraries. Essentially, you are getting audio samples in and you need to process them and send them out as samples. The processing can be done by these libraries, or you can write your own. Here's one DSP library (Disclaimer: I wrote this). Look at the process(float,float) method for any of the classes to see how one does this.
Wah-wah and compression are two completely different effects. Wah-wah is a lowpass filter whose center frequency varies slowly, whereas compression is a method of evening out the volume. The above library has a Compressor class that you can check out.
The STK does have effects classes as well, not just synthesis classes (JCRev is one, for reverb), but I would highly recommend staying away from it, as it is really hard to compile and maintain.
If you haven't seen it already, check out Julius Smith's excellent and comprehensive book, Physical Audio Signal Processing.
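As a small illustration of the "samples in, processed samples out" idea using only built-in iOS nodes (not the library or the STK mentioned above), here is a sketch that plays a prerecorded guitar sample through a pitch-shift effect. The file name and pitch amount are placeholders, and a real "dive" would sweep the pitch over time rather than set it once.

import AVFoundation

// Play a prerecorded guitar sample through AVAudioUnitTimePitch.
// "pluck.wav" is a placeholder; pitch is specified in cents.
let engine = AVAudioEngine()
let player = AVAudioPlayerNode()
let pitch = AVAudioUnitTimePitch()
pitch.pitch = -200            // two semitones down

engine.attach(player)
engine.attach(pitch)

do {
    let file = try AVAudioFile(forReading: URL(fileURLWithPath: "pluck.wav"))
    engine.connect(player, to: pitch, format: file.processingFormat)
    engine.connect(pitch, to: engine.mainMixerNode, format: file.processingFormat)
    try engine.start()
    player.scheduleFile(file, at: nil)
    player.play()
} catch {
    print("Audio setup failed: \(error)")
}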

Analyze voice patterns iOS

I am looking for a way / library to analyze voice patterns. Say, there are 6 people in the room. I want to identify each one by voice.
Any hints are much appreciated.
Dmitry
The task of taking a long contiguous audio recording and splitting it up into chunks in which only one speaker is speaking, without any prior knowledge of the voice characteristics of each speaker, is called "speaker diarization". You can find links to research code on the Wikipedia page.
If you have prior recordings of each voice and would rather do classification, this is a slightly different problem (speaker recognition or speaker identification). Software tools for that are available here (note that general-purpose speech recognition packages like Sphinx or HTK are flexible enough to be coaxed into doing that).
Answered here https://dsp.stackexchange.com/questions/3119/library-to-differentiate-people-by-their-voice-timbre
