Sound Synchronization Issues - signal-processing

We are developing a project on sound source localization using LabVIEW. We are still at an initial stage and plan to perform everything in software, with four microphones connected to a PC (later on we may move to NI hardware if possible).
Initially we are acquiring sound from four different microphones connected to the computer through USB. All of the microphones pick up the same sound source with some delay (milliseconds) because of their different positions. However, the sound data acquired over USB cannot be written to the sound card simultaneously: the data incurs some hold time while being written to the sound card, and we end up with delayed samples when trying to synchronize all of the streams. Is there any way to reduce the hold time when writing the data to the sound card?
Suppose the hold time is 10 ms; we want to reduce it to microseconds or nanoseconds.

Reducing the hold time, as well as achieving precise inter-channel synchronisation, is not possible with LabVIEW running under Windows and regular sound acquisition hardware. Internal software delays comparable to the scheduler time slice (~10 ms) are to be expected.
You need at least dedicated acquisition hardware (not a number of USB sound cards), and, if you would like precise synchronisation of your output with the input with minimum jitter, you need NI-FPGA. Given these requirements, I would look at the R-series.

Related

Battery impact of keeping microphone on while in-app

I've been looking around for a while but couldn't find much about this. I have an AudioComponentInstance that I use to continuously record the user while in-app. This doesn't get written to a file, but I do some light processing in a recording callback. This light processing is basically an offline, lightweight voice activity detection system on every 100ms of audio data.
So essentially what I have is like the Hey Siri feature. While in-app, the microphone is always on. It waits for the user to start talking, and once the lightweight recognizer detects speech, other stuff happens.
I know this can be very battery efficient because Hey Siri is a system-wide feature. But at the same time, I have no clear idea of the impact on battery life. I only have anecdotal data – for example, the Sleep Cycle app uses 30% of battery if your phone is not charging while you sleep. So in that case, 30% battery for 8 hours of mic use. But that might be high because they're constantly doing some sort of sleep processing?
Is there a way to use Instruments or something to do an isolated battery test, or someone who has a better understanding of the microphone's impact on battery life? Thanks!
In your case, using "Hey Siri" as a comparison is not accurate because this feature relies on a dedicated SoC, specifically to optimize power usage. In your scenario, you have no choice but to consume CPU resources which will result in a higher power draw.
While further testing would be required, my assumption is that your power usage will be no better than an app in idle state at best (YMMV based on what else your app is doing).
https://machinelearning.apple.com/2017/10/01/hey-siri.html
To avoid running the main processor all day just to listen for the trigger phrase, the iPhone's Always On Processor (AOP) (a small, low-power auxiliary processor, that is, the embedded Motion Coprocessor) has access to the microphone signal (on 6S and later). We use a small proportion of the AOP's limited processing power to run a detector with a small version of the acoustic model (DNN).
The acoustic model it refers to is one built for the trigger phrase "Hey Siri", which it has been highly optimized to detect, again circling back to power and performance considerations.

How to ensure audio rendered within time limit on iOS?

I am rendering low-latency audio from my custom synth code via the iOS Audio Unit render callback. Obviously if my rendering code is too slow then it will return from the callback too late and there will be a buffer underrun. I know from experience this results in silence being output.
I want to know what time limit I have, so that I can manage the level of processing to match the device limitations etc.
Obviously the length of the buffer (in samples) determines the duration of audio being rendered and this sets an overall limit. However I suspect that the Apple audio engine will have a smaller time limit between issuing the render callback and requiring the response.
How can I find out this time limit and is that something I can do within the callback function itself?
If I happen to exceed the time limit and cause a buffer underrun, is there a notification I can receive or a status object I can interrogate?
NB: In my app I am creating a single 'output' audio unit, so I don't need to worry about chaining audio units together.
The amount of audio rendering that can be done in Audio Unit callbacks depends on the iOS device model and OS version, as well as potential CPU clock speed throttling due to temperature or background modes. Thus, it needs to be profiled on the oldest, slowest iOS device you plan on your app supporting, with some margin.
To support iOS 9, I very conservatively profile my apps on an iPhone 4S test device (ARM Cortex A9 CPU at 800 MHz), or an even older, slower device by using an earlier iOS version. When doing this profiling, one can add some percentage of "make work" to the audio callback and see if there is any margin (for a 50% margin, generate the sample buffer twice, etc.). Other developers appear to be less conservative.
This is why it is important for a mobile audio developer to have (or have access to) several iOS devices, the older the better. If the callback meets the time limit on an old, slow test device, it will very likely be more than fast enough on any newer iOS device.
Depending on the OS version, an underrun can either result in silence, or the Audio Unit stopping or crashing (which can be detected by no more or not enough callbacks within some predictable amount of time).
But the best way to avoid underrun is to do most of the heavy audio work in another thread outside the audio unit thread, and pass samples to/from the audio unit callback using a lock-free circular fifo/queue.
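For reference, here is a minimal sketch of such a single-producer/single-consumer lock-free FIFO (the class name, element type, and capacity handling are illustrative assumptions, not code from this answer):

#include <atomic>
#include <cstddef>

// Single-producer/single-consumer lock-free FIFO: the synth/worker thread
// writes samples, the audio render callback reads them, and neither side
// ever takes a lock, so the real-time thread cannot be blocked.
class SampleFifo {
public:
    explicit SampleFifo(size_t capacity)
        : buf_(new float[capacity]), cap_(capacity), head_(0), tail_(0) {}
    ~SampleFifo() { delete[] buf_; }

    // Producer (worker thread). Returns how many samples were actually queued.
    size_t write(const float* src, size_t n) {
        const size_t head = head_.load(std::memory_order_relaxed);
        const size_t tail = tail_.load(std::memory_order_acquire);
        const size_t space = cap_ - (head - tail);
        if (n > space) n = space;                // drop what does not fit
        for (size_t i = 0; i < n; ++i) buf_[(head + i) % cap_] = src[i];
        head_.store(head + n, std::memory_order_release);
        return n;
    }

    // Consumer (render callback). Returns how many samples were copied out.
    size_t read(float* dst, size_t n) {
        const size_t tail = tail_.load(std::memory_order_relaxed);
        const size_t head = head_.load(std::memory_order_acquire);
        const size_t avail = head - tail;
        if (n > avail) n = avail;                // caller zero-fills the rest
        for (size_t i = 0; i < n; ++i) dst[i] = buf_[(tail + i) % cap_];
        tail_.store(tail + n, std::memory_order_release);
        return n;
    }

private:
    float* buf_;
    const size_t cap_;
    std::atomic<size_t> head_;   // total samples ever written
    std::atomic<size_t> tail_;   // total samples ever read
};

The worker thread calls write() and the render callback calls read(); if read() returns fewer samples than requested, the callback should zero-fill the remainder rather than block.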
Adding to what hotpaw2 said, the worst-performing iOS device I have encountered is the iPod touch 16GB without the rear-facing camera. I have done projects where every device except that iPod touch 16GB plays audio smoothly; on it I had to bump up the buffer duration to the next size to accommodate.
I typically do all of the audio preparation before the render callback, in a separate thread feeding a lockless ring buffer, and keep the render callback limited to copying data. I let the application "deal" with buffer underruns.
I personally have never measured the render callback variance, but I would guess that the interval between callbacks is consistently equal to the buffer duration, with extremely minimal jitter (e.g. 5 ms every time). I doubt it would be 4.9 ms one time and then 5.1 ms the next.
To get timing information, you can use mach_absolute_time() from mach/mach_time.h.
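For example, here is a rough sketch of timing the render callback against its budget (the 44.1 kHz sample rate and the callback body are assumptions; the signature is the standard AURenderCallback one):

#include <AudioToolbox/AudioToolbox.h>
#include <mach/mach_time.h>

// Computed once at startup (before audio starts) so the callback never has to
// query the timebase itself.
static double gTicksToSeconds = 0.0;

static void InitTimebase(void) {
    mach_timebase_info_data_t info;
    mach_timebase_info(&info);
    gTicksToSeconds = (double)info.numer / (double)info.denom / 1e9;
}

static OSStatus RenderCallback(void *inRefCon,
                               AudioUnitRenderActionFlags *ioActionFlags,
                               const AudioTimeStamp *inTimeStamp,
                               UInt32 inBusNumber,
                               UInt32 inNumberFrames,
                               AudioBufferList *ioData) {
    const uint64_t start = mach_absolute_time();

    // ... render inNumberFrames samples into ioData here ...

    const double elapsed = (mach_absolute_time() - start) * gTicksToSeconds;
    const double budget  = (double)inNumberFrames / 44100.0;  // assumed sample rate
    if (elapsed > 0.75 * budget) {
        // Using more than ~75% of the budget: record it (e.g. set an atomic
        // flag that a non-real-time thread logs later; avoid printf or locks
        // inside the callback).
    }
    return noErr;
}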
You didn't really say what your timing requirements are. I assume you need low-latency audio; otherwise, you can just set the buffer duration to be pretty big. I also assume that you want to increase latency for slow devices, using code like this. I usually find what works on an iPod touch 16GB and use that as a worst case.
NSTimeInterval _preferredDuration = ...;   // e.g. a larger value on slower devices
NSError *err = nil;
[[AVAudioSession sharedInstance] setPreferredIOBufferDuration:_preferredDuration error:&err];
And of course, you should read back the actual duration in use. The OS will pick a power-of-two buffer size (in frames) based on the sample rate:
NSTimeInterval _actualBufferDuration;
_actualBufferDuration = [[AVAudioSession sharedInstance] IOBufferDuration];
As far as adjusting for device performance goes, you can set the buffer duration larger on slower devices.
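As a rough worked example (following the power-of-two behaviour described above): requesting a preferred duration of 0.005 s at 44.1 kHz corresponds to about 220 frames, so the OS would typically round up to 256 frames, giving an actual IOBufferDuration of roughly 256 / 44100 ≈ 5.8 ms.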

Creating synchronized stereo videos using webcams

I am using OpenCV to capture video streams from two USB webcams (Microsoft LifeCam Studio) in Ubuntu 14.04. I am using very simple VideoCapture code (source here) and am trying to at least view two videos that are synchronized against each other.
I used Android stopwatch apps (UltraChron Stopwatch Lite and Stopwatch Timer) on my Samsung Galaxy S3 mini and realized that the viewed images are out of sync (they show different times on the stopwatch).
The frames are in sync maybe 50% of the time. The frame time differences I get range from 0 to about 300 ms, with an average of about 120 ms. The amount of timeout used seems to have very little effect on sync (it is the same for 1000 ms or 2000 ms). I tried to minimize the timeout (waitKey(1), for the OpenCV loop to work at all) and only read every Xth iteration of the loop - this gave worse results than waitKey(1000). I run at Full HD, but lowering the resolution to 640x480 had no effect.
An ideal result would be a 100% synchronized stereo video stream at X FPS. As I said, so far I use OpenCV only to view the video frames, but I do not mind using anything else to get the desired result (it can be on Windows too).
Thanks for help in advance!
EDIT: In my search for low-cost hardware I found that it is probably possible to do some commodity hardware hacking (link here) and inject a single clock signal into multiple camera modules simultaneously to get the desired sync. The person who did that seems to have developed his own GENLOCKed camera board (called NerdCam1) and even a synced stereo camera board that he now sells for about €200.
However, I have almost zero hardware-hacking ability. Also, I am not sure whether such clock injection is possible for resolutions above the NTSC/PAL standard (it seems to be an "analog" solution). And I would prefer a variable-baseline option where both cameras are not soldered onto a single board.
It is not possible to precisely stereo-sync two common webcams, because webcams lack the external trigger feature that lets you sync multiple cameras to a common trigger signal. Such triggering can be done in software or in hardware, but the latter gives better precision. Webcams only support a "free-running" mode: they stream at whatever FPS they support, but you cannot influence when exactly the frame integration/exposure is done.
There are USB cameras with a dedicated external trigger feature (usually scientific cameras such as Point Grey) - they are more expensive (starting at about $300 apiece) than webcams, but they can be synced. If you really are on a low budget, you can try to hack the PS3 Eye camera to get the external trigger feature.
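If best-effort software sync is acceptable, one common OpenCV trick is to call grab() on both cameras back to back and only then decode with retrieve(), which reduces (but cannot eliminate) the capture skew. A minimal sketch, with assumed device indices, resolution, and OpenCV 3+ constant names:

#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    cv::VideoCapture cam0(0), cam1(1);                // assumed device indices
    if (!cam0.isOpened() || !cam1.isOpened()) {
        std::cerr << "Could not open both cameras\n";
        return 1;
    }
    // A lower resolution reduces USB bandwidth contention between the cameras.
    for (cv::VideoCapture* c : {&cam0, &cam1}) {
        c->set(cv::CAP_PROP_FRAME_WIDTH, 640);
        c->set(cv::CAP_PROP_FRAME_HEIGHT, 480);
    }

    cv::Mat f0, f1;
    while (true) {
        // grab() only latches the next frame; doing both grabs before any
        // decoding keeps the two capture instants as close together as the
        // free-running cameras allow.
        cam0.grab();
        cam1.grab();
        cam0.retrieve(f0);
        cam1.retrieve(f1);
        if (f0.empty() || f1.empty()) continue;

        cv::imshow("cam0", f0);
        cv::imshow("cam1", f1);
        if (cv::waitKey(1) == 27) break;              // Esc to quit
    }
    return 0;
}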

Possible to stream video over 115kbps?

I need some advice from people experienced with streaming video.
I have a task to put together a system that allows video coming from RS-170 (composite) video cameras and have them displayed on an iPad. The catch is that no wireless (no Wi-Fi, no bluetooth) is allowed. Only a wired interface.
The physical I/O options on an iPad are apparently extremely limited, but I did manage to come across a company named Redpark that makes an RS232-to-Lightning cable. So my proposed solution is to have the video feeds go into a box with software that digitizes and encodes the video, and then sends it over RS232 to the iPad using that cable. The catch here is that the maximum bandwidth on that cable is 115kbps.
My preliminary testing of this setup on a prototype system has been less than stellar so far. I set up two PCs, each with serial ports, and hooked them together with a null modem. I then set the baud rates of the ports to 115 kbps and attempted to stream a webcam video feed over the serial connection in real time using ffmpeg. The results weren't very encouraging, but I did at least manage to get some sort of image to show up.
I guess I need to play around with the ffmpeg encoding options some more. But I need to ask: am I wasting my time with this idea, or should what I am asking here be possible?
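As a rough sanity check (assuming standard 8N1 serial framing, i.e. 10 bits on the wire per payload byte): 115,200 baud gives about 11,520 bytes/s, or roughly 92 kbps of usable payload before any packet or container overhead, so the encoded video would have to stay well under ~90 kbps.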
For the SDA LQ standard ("low quality") we encode H.264 MP4 (using x264) with a 128 kbps video track. The hardware decoder on the iPad can play it. It is at most 320x240, 30 fps video. The quality depends heavily on the material: for mostly non-moving material it is watchable; if there is a lot of movement or lighting changes, you may not be able to make out much. You can check out some examples at the link. It is video-game footage, but some of it may be comparable to your application.
Without knowing more about your requirements (resolution, framerate, type of material), it is difficult to say more. However, given the right material, it is definitely possible to do it and have it be watchable (for some definitions of watchable).

Low jitter audio on iOS

I'd like to load a small audio clip like a beep into memory, and schedule playback after x seconds with very low jitter. My application ideally gets less than +-1ms, but +-5ms could still be useful. The time is synchronized to a remote application without a microphone. My question is what kind of jitter can I expect from the audio APIs, and are they all equal in this regard?
I'm not familiar with the audio APIs, but from the latency discussions I've seen the number 5.8ms using remoteIO audio units. Does this mean +-3ms would be the best precision possible?
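(The 5.8 ms figure presumably comes from a 256-frame hardware buffer at 44.1 kHz: 256 / 44100 ≈ 5.8 ms per buffer.)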
You would need to schedule this processing as real-time to have any guarantee of low delay; otherwise you can get jitter on the order of seconds, because the operating system may decide to run some background job.
Once you have it running as real-time, you may achieve lower delay.
Please check with Apple whether you can make the process real-time (with scheduling options). You might need extra permissions and kernel-level support in your app to do it properly, so that you can have a guaranteed 1 ms delay for an audio app.
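Not from the answer above, but for completeness: if you drive the output from your own render callback, you can sidestep most thread-scheduling jitter by counting samples and starting the beep at an exact sample offset; the wall-clock-to-sample conversion is then uncertain by at most one buffer, and the start position itself is sample-accurate. A rough sketch with assumed names, mono float output, and a fixed 44.1 kHz rate:

#include <AudioToolbox/AudioToolbox.h>
#include <atomic>
#include <cstdint>
#include <cstring>

static const double kSampleRate = 44100.0;            // assumed output sample rate

static const float* gBeepData   = nullptr;             // preloaded beep samples
static int64_t      gBeepFrames = 0;
static std::atomic<int64_t> gSamplesRendered{0};        // advanced by the callback
static std::atomic<int64_t> gBeepStartSample{-1};       // -1 means "no beep scheduled"

static OSStatus RenderBeep(void*, AudioUnitRenderActionFlags*,
                           const AudioTimeStamp*, UInt32,
                           UInt32 inNumberFrames, AudioBufferList* ioData) {
    float* out = (float*)ioData->mBuffers[0].mData;     // assumes mono float format
    std::memset(out, 0, inNumberFrames * sizeof(float));

    const int64_t rendered = gSamplesRendered.load(std::memory_order_relaxed);
    const int64_t start    = gBeepStartSample.load(std::memory_order_acquire);
    if (start >= 0 && gBeepData) {
        for (UInt32 i = 0; i < inNumberFrames; ++i) {
            const int64_t pos = rendered + i - start;   // position inside the beep
            if (pos >= 0 && pos < gBeepFrames) out[i] = gBeepData[pos];
        }
    }
    gSamplesRendered.store(rendered + inNumberFrames, std::memory_order_release);
    return noErr;
}

// Control thread: schedule the beep to start secondsFromNow from "now".
static void ScheduleBeep(double secondsFromNow) {
    const int64_t now = gSamplesRendered.load(std::memory_order_acquire);
    gBeepStartSample.store(now + (int64_t)(secondsFromNow * kSampleRate),
                           std::memory_order_release);
}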
