Creating an RTSP client for live audio and video broadcasting in Objective-C - iOS

I am trying to create an RTSP client that broadcasts live audio and video. I modified the iOS code from http://www.gdcl.co.uk/downloads.htm and can broadcast the video to the server properly, but I am now having trouble broadcasting the audio. The linked example writes the video data to a file, then reads the data back from the file and uploads the video NALU packets to the RTSP server.
For the audio part I am not sure how to proceed. So far I have tried taking the audio buffer from the mic and broadcasting it to the server directly, adding RTP headers and ALU. This approach does not work properly: the audio starts lagging behind, and the lag increases with time. Can someone tell me if there is a better approach to achieve this, with lip-synced audio/video?

Are you losing any packets on the client? If so, you need to leave "space" for them. If you receive packets 1, 2, 3, 4, 6, 7, you need to leave space for the missing packet (5).
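As a rough illustration of spotting which packets need space reserved, here is a minimal sketch that compares consecutive sequence numbers. It assumes 16-bit RTP-style sequence numbers that wrap around at 65536; the function name missing_sequence_numbers is made up for this example, and late or duplicate packets are simply treated as "nothing missing".

```python
def missing_sequence_numbers(prev_seq, new_seq, modulus=1 << 16):
    """Return the sequence numbers skipped between prev_seq and new_seq.

    Assumes sequence numbers wrap at `modulus` (16 bits for RTP).
    Duplicates and late (out-of-order) packets yield an empty list.
    """
    gap = (new_seq - prev_seq) % modulus
    # gap == 1: consecutive packet. gap == 0: duplicate.
    # A gap of more than half the range is treated as a late packet.
    if gap == 0 or gap > modulus // 2:
        return []
    return [(prev_seq + i) % modulus for i in range(1, gap)]


# Packet 6 arrives right after packet 4, so packet 5 needs a reserved slot.
print(missing_sequence_numbers(4, 6))
```

Each number returned is a slot the playback buffer should keep open (or fill with a placeholder) so that later audio stays at the right position in time.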
The other possibility is what is known as a clock drift problem. The clocks (crystals) on your client and server are not perfectly in sync with each other.
This can be caused by environment, temperature changes, etc.
Let's say that in a perfect world your server is producing 20 ms blocks of audio samples at 48000 Hz, and your client is playing them back at a sample rate of 48000 Hz. Realistically, neither your client nor your server runs at exactly 48000 Hz. Your server might be at 48000.001 Hz and your client at 47999.9998 Hz. So your server might be delivering faster than your client consumes, or vice versa. You would either consume packets too fast and underrun the buffer, or lag too far behind and overflow the client buffer. In your case, it sounds like the client is playing back too slowly and gradually lagging behind the server. You might only lag a couple of milliseconds per minute, but the lag keeps accumulating, and it will look like a 1970s lip-synced Kung Fu movie.
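To put numbers on this, here is a small sketch of the arithmetic. It assumes both rates are constant and that every produced sample eventually has to be played; the function name drift_ms_per_minute and the example rates are illustrative, not measurements.

```python
def drift_ms_per_minute(producer_hz, consumer_hz):
    """Milliseconds of audio that pile up (positive) or run short (negative)
    on the consumer side for every minute of wall-clock time."""
    surplus_samples = (producer_hz - consumer_hz) * 60.0
    # Convert the surplus samples to playback time at the consumer's rate.
    return surplus_samples / consumer_hz * 1000.0


# A producer only 1.6 Hz fast relative to a 48000 Hz consumer
# queues up about 2.0 ms of extra audio every minute.
print(drift_ms_per_minute(48001.6, 48000.0))
```

A mismatch of about 33 parts per million is enough to give the "couple of milliseconds per minute" mentioned above, which adds up to more than a tenth of a second per hour.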
In other devices, there is often a common clock line to keep things in sync. For example: video camera clocks, MIDI clocks, multitrack recorder clocks.
When you deliver data over IP, there is no common clock shared between a client and server. So your issue concerns syncing clocks between disparate devices with no shared clock line. I have successfully solved this problem using this general approach:
A) Let the client count the rate of packets that come in over a period of time.
B) Let the client count the rate that the packets are consumed (played back).
C) Adjust the sample rate of the client based on A and B.
So your client needs to adjust the sample rate of the playback; yes, you play it faster or slower. Note that the change in playback rate will be very subtle. You might set the sample rate to 48000.0001 Hz instead of 48000 Hz. The difference in pitch would be undetectable by humans, as it amounts to only a fraction of a cent. This is a very simplified description of the approach. There are many other nuances and edge cases that must be considered when developing such a control system. You don't just set it and forget it; you need a control system to manage the playback.
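Steps A to C can be sketched roughly as follows. This is only an illustration under assumptions of my own: it counts samples rather than packets, uses a plain proportional correction, and clamps the adjustment to a few hundred parts per million. The class name PlaybackRateController and the gain and limit values are hypothetical, and a real system would need smoothing and handling for the edge cases mentioned above.

```python
class PlaybackRateController:
    """Nudges the playback sample rate toward the measured arrival rate."""

    def __init__(self, nominal_hz=48000.0, gain=0.5, max_adjust_ppm=500.0):
        self.nominal_hz = nominal_hz
        self.gain = gain  # fraction of the measured error corrected per update
        self.max_offset_hz = nominal_hz * max_adjust_ppm * 1e-6
        self.rate_hz = nominal_hz

    def update(self, received_samples, played_samples, interval_s):
        """Call once per measurement window; returns the new playback rate."""
        arrival_hz = received_samples / interval_s   # step A
        playback_hz = played_samples / interval_s    # step B
        # Step C: move the rate by part of the error, within safe limits.
        target = self.rate_hz + self.gain * (arrival_hz - playback_hz)
        low = self.nominal_hz - self.max_offset_hz
        high = self.nominal_hz + self.max_offset_hz
        self.rate_hz = min(high, max(low, target))
        return self.rate_hz


controller = PlaybackRateController()
# Over a 10 s window, 480048 samples arrived but only 480000 were played,
# so the controller speeds playback up slightly.
print(controller.update(480048, 480000, 10.0))
```

The clamp is there so that a burst of late packets or a measurement glitch cannot push the pitch far enough to be audible.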
An interesting test to demonstrate this is to take two devices with the exact same file. A long recording (say 3 hours) is best. Start them at the same time. After 3 hours of playback, you will notice that one is ahead of the other.
This post explains that it is NOT a trivial task to stream audio and video.

Related

How to determine if two devices are listening to audio at the same time

If I have two devices listening to audio and sending data to a server, is there a way for me to align the data (at the server level) based on audio so I know when the devices were listening at the same time? The audio would be scheduled so the only thing to really account for I guess would be cable network/timezone issues.
I've been looking at things like FFT and other questions related to sound but realize that I may be chasing the wrong problem or over engineering. Would it be best to try and compare frequency or use a solution like this question suggests?
Most audio APIs on iOS have callbacks, especially the low-level ones. After you determine that the metadata of the song is the same (you could probably do this over Bluetooth using GameKit), you could use the callbacks to determine if the time elapsed on the playback is the same, or within your tolerance.
If the devices are not near each other, you would need to send the metadata to your server similar to how Last.fm scrobbles now-listening tracks, then compare against known devices.

delphi, indy10 tcp audio streaming

I am trying to make an application that uses video/audio streaming over a TCP connection. I have already done the video streaming with the indy10 components (idtcpserver and idtcpclient). Is it possible to do the same thing with audio?
Sure.
TCP is just a data channel. It is totally agnostic to what kind of data is transferred over it: HTML pages, programs, video, audio, whatever. It is just a data channel within the TCP protocol.
However, "streaming" usually means "near real time". If some frames of video or audio have not arrived within a few seconds, they are better skipped and forgotten, with newer audio or video played instead. You would not want your Skype conversation to suddenly get stuck for a minute and then play back that whole minute to you, just because of a few seconds of network congestion. You would rather lose a word or two and then either recover from context or ask the other party to repeat. Thus TCP, with its built-in retransmissions and usually not very large buffers, is not a perfect choice for multimedia streaming. Usually UDP plus application-implemented integrity control is a better choice for it.
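The "skip and forget" policy could look something like the sketch below. It assumes each frame carries its capture time on a clock comparable to the receiver's; the function name frames_to_play and the half-second limit are made up for the example.

```python
def frames_to_play(frames, now_s, max_age_s=0.5):
    """Keep only frames that are still fresh enough to be worth playing.

    frames: list of (capture_time_s, payload) tuples, in any order.
    Frames older than max_age_s are dropped instead of being played late.
    """
    fresh = [frame for frame in frames if now_s - frame[0] <= max_age_s]
    # Play whatever is left in capture order.
    return sorted(fresh, key=lambda frame: frame[0])


queued = [(10.0, "a"), (12.9, "c"), (12.7, "b")]
# At t = 13.0 the frame captured at 10.0 is three seconds old and is skipped.
print(frames_to_play(queued, now_s=13.0))
```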
I believe you need to use the unit VFW. With avistream, you join video + sound in a compressed stream.

How to synchronize audio playback on 2 or more iOS devices?

I would like to write a web application that allows me to sync audio playback of an MP3 down to ~50ms, or close enough that the human ear can't detect the difference.
The idea would be that two or more smartphones could each be paired to a bluetooth speaker, and two or more speakers would play the same audio at the exact same time.
How would you suggest I go about setting this up, both client-side and server-side? I'm planning to use Rails/Ruby for backend, and iOS/obj c for mobile dev.
I had thought of syncing to a global/atomic clock on the server, and having the server provide instructions to clients on when to start playing or jump into an already playing track. My concern is that, if I want to stream the audio, it will be impossible to load a song into memory and start playback accurately at the millisecond level.
Thoughts?
The jitter in internet packet delivery will be too large, so forget about syncing over the internet. However, you could check the accuracy of NTP, which is still used (I guess; I know that older UNIXes used it) by the OS when you switch on automatic date/time in Settings, but my guess is that it won't be good enough either. Perhaps the OS also uses other time sources like GPS; I don't know how iOS does it, but accuracy within 20 ms is not to be expected. You could create an experimental app to check it out.
So, what's left is a sync closer to home, meaning between the devices directly. Of course you need to make sure that all devices have loaded (enough of) the song, and have preloaded it in AVAudioPlayer or whatever you're using, to be able to start playing immediately. (It may actually not be the best idea to use the higher-level AVAudioPlayer APIs, as they may give higher delays and, more importantly, higher jitter than lower-level APIs.)
Here are three ideas (one device needs to be master triggering the start play, the others are slaves that are waiting for the trigger):
Use an audio trigger pulse, like a high tone of a defined length and frequency. Then use FFT to recognise this tone.
Connect the devices via GameKit Bluetooth and transmit the trigger on these connections.
Use the iPhone 4+ flash as trigger: flash in a certain pattern. This would require you to sample the video data which is quite doable and can be very fast.
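For the audio trigger idea, detecting one known frequency does not need a full FFT. The sketch below swaps in the Goertzel algorithm, which evaluates a single frequency bin, and reports what fraction of the signal energy sits at the trigger frequency. The function names, the 1000 Hz tone, the 8000 Hz sample rate and the 0.5 threshold are all assumptions chosen for illustration.

```python
import math


def tone_fraction(samples, sample_rate_hz, target_hz):
    """Fraction (0..1) of the signal energy at target_hz, via Goertzel."""
    n = len(samples)
    energy = sum(x * x for x in samples)
    if n == 0 or energy == 0.0:
        return 0.0
    k = round(n * target_hz / sample_rate_hz)  # nearest frequency bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    power = s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2
    # A pure sine exactly on the bin gives power == (n / 2) * energy.
    return power / ((n / 2.0) * energy)


def trigger_detected(samples, sample_rate_hz, target_hz, threshold=0.5):
    return tone_fraction(samples, sample_rate_hz, target_hz) >= threshold


rate = 8000
trigger = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(400)]
print(trigger_detected(trigger, rate, 1000))
```

In practice the microphone signal would contain room noise and the tone would not line up with a bin exactly, so the threshold and window length would need tuning on real recordings.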
I'm going with a solution that uses an atomic clock for synchronization, and an external service that allows server instructions/messages to be sent to all devices in close sync.
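A clock-based approach like this generally depends on each device estimating how far its own clock is from the reference. A minimal sketch of the standard NTP-style four-timestamp calculation follows; the function name and the example timestamps are made up, and the estimate assumes the network delay is roughly the same in both directions.

```python
def estimate_clock_offset(t0, t1, t2, t3):
    """NTP-style estimate of server clock minus client clock.

    t0: client clock when the request was sent
    t1: server clock when the request arrived
    t2: server clock when the reply was sent
    t3: client clock when the reply arrived
    Returns (offset_s, round_trip_delay_s).
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Example: server clock 0.1 s ahead, 20 ms network delay each way,
# 1 ms spent on the server.
offset, delay = estimate_clock_offset(10.0, 10.12, 10.121, 10.041)
print(offset, delay)
```

Any asymmetry between the outbound and return delay goes straight into the offset error, which is why jitter on the network limits how tight the sync can get.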

iPhone music streaming

I'm trying to send music over bluetooth from one iOS device to another. I've been using this to build packets like in Ray Wenderlich's SNAP tutorial, but I've been having trouble reconstructing the packet information on the receiving phone. I have tried using https://github.com/abbood/iphoneAudioSyncer but I think it is too complicated for my needs (since I do not need synced playing). What is the simplest buffer approach that accounts for things like lost/out of order packets? I have read through a lot of CoreAudio stuff but it is very dense, so I would appreciate help from someone who has tackled this type of problem.
When you talk about lost/out-of-order packets, you're talking about the topic of Packet Loss Concealment, which is a very dense topic (I mean, if you think Core Audio is dense, wait till you dive into PLC).
In a nutshell, there are many ways to deal with packet loss, but the simplest way (which I advise you to do) is to replace the lost packets with silence. The same goes for out-of-order packets: if a packet is out of order, just discard it.
That being said, you are dealing with audio that is streamed to you (i.e. sent via the Bluetooth/Wi-Fi network), which means that almost 100% of the time you're getting compressed audio (i.e. Variable Bit Rate audio, VBR). If you simply try to substitute lost VBR packets with silence, you'll run into this problem. You'll either have to insert silence packets in the same compression format as the VBR audio you're dealing with, or you will have to convert your VBR compressed audio into non-compressed audio (Lossless PCM), then insert zeros in place of the missing packets.
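The zero-filling variant can be sketched as below. It assumes the packets have already been decoded to PCM, that every packet holds the same number of samples, and that sequence numbers do not wrap within the range being assembled; the function name conceal_with_silence is made up for this example.

```python
def conceal_with_silence(packets, first_seq, last_seq, samples_per_packet):
    """Assemble decoded PCM in sequence order, zero-filling missing packets.

    packets: dict mapping sequence number -> list of PCM samples.
    Packets that never arrived (or were discarded as out of order)
    are replaced by samples_per_packet zeros, i.e. silence.
    """
    output = []
    for seq in range(first_seq, last_seq + 1):
        frame = packets.get(seq)
        if frame is None:
            frame = [0] * samples_per_packet
        output.extend(frame)
    return output


received = {1: [1, 1], 2: [2, 2], 4: [4, 4]}  # packet 3 was lost
print(conceal_with_silence(received, 1, 4, samples_per_packet=2))
```

Keeping the silence the same length as the lost packet is what preserves the timing of everything that follows.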

Directsound stream synchronisation

I have a question regarding the synchronization of 2 Directsound streams.
To record and play sound I currently use Portaudio to open 2 Directsound streams.
There are 2 callback functions which are called every time the input buffer is filled and the output buffer needs data.
Now here's my problem...
The input stream is running at 48kHz samplerate (#1024 samples). The output stream is running at 192kHz samplerate (#4096 samples). Every time the input buffer is filled and the callback is called I do some DSP and after that I convert the result to 192kHz. The output stream takes the result and outputs the data. Now the 2 streams are running completely out of sync.
I have looked through the entire PortAudio API but I can't find a sync option to lock the 2 streams together.
Is there any way to lock 2 Directsound streams? I really need 48kHz input and 192kHz output.
Br,
Vincent Bruinink.
The thing is that you can't really open two streams "at the same time", nor can you open two devices (or even one device at two different sample rates) and expect them to stay truly in sync, even if they were, at one time, in sync. To understand why, you may want to read something about how audio works on a computer. You may also want to read this document, which is specific to PortAudio.
As an alternative, you may want to consider opening a single device in a single stream and using software sample-rate conversion.
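As a very rough illustration of software sample-rate conversion for the 48 kHz to 192 kHz case (an exact factor of 4), here is a linear-interpolation upsampler. This is only a sketch: linear interpolation is a crude stand-in for the low-pass (for example polyphase) filtering a production converter would use, and the function name upsample_linear is made up.

```python
def upsample_linear(samples, factor):
    """Upsample by an integer factor using linear interpolation.

    Each input sample produces `factor` output samples that ramp toward
    the next input sample; the final sample is held at its own value.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    output = []
    for i, current in enumerate(samples):
        following = samples[i + 1] if i + 1 < len(samples) else current
        for step in range(factor):
            output.append(current + (following - current) * step / factor)
    return output


# A 1024-sample block at 48 kHz becomes a 4096-sample block at 192 kHz.
block_48k = [0.0] * 1024
print(len(upsample_linear(block_48k, 4)))
```

With a single stream and a fixed 4:1 ratio, every 1024-sample input block maps to exactly one 4096-sample output block, so there are no two device clocks left to drift apart.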

Resources