Twilio video: recording rooms server-side

Context: we're building a HIPAA-compliant video chat and are evaluating Twilio as a potential supplier for the video streams. Part of the requirement is that we need to make a recording of each video call, and this needs to be stored encrypted in HIPAA-compliant storage.
Having set up Twilio's excellent quickstart example, I've started a server and was able to connect two clients to it, with video. However, looking around Twilio's room configuration, the server-side recording appears to refer to Twilio-based storage, which is not HIPAA-compliant.
Question: how can we configure the quickstart's Node server to save a local copy of all streams participating in a room?
Thank you!

Twilio developer evangelist here.
When you set up a group-room-based video chat using Twilio Video, all participants in the chat make WebRTC connections to a Twilio server in order to transmit and receive data via the room. When you turn on recording, the video that passes through the server is written to disk. As far as I'm aware, this is not HIPAA-compliant.
We do have a page on building HIPAA-compliant video applications with Twilio Video, but the advice is to use peer-to-peer rooms so that the only media that potentially goes through Twilio (via the TURN relay) is encrypted and can't be read or saved by Twilio.
You can't record the video on the Node server from the quickstart, because that server isn't used to stream the media at all; it only exists to generate an access token.
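For reference, the quickstart server's whole job is to mint a short-lived access token with a video grant. Here is a minimal sketch of that role using the twilio Node package; the route and environment variable names are illustrative, not the quickstart's exact code:

```javascript
// token-server.js — minimal sketch of what the quickstart server does.
// Route and env var names are illustrative.
const express = require('express');
const AccessToken = require('twilio').jwt.AccessToken;
const VideoGrant = AccessToken.VideoGrant;

const app = express();

app.get('/token', (req, res) => {
  // Identity and room would normally come from your own auth layer.
  const token = new AccessToken(
    process.env.TWILIO_ACCOUNT_SID,
    process.env.TWILIO_API_KEY_SID,
    process.env.TWILIO_API_KEY_SECRET,
    { identity: req.query.identity || 'anonymous' }
  );
  token.addGrant(new VideoGrant({ room: req.query.room }));
  res.json({ token: token.toJwt() });
});

app.listen(3000);
```

The media itself never touches this process, which is why there is nothing here you could configure to capture it.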
You could build a server that also joins the peer-to-peer room and saves the video that way. I have no experience building WebRTC server applications though, so I can't guide you with that. It's certainly not just a case of configuring the quickstart server differently.
Your other option would be to record the video in the client and then transfer it to your server. That might be unwieldy for long chats, though: it would cause extra work on the client and result in a potentially large video file to send to the server.
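To illustrate that client-side option: browsers expose the MediaRecorder API, which can capture a MediaStream (for example, a remote participant's stream) into chunks that you upload yourself. A rough browser-side sketch; the /upload endpoint is hypothetical:

```javascript
// Record a MediaStream in the browser and upload it when the call ends.
// '/upload' is a hypothetical endpoint on your own HIPAA-compliant server.
function recordStream(stream) {
  const chunks = [];
  const recorder = new MediaRecorder(stream, { mimeType: 'video/webm' });

  recorder.ondataavailable = (event) => {
    if (event.data.size > 0) chunks.push(event.data);
  };

  recorder.onstop = () => {
    const blob = new Blob(chunks, { type: 'video/webm' });
    // Upload the whole recording in one request; for long calls you would
    // likely stream chunks up as they arrive instead.
    fetch('/upload', { method: 'POST', body: blob });
  };

  recorder.start(1000); // emit a data chunk roughly every second
  return recorder;      // call recorder.stop() when the call ends
}
```

Note that MediaRecorder support and the available mime types vary by browser, so check compatibility for your target clients.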

Related

How does Twilio's "Programmable Video" work?

I'm building a streaming iOS app in Swift. Looking at the docs https://www.twilio.com/docs/api/video I understand that you can create live video chat rooms on the fly.
My use case is a bit different:
User A accesses a room, hits 'record' and starts streaming a video of himself to Twilio storage. This creates a thumbnail in the UI. User B enters the same room and clicks the video thumbnail; that video should be streamed down to User B.
If user A is talking (streaming up) and user B is in the room at the same time, it should be possible to 'go live', which would start a live video chat room that other users can join too.
Main question: Does Twilio Programmable Video allow streaming up and down using their storage?
Secondary question: Would you say Twilio Programmable Video is the right choice for this use case or would you recommend another service?
Twilio developer evangelist here.
I'll answer your questions in the reverse order that you asked them, if that's OK.
If User A is currently streaming to a room and recording it (having created the room in group mode with RecordParticipantsOnConnect set to true) and another user wants to join the room, then they can. They just need an access token that gives them access to the room. They will then be able to join the room and chat and be recorded too.
Once a recording is complete, you will receive a webhook to the statusCallback URL that was set for the room. The callback for the recording will carry the recording-completed event and will include a media URL for the recording as well as the Uri and Sid for the recording resource.
You can use the media URL or the recording resource to get the binary data, which for videos will be in .mkv format. If you want to stream this video to your users, you may want to download the video and convert it to a playable format, or upload it to a streaming service.
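To make that concrete, here is a rough sketch of a statusCallback handler that reacts to the recording event and pulls the media down over the REST API. The callback parameter names and the redirect behaviour of the Media subresource are my assumptions from memory of the docs, so verify them against Twilio's current reference:

```javascript
// Hedged sketch: a statusCallback webhook that downloads completed recordings.
// Parameter names (StatusCallbackEvent, RecordingSid) and the Media redirect
// behaviour should be verified against Twilio's Video REST documentation.
const express = require('express');
const axios = require('axios');
const fs = require('fs');

const app = express();
app.use(express.urlencoded({ extended: false })); // Twilio posts form-encoded

app.post('/twilio/status', async (req, res) => {
  res.sendStatus(200); // acknowledge the webhook right away

  if (req.body.StatusCallbackEvent !== 'recording-completed') return;

  const sid = req.body.RecordingSid;
  // Ask for the media location without auto-following the redirect: the
  // signed URL it points to must be fetched without Twilio auth headers.
  const redirect = await axios.get(
    `https://video.twilio.com/v1/Recordings/${sid}/Media`,
    {
      auth: {
        username: process.env.TWILIO_API_KEY_SID,
        password: process.env.TWILIO_API_KEY_SECRET,
      },
      maxRedirects: 0,
      validateStatus: (status) => status === 302,
    }
  );

  // Stream the .mkv file to local disk (or on to your own storage).
  const media = await axios.get(redirect.headers.location, {
    responseType: 'stream',
  });
  media.data.pipe(fs.createWriteStream(`${sid}.mkv`));
});

app.listen(3000);
```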
Let me know if that helps at all.

Is it possible to use the YouTube Live Stream API to broadcast through my phone camera?

I want to create a basic app that allows users to simply start broadcasting video through their phone camera (front and back) just by pressing a button.
Does the YouTube Live Streaming API allow me to handle the video streaming process?
If so, is the YouTube Live Streaming API totally free of charge, and will it never ask me to pay something if I reach a certain amount of usage?
Creating a Live Event and Live Broadcast is language- and hardware-agnostic; just use YouTube's Live Streaming HTTP API. Read through the Core Concepts and Life of a Broadcast guides.
Your flow might look something like this (a rough code sketch follows the list):
Authenticate the user.
Set up and schedule your Live Broadcast object.
Start your video encoder and create a Live Stream Object.
Bind your Live Stream to your Live Broadcast.
Test to verify your video is going through.
Set your Live Broadcast to Live.
At the conclusion of your event, set your Live Broadcast to Ended.
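For a feel of the API calls behind steps 2-6, here is a hedged sketch using Google's googleapis Node client. OAuth setup is omitted, and the titles, timestamps and CDN settings are placeholders:

```javascript
// Hedged sketch of the broadcast/stream lifecycle with the googleapis client.
// Assumes `auth` is an authenticated OAuth2 client with a YouTube scope.
const { google } = require('googleapis');

async function goLive(auth) {
  const youtube = google.youtube({ version: 'v3', auth });

  // 2. Set up and schedule the Live Broadcast object.
  const broadcast = await youtube.liveBroadcasts.insert({
    part: ['snippet', 'status', 'contentDetails'],
    requestBody: {
      snippet: {
        title: 'My broadcast', // placeholder
        scheduledStartTime: new Date().toISOString(),
      },
      status: { privacyStatus: 'private' },
    },
  });

  // 3. Create the Live Stream object; its cdn section yields the RTMP
  //    ingest details that you point your encoder at.
  const stream = await youtube.liveStreams.insert({
    part: ['snippet', 'cdn'],
    requestBody: {
      snippet: { title: 'My stream' }, // placeholder
      cdn: { frameRate: '30fps', resolution: '720p', ingestionType: 'rtmp' },
    },
  });

  // 4. Bind the stream to the broadcast.
  await youtube.liveBroadcasts.bind({
    id: broadcast.data.id,
    part: ['id'],
    streamId: stream.data.id,
  });

  // 6. Once the encoder is pushing and the test looks good, go live.
  await youtube.liveBroadcasts.transition({
    id: broadcast.data.id,
    part: ['status'],
    broadcastStatus: 'live',
  });
}
```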
Note that setting up your encoder is on you. Asking "How do I create an RTMP or DASH video encoder for [hardware or software]" is too broad of a question for Stack Overflow.
The YouTube API is free to use within a specific quota. If you hit that quota limit, there are ways to request additional quota from Google (potentially for a fee).
I answered a similar question about integrating with YouTube's Live Streaming API on iOS here: YouTube live on iOS?

What's more appropriate to use Twilio Client vs Twilio Video API?

I've been reading the Twilio docs lately and have been confused about the difference between "Twilio Client" and "Twilio Video".
I noticed that tutorials for "Twilio Client" involve registering phone numbers, while the beta "Twilio Video" does not need one (after doing the tutorial) for browser-to-browser audio-only calls.
I would like to wrap, via PhoneGap, a Node.js app that has audio calls only (no numbers being dialled, but rather identities). (An Android PhoneGap app for now, since iOS does not support WebRTC yet.)
Am I correct that I should be experimenting with "Twilio Video" instead of "Twilio Client"? From the docs they both support WebRTC, but somehow "Twilio Client" needs numbers, or maybe I'm missing something.
Thank you for your input.
Twilio developer evangelist here.
The difference between Twilio Client and Twilio Video when making audio calls is very much about phone numbers. Twilio Client has the ability to make app-to-phone-network calls and to receive calls from real phones. Twilio Video does not have those abilities; it is purely for app-to-app calling.
Twilio Video will likely have better quality audio though, as Twilio Client audio gets downsampled so that it will work over phone networks.
It's likely that Twilio Video would be cheaper for your app-to-app use case as well. Twilio Client is priced by the minute, while Twilio Video pricing has more to do with currently connected endpoints; the connection itself, if it is peer-to-peer, costs nothing.
Let me know if that helps at all.
Twilio product marketer here, just to add on to what Phil provided.
We provide two separate real-time communications SDKs: our Programmable Video SDKs as well as our Client SDKs. Video, which we launched last year, provides both voice and video capabilities (or either one of them), and media flows in a peer-to-peer or TURN-relayed call topology. As Phil mentioned, this SDK uses newer codecs (VP8 & H.264) that can provide HD audio & video and are also more resilient to packet loss and challenging network conditions. Our Video SDKs do not yet have media server capabilities like recording, connecting to the phone network, or scaling beyond about 4 participants. But stay tuned... :)
Our Client SDK, which we've had since 2011, supports voice only, and all media flows through Twilio's cloud infrastructure, not peer-to-peer. Our Client SDKs (iOS, Android, and JavaScript) support recording, connecting to the phone network, and large conferences. However, this SDK doesn't support video and uses the G.711 codec.

What is the major role of Streaming Media Server?

I am new to live streaming of data. I have been exploring the web to learn how to live stream video. I am an iOS developer and I want to develop an app that streams video.
I am clear about the fundamentals of live video streaming. I have learned that I will need a streaming media server, which will feed the stream to the viewers. I also came to know that the viewer has to have a player which decodes the data and synchronizes the audio/video stream.
Now, Wowza is one kind of streaming media server that is recommended. But I have the following questions:
(1) Why a media server? Why can't we have our own media server? What does a media server actually do that makes its role necessary?
(2) In my app, I will have to integrate a library for encoding and feeding to a streaming server like Wowza. But how would it be fed to the streaming server?
(3) How will my server communicate with a streaming server like Wowza?
(4) How will Wowza feed the stream to the receiving side, i.e. the user who has an iPhone and needs to see a live stream?
(5) What should be at the receiving side? What will decode the stream and play it in AVPlayer?
Guys, I need to develop a streaming app with good quality, so I'd better first understand the flow of data and then start.
It would be great if someone gives a graphical representation of the data flow.
Thanks a lot in Advance !!!
Let me quickly add my understanding to your questions:
1a. Why Media Server? ..
You could write your own software for distributing the stream data to all the players as well, but in that case you would need to implement various transport protocols, and you would end up implementing a fairly big piece of software: your home-grown media server.
1b. What does a Media Server actually do to make its role necessary?
One way to see the role of the media server is that it receives the live stream from a stream source and handles the distribution of this stream to potentially very many players. This usually involves taking the data out of the source transport protocol and repackaging it into one or more other container formats or transport protocols that the clients favour. Optionally, the media server can change the way the video or the audio is encoded (transcoding), or produce streams in different resolutions and qualities, and provide the players with the list of available qualities in the form of a manifest file (e.g. an m3u8 or SMIL file) so they can do so-called adaptive streaming.
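For example, a minimal HLS master playlist (m3u8) advertising two quality levels might look like this (paths and bitrates are illustrative):

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2400000,RESOLUTION=1280x720
720p/index.m3u8
```

The player picks a variant based on its measured bandwidth and can switch between them mid-stream; that switching is the "adaptive" part.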
Another typical use case for media servers is serving non-live video files to players from disk, as well as recording live streams, and so on. If you look at the feature list of popular media servers, you'll see that they really do many things, so practically this is something you probably want to get out of the box rather than implement yourself.
In my app, I will have to integrate a library for encoding and feeding to a streaming server like Wowza. But how would it be fed to the streaming server?
You need to encode the video and audio with particular codecs (such as H.264 for video and AAC for audio), then choose a suitable container format to put these streams into (e.g. MPEG-TS), and then choose a transport protocol to push the stream to the server (e.g. RTMP). It's best if you google for tutorials to see what this looks like in code; a rough sketch follows.
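As an illustration (not iOS code), this Node sketch shells out to ffmpeg to encode a local file with H.264/AAC and push it over RTMP. The ingest URL and stream key are placeholders, and ffmpeg is assumed to be installed:

```javascript
// Hedged sketch: push a local file to an RTMP ingest point via ffmpeg.
// Assumes ffmpeg is installed; the URL and stream key are placeholders.
const { spawn } = require('child_process');

const ffmpeg = spawn('ffmpeg', [
  '-re',                    // read input at its native frame rate (live pacing)
  '-i', 'input.mp4',        // source; on a phone this would be the camera feed
  '-c:v', 'libx264',        // H.264 video
  '-c:a', 'aac',            // AAC audio
  '-f', 'flv',              // RTMP carries an FLV container
  'rtmp://example.com/live/STREAM_KEY',
]);

ffmpeg.stderr.on('data', (d) => process.stderr.write(d)); // ffmpeg logs to stderr
ffmpeg.on('close', (code) => console.log(`ffmpeg exited with ${code}`));
```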
How will my server communicate with a streaming server like Wowza?
The contract is basically the transport protocol; one example is using the RTMP protocol to connect to Wowza and publish the stream to it. These protocols cover all the technical details.
How will Wowza feed the stream to the receiving side, i.e. the user who has an iPhone and needs to see a live stream?
The player software will initiate the communication with Wowza. This is again protocol-dependent, but in case you are using HLS, the player will use the HTTP protocol to find out the URLs of the consecutive video chunks, which it will progressively download and display to the user.
What should be at the receiving side? What will decode the stream and play it in AVPlayer?
It's not clear whether the app you are developing is the broadcaster side or the player side, but generally on the player side you need to find a library that is able to pull the stream from the media server with the protocol/transport/codec you are using. I am not familiar with this part on iOS; I only have experience with players embedded in websites.
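On the web side, for what it's worth, playing an HLS stream typically takes only a few lines with the hls.js library (the manifest URL is a placeholder; Safari and iOS can play HLS natively in a video element without it):

```javascript
// Hedged web-player sketch; assumes hls.js has been loaded (e.g. via a
// <script> tag) and the page contains a <video> element.
const url = 'https://example.com/live/stream/index.m3u8'; // placeholder
const video = document.querySelector('video');

if (Hls.isSupported()) {
  const hls = new Hls();
  hls.loadSource(url);    // fetch and parse the m3u8 manifest
  hls.attachMedia(video); // feed parsed segments into the element
  hls.on(Hls.Events.MANIFEST_PARSED, () => video.play());
} else if (video.canPlayType('application/vnd.apple.mpegurl')) {
  video.src = url;        // Safari/iOS play HLS natively
  video.play();
}
```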
I am not going to draw this, but imagine three boxes connected with arrows, and that is the data flow: from encoder to streaming server and finally to the player. That's it, I guess. :-)

Streaming video/audio from iOS device

I have read several posts here about live streaming video/audio from an iOS device while the user is recording. Unfortunately, it seems that there is no "good" solution.
I understand that I must have access to the files while I am recording and then send them to a server, from which other users can watch my stream live (with a small time lag).
Working with iOS is not a problem for me; I am struggling more with the part where the data should be handed off to the server, and with the whole processing on the server.
I have several questions:
Saying just "server" is very vague: what kind of server should it be?
I understand that I must use some protocol to send data TO the server and then to get data FROM the server so users can watch the live video. What protocols should I use?
I feel very lost with the whole server-side processing: what should be done with the files that were sent to the server?
All this seems to be very nontrivial; is there any third-party solution? For example, what technology do apps like Periscope, Ustream or Meerkat use to provide the live-stream feature for their users?
I would also really appreciate it if the answers were more than one word long for each question.
Please find my answers to your questions:
There is a class of software called "media servers". E.g. Wowza, Red5, Nimble Streamer, nginx-rtmp-module and a few others.
The most common protocols for sending data TO a media server are RTMP and RTSP. Watching the video is done via several protocols, such as RTMP (requires Flash to be installed), HLS (native on iOS, supported by Android 4+, and working in some web players) and DASH (supported by some players).
No files are needed; the media server can process the incoming live stream and handle connections from viewers directly.
Basically, they use a combination of the mentioned technologies plus their own know-how.