Twilio Stream: how to recognize worker and customer speech separately?

As a developer, I want to tell a worker's speech apart from a customer's via Twilio Stream. The problem is that all speech arrives at the server with the track='inbound' attribute, so I see no difference between the two speakers in the requests.
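One possible direction, sketched under assumptions: a stream captures only the inbound track by default, so the stream would need to be started with track="both_tracks" in the <Stream> TwiML. Each "media" message then carries a media.track of either "inbound" or "outbound", and the server can keep the audio apart by that field. The function name and the buffer layout below are hypothetical, and which party ends up on which track depends on the call leg the stream is started on.

```python
import base64
import json
from collections import defaultdict
from typing import Optional


def route_media_message(raw_message: str, buffers: dict) -> Optional[str]:
    """Append the audio of one WebSocket message to the buffer of its track.

    Returns the track name, or None for non-media events (start, stop, ...).
    """
    message = json.loads(raw_message)
    if message.get("event") != "media":
        return None
    media = message["media"]
    track = media["track"]  # "inbound" or "outbound"
    # The payload is base64-encoded audio; decode it before buffering.
    buffers[track] += base64.b64decode(media["payload"])
    return track


# Hypothetical usage: one growing audio buffer per track.
buffers = defaultdict(bytearray)
sample = json.dumps({
    "event": "media",
    "media": {"track": "outbound", "payload": base64.b64encode(b"\x00\x01").decode()},
})
route_media_message(sample, buffers)
```

Each buffer could then be fed to its own speech recognizer, so the two speakers are transcribed independently.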

Related

Issue with character limit in Twilio and responding back manually

I'm facing some challenges with Twilio and was wondering if someone knows how to resolve them. First, I would like to highlight that I don't have a programming background, but I can write low-level code.
Current Case:
A session is booked > A record is created in Airtable via a Zap with Calendly > Twilio sends a message using the info in Airtable via a Zap with Calendly.
Problems:
I'm currently only able to send 160 characters in the message. If the message is longer, only the first 160 characters are sent and the message is cut short (e.g. looking forwa....)
I don't have an easy way to respond to messages. Currently Twilio forwards all messages to a Voice number in one big thread, and I have to respond in this format:
+1123456789: Message.
This is confusing and causes a lot of issues.
Is there any way to build something in Twilio which allows for manual responding?
Thank you all in advance!
Twilio developer evangelist here.
I'm currently only able to send 160 characters in the message. If the message is longer, only the first 160 characters are sent and the message is cut short (e.g. looking forwa....)
This sounds like a limitation of the Zap you are using. SMS messages are 160 characters in length, but Twilio can stitch together up to 10 messages, allowing you to send up to 1600 characters. Twilio doesn't truncate messages either, so the cut-off is most likely something the Zap is doing. In 2017 Zapier announced that you could send longer messages with their integration; you just had to update your Zap with the right setting.
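If you ever send the messages from your own code, a small pre-send check can show which bodies need stitching. This is only a sketch built on the two limits mentioned above (160 and 1600 characters). The function name is hypothetical, and plain character counting is an assumption that holds only for standard GSM-7 text; emoji and other non-GSM characters lower the real limits.

```python
SINGLE_SMS_LIMIT = 160    # characters that fit in one SMS
MAX_BODY_LENGTH = 1600    # longest body Twilio will stitch together


def classify_sms_body(body: str) -> str:
    """Say whether a body fits one SMS or needs to be sent as stitched parts."""
    if len(body) > MAX_BODY_LENGTH:
        raise ValueError(
            f"body is {len(body)} characters; the limit is {MAX_BODY_LENGTH}"
        )
    if len(body) <= SINGLE_SMS_LIMIT:
        return "single"
    return "concatenated"


# Hypothetical reminder text, longer than a single SMS.
reminder = "Looking forward to seeing you at your session tomorrow. " * 4
print(classify_sms_body(reminder))
```

A body classified as "concatenated" that still arrives cut at 160 characters points at the sending tool rather than at Twilio.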
I don't have an easy way to respond to messages. Currently Twilio forwards all messages to a Voice number in one big thread, and I have to respond in this format: +1123456789: Message. This is confusing and causes a lot of issues. Is there any way to build something in Twilio which allows for manual responding?
You could certainly build something, or integrate with an app like Front, which can give you a shared inbox for things like incoming SMS from Twilio.

Forwarding call recordings in Twilio for Zoho

I use Twilio and Zoho Phonebridge for inbound and outbound calling to customers. I use Zoho's built-in IVR, which works well but has gaps. The main gap is that when I get a voicemail, I need to log in to Twilio to get it, and that's a time sink. I would like Twilio to email (or text, for all I care) those messages to me, transcribed or not. The problem is that I cannot modify the code provided by Zoho, so I need the function to run in parallel. I'm not sure where to start. I can do code snippets easily enough, but it seems I need to replace the "Voice Configuration - Request URL", which kills the IVR.
Any help?

Twilio video: recording rooms server-side

Context: we're building a HIPAA-compliant video chat and evaluating Twilio as a potential supplier for video streams. Part of the requirement is that we need to make a recording of each video, and this needs to be stored encrypted in HIPAA-compliant storage.
Having set up Twilio's excellent quickstart example, I started a server and was able to connect two clients to it, with video. However, looking around Twilio's room configuration, the server-side recording appears to refer to Twilio-based storage, which is not HIPAA-compliant.
Question: In what ways can we configure the started Node server to save a local copy of all streams participating in a room?
Thank you!
Twilio developer evangelist here.
When you set up a group-room-based video chat using Twilio Video, all participants in the chat make WebRTC connections to a Twilio server in order to transmit and receive data via the room. When you turn on recording, the video that passes through the server is then written to disk. As far as I'm aware, this is not HIPAA compliant.
We do have a page on building HIPAA compliant video applications with Twilio Video but the advice is to use peer to peer rooms so that the only media that potentially goes through Twilio (via the TURN relay) is encrypted and can't be read or saved by Twilio.
You can't record the video on the Node server from the quickstart, because that's not used to stream the media at all. It only exists to generate an access token.
You could build a server that also joined the peer to peer room of the chat and saved the video that way. I have no experience in building WebRTC server applications though, so I can't help guide you with that. It's certainly not a case of just configuring the server differently.
Your other option would be to record the video in the client and somehow transfer that to your server. That might be unwieldy for long chats, though: it would cause extra work on the client and produce a potentially large video file to send to the server.

Twilio: How do I always play an "All calls are being monitored" message for incoming calls?

For incoming calls:
1) I am new to Twilio, but I always want an "All calls are being monitored or recorded" message to play for all incoming calls. What is the best way to do this?
2) I would like to create two messages to play after the "monitoring" message: one message during open hours and a second message during closed hours.
What is the best way to do this? Any good documentation?
Twilio developer evangelist here.
Welcome to using Twilio! I'll give you a quick overview of how incoming calls to Twilio work then point you to some useful parts of our documentation that will help you achieve what you are working towards.
When a Twilio phone number receives an incoming call, Twilio will send an HTTP request to your web application, asking for instructions on how to handle the call. Your web application will respond with an XML document containing TwiML. That TwiML contains the instructions that Twilio will follow to say some arbitrary text, play an MP3 file, make a recording and much more.
In your case you want to read messages to the caller. You could do that by returning TwiML that uses <Say> to read out the messages using our text-to-speech engine, or you could record yourself reading the message and play that to the caller using the <Play> TwiML.
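As a rough illustration of the <Say> approach, here is a minimal sketch of what the web application could return. It builds the TwiML with Python's standard library only. The function name, the opening hours (weekdays 9:00 to 17:00) and the two greeting texts are hypothetical placeholders, not anything Twilio prescribes.

```python
from datetime import datetime, time
from xml.etree import ElementTree as ET

# Hypothetical business hours and greetings; adjust to your own.
OPEN_AT = time(9, 0)
CLOSE_AT = time(17, 0)
MONITORING_MESSAGE = "All calls are being monitored or recorded."
OPEN_MESSAGE = "Thank you for calling. Someone will be with you shortly."
CLOSED_MESSAGE = "We are currently closed. Please call back during opening hours."


def build_greeting_twiml(now: datetime) -> str:
    """Return TwiML that plays the monitoring notice, then an hours-based message."""
    response = ET.Element("Response")
    # The monitoring notice always comes first.
    ET.SubElement(response, "Say").text = MONITORING_MESSAGE
    # Monday is weekday 0, so values below 5 are Monday to Friday.
    is_open = now.weekday() < 5 and OPEN_AT <= now.time() < CLOSE_AT
    ET.SubElement(response, "Say").text = OPEN_MESSAGE if is_open else CLOSED_MESSAGE
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


print(build_greeting_twiml(datetime.now()))
```

Your webhook would return this string as the body of its HTTP response. To use recordings instead, the <Say> elements could be swapped for <Play> elements pointing at your audio files.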
To learn more:
Follow the Programmable Voice Quickstart
If you need more specific instruction on a particular Twilio feature, check out the Twilio Guides
If you need to see Twilio features as part of a complete application, check out the Twilio Tutorials which cover more specific use cases
Let me know if that helps at all.

Speech to Text using Twilio

We use Microsoft Bot Framework for our chatbots. We want to enable a voice channel for our bot. Is there a way to do this? Does Twilio have anything that can add speech capabilities to our bot? Our bots are exposed via webchat components, Skype, Facebook Messenger, etc.
Twilio developer evangelist here.
There's no way within Botframework to add voice capabilities from Twilio; however, receiving calls works in a similar way. When someone calls your Twilio number you receive a webhook, which you can respond to with TwiML to tell Twilio what to do with the call.
To act on what the caller says, you can <Record> the caller's response and set the transcribe parameter to true. You also need to set a transcribeCallback URL, as the transcription is done asynchronously. Once you receive that callback, the text of the transcription will be available as a parameter in the request. You could also perform the transcription yourself with a third-party service by just taking the recording and sending it off.
Once you receive the transcription, you can make your decision as to the next step of the conversation and redirect the live call to the next step of your process using the REST API.
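A minimal sketch of those two pieces, using only Python's standard library: the TwiML that asks for a recording with transcription, and a helper that reads the text out of the callback request. The function names, the prompt and the callback URL are hypothetical. The sketch assumes the callback arrives as a form-encoded POST with TranscriptionStatus and TranscriptionText parameters.

```python
from typing import Optional
from urllib.parse import parse_qs
from xml.etree import ElementTree as ET


def build_record_twiml(prompt: str, callback_url: str) -> str:
    """Return TwiML that reads a prompt, then records and transcribes the reply."""
    response = ET.Element("Response")
    ET.SubElement(response, "Say").text = prompt
    ET.SubElement(
        response,
        "Record",
        {"transcribe": "true", "transcribeCallback": callback_url},
    )
    return ET.tostring(response, encoding="unicode")


def extract_transcription(form_body: str) -> Optional[str]:
    """Pull the transcribed text out of a form-encoded callback body."""
    params = parse_qs(form_body)
    if params.get("TranscriptionStatus", [""])[0] != "completed":
        return None  # transcription failed or is missing
    return params.get("TranscriptionText", [None])[0]


# Hypothetical prompt and callback URL.
print(build_record_twiml("How can I help you?", "https://example.com/transcribed"))
```

The text returned by extract_transcription is what you would forward to the bot as the user's message.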
This is just a high level overview of how you might accomplish this. Let me know if it is of any help.
Voximal offers a product similar to Twilio's, but based on VoiceXML. The difference is that Voximal natively integrates most STT engines (Microsoft, Google, Watson, iSpeech) in the solution (you only need to set the key or the user/password to configure them). You use a built-in grammar "text" to translate. The processing is then very similar to Twilio's. You need to push the content to a chatbot engine (HTTP/XML/JSON), and you have a way to play the result with a TTS engine.
Have a look at the Parrot example (a script that repeats everything you say using STT and TTS):
https://github.com/voximal/voicexml-examples/blob/master/parrot/parrot.vxml
