WhatsApp audio media message (MediaUrl0) Transcribe to text - twilio

I have a Dialogflow chatbot that communicates with WhatsApp for business users through Twilio.
I would like to extend the chatbot so that WhatsApp users can also send voice messages.
WhatsApp voice media messages sent to Twilio have a URI parameter with the location of the media file, but this URI does not have a file extension. How can I extract the file and send it to a speech-to-text service (Google or AWS) to have it transcribed into text, and then send that text to Dialogflow for intent recognition?
Any ideas how I would go about doing this?
Twilio message log for a media message:
Request Inspector
POST https://xxxxxxxxxxxx
2021-04-27 08:35:39 UTC 502
Request parameters:
MediaContentType0 "audio/ogg"
SmsMessageSid "MMea4e6bcb3a9654a03d8d2a607c6d4cdd"
NumMedia "1"
ProfileName "xxxxx"
SmsSid "MMea4e6bcb3a9654a03d8d2a607c6d4cdd"
WaId "xxxxxxxxx"
SmsStatus "received"
Body ""
To "whatsapp:+32460237475"
NumSegments "1"
MessageSid "MMea4e6bcb3a9654a03d8d2a607c6d4cdd"
AccountSid "ACef27744806d8f8e68f25211b2ba8af60"
From "whatsapp:+32474317098"
MediaUrl0 "https://api.twilio.com/2010-04-01/Accounts/ACef27744806d8f8e68f25211b2ba8af60/Messages/MMea4e6bcb3a9654a03d8d2a607c6d4cdd/Media/ME27fbc66d47d8de49f1ae00e433884097"
ApiVersion "2010-04-01"
Message text:
sourceComponent "14100"
httpResponse "502"
url "https://xxxxxxxxx"
ErrorCode "11200"
LogLevel "ERROR"
Msg "Bad Gateway"
EmailNotification "false"

I think you don't need the extension for this use case. You will probably need the language code for the resulting text, and maybe the audio encoding and sample rate for the transcription service.
Here are some examples from my code for IBM Watson, Google Cloud Speech-to-Text and Dialogflow. AWS and Microsoft are very similar.
// IBM Watson
RecognizeOptions recognizeOptions = new RecognizeOptions.Builder()
    .model(RecognizeOptions.Model.ES_ES_NARROWBANDMODEL)
    .audio(new ByteArrayInputStream(bytes))
    .contentType(HttpMediaType.AUDIO_WAV)
    .build();
// Google Speech-to-Text
RecognitionConfig config = RecognitionConfig.newBuilder()
    .setSampleRateHertz(48000)
    .setLanguageCode(langcode)
    .setEncoding(RecognitionConfig.AudioEncoding.OGG_OPUS)
    .build();
// Dialogflow (sending audio directly)
InputAudioConfig inputAudioConfig = InputAudioConfig
    .newBuilder()
    .setLanguageCode(langcode)
    .setSampleRateHertz(sampleRateHertz)
    .build();
In the end, in all cases, what you send to the service is not a file but (more or less) an array of bytes.
Anyway, even though there is no one-to-one relation between content type and file extension, the "MediaContentType0" parameter in the request gives you a good starting point: "audio/ogg".
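As a rough illustration of the two points above (the bytes matter, not the file name, and MediaContentType0 tells you the format), here is a minimal Python sketch. The lookup table, the function names and the ".bin" fallback are my own assumptions, not part of any Twilio library. Whether the media URL needs HTTP basic auth depends on your account settings; the sketch sends the credentials only to the original host in case the URL redirects elsewhere.

```python
import base64
import urllib.request

# Hypothetical lookup table; extend it for the content types you expect.
EXTENSIONS = {
    "audio/ogg": ".ogg",
    "audio/mpeg": ".mp3",
    "audio/amr": ".amr",
    "audio/mp4": ".m4a",
}


def extension_for(content_type, default=".bin"):
    """Map a MediaContentType0 value to a file extension."""
    # Drop parameters such as "; codecs=opus" and normalise the case.
    base_type = content_type.split(";", 1)[0].strip().lower()
    return EXTENSIONS.get(base_type, default)


def build_media_request(media_url, account_sid, auth_token):
    """Build a GET request for MediaUrl0 with HTTP basic auth."""
    credentials = f"{account_sid}:{auth_token}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(media_url)
    # Unredirected headers are not forwarded if the URL redirects.
    request.add_unredirected_header("Authorization", f"Basic {token}")
    return request


def download_media(media_url, account_sid, auth_token):
    """Return the raw audio bytes, ready to hand to a speech-to-text client."""
    request = build_media_request(media_url, account_sid, auth_token)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

The bytes returned by download_media can be passed straight to the transcription client; extension_for is only needed if you want to write the audio to disk first.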

Related

Retrieving Message Service Name from Twilio's Phone Numbers API

I'm using the IncomingPhoneNumber resource (https://www.twilio.com/docs/phone-numbers/api/incomingphonenumber-resource) to retrieve information about my phone numbers in Twilio.
Both .ReadAsync and .FetchAsync return numbers that I've bought via the Twilio console, and some of those numbers are in a Sender Pool for messaging services.
However, the payload returned by either of those two methods does not contain whether or not a phone number is in a message service pool.
On the console, you can see if a phone number belongs to a message service.
Is it possible, using the IncomingPhoneNumber Resource REST API to find out if a phone number is part of a messaging service?
No, the IncomingPhoneNumber resource won't be able to tell you that. But the PhoneNumber Resource of the Messaging Service API will be able to tell you (or help modify the assignments).
// Download the helper library from https://www.twilio.com/docs/node/install
// Find your Account SID and Auth Token at twilio.com/console
// and set the environment variables. See http://twil.io/secure
const accountSid = process.env.TWILIO_ACCOUNT_SID;
const authToken = process.env.TWILIO_AUTH_TOKEN;
const client = require('twilio')(accountSid, authToken);
client.messaging.v1.services('MGXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')
  .phoneNumbers
  .list({limit: 20})
  .then(phoneNumbers => phoneNumbers.forEach(p => console.log(p.sid)));
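To answer the original question (is a given number in any sender pool?), the same listing can be turned around. Below is a minimal Python sketch under a few assumptions: `client` is a Twilio REST client from the Python helper library, you already know the Messaging Service SIDs to check, and each listed entry exposes a `phone_number` attribute. The function name is hypothetical.

```python
def services_containing(client, service_sids, phone_number):
    """Return the Messaging Service SIDs whose sender pool holds phone_number.

    `client` is assumed to be a twilio.rest.Client (not imported here so the
    sketch stays dependency-free); `service_sids` is a list of "MG..." SIDs.
    """
    matches = []
    for service_sid in service_sids:
        # Same call as the Node example above, in the Python helper's naming.
        pool = client.messaging.v1.services(service_sid).phone_numbers.list()
        if any(entry.phone_number == phone_number for entry in pool):
            matches.append(service_sid)
    return matches
```

An empty result means the number is not in any of the services you passed in.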

How to use twilio bi-directional stream feature to play raw audio data

I'm using Twilio Programmable Voice to process phone calls.
I want to use the bi-directional stream feature to send raw audio data for Twilio to play. The initialization code looks like this:
from twilio.twiml.voice_response import Connect, VoiceResponse, Stream
response = VoiceResponse()
connect = Connect()
connect.stream(url='wss://mystream.ngrok.io/audiostream')
response.append(connect)
Then, when I get the WSS connection from Twilio, I start sending raw audio data to Twilio, like this:
async def send_raw_audio(self, ws, stream_sid):
    print('send raw audio')
    import base64
    import json
    with open('test.wav', 'rb') as wav:
        while True:
            frame_data = wav.read(1024)
            if len(frame_data) == 0:
                print('no more data')
                break
            base64_data = base64.b64encode(frame_data).decode('utf-8')
            print('send base64 data')
            media_data = {
                "event": "media",
                "streamSid": stream_sid,
                "media": {
                    "playload": base64_data
                }
            }
            media = json.dumps(media_data)
            print(f"media: {media}")
            await ws.send(media)
    print('finished sending')
test.wav is a WAV file encoded as audio/x-mulaw with a sample rate of 8000 Hz.
But when I run it, I can't hear anything, and the Twilio console reports:
31951 - Stream - Protocol - Invalid Message
Possible Causes
- Message does not have JSON format
- Unknown message type
- Missing or extra field in message
- Wrong Stream SID used in message
I have no idea which part is wrong. Does anyone know what my problem is? I can't find an example of this scenario; I just followed the instructions here. I'd really appreciate it if someone could point me to an example. Thanks.
Not sure if this will fix it but I use .decode("ascii"), not "utf-8"
Question is probably not relevant anymore, but I came across this while debugging my bi-directional stream, so it might be useful for someone:
The main reason you were receiving this error is the typo in the JSON content: you are sending "playload" instead of "payload".
Another issue when sending data to a Twilio stream is that you should send a mark message after the media messages, so that Twilio can notify you once the audio you sent has finished playing. https://www.twilio.com/docs/voice/twiml/stream#message-mark-to-twilio
When sending data back to a Twilio stream, be aware that the payload should not contain audio file header bytes, so make sure you remove them from your recording or skip them while sending data to Twilio.
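Putting the three points together, here is a minimal Python sketch of the message construction only (no WebSocket code). It assumes the file is a plain RIFF/WAVE container holding 8000 Hz mu-law audio; the function names, the 160-byte chunk size and the mark name are my own choices rather than anything Twilio prescribes.

```python
import base64
import json
import struct


def wav_data_chunk(wav_bytes):
    """Return only the audio samples of a RIFF/WAVE file, without any headers."""
    if wav_bytes[:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(wav_bytes):
        chunk_id = wav_bytes[pos:pos + 4]
        (size,) = struct.unpack("<I", wav_bytes[pos + 4:pos + 8])
        body_start = pos + 8
        if chunk_id == b"data":
            return wav_bytes[body_start:body_start + size]
        # Skip other chunks ("fmt ", "fact", ...); chunks are word-aligned.
        pos = body_start + size + (size & 1)
    raise ValueError("no data chunk found")


def media_messages(stream_sid, audio, chunk_size=160):
    """Yield one JSON "media" message per chunk of raw audio."""
    for start in range(0, len(audio), chunk_size):
        payload = base64.b64encode(audio[start:start + chunk_size]).decode("ascii")
        yield json.dumps({
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": payload},  # "payload", not "playload"
        })


def mark_message(stream_sid, name="end-of-audio"):
    """Build the JSON "mark" message to send after the last media message."""
    return json.dumps({
        "event": "mark",
        "streamSid": stream_sid,
        "mark": {"name": name},
    })
```

In the question's coroutine this would replace the read loop: send each string from media_messages(stream_sid, wav_data_chunk(data)) with ws.send, then send mark_message(stream_sid).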

Twilio Notify Bulk SMS With Custom Number

I want to send bulk SMS using Twilio Notify in PHP, with custom text in place of the "From" number, but I don't know how to go about it. I am using a messaging service. I would like to show the custom text instead of my sending number when the message is sent. Below is my code for sending the message:
<?php
require_once '/path/to/vendor/autoload.php';
use Twilio\Rest\Client;
$accountSid = "your_account_sid";
$authToken = "your_auth_token";
$serviceSid = "ISXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX";
$client = new Client($accountSid, $authToken);
$recipients = array($num1, $num2, ...); // Your array of phone numbers
$binding = array();
foreach ($recipients as $recipient) {
    // +1 is the US country code. You should use your own country code.
    $binding[] = '{"binding_type":"sms", "address":"+1'.$recipient.'"}';
}
$notification = $client
    ->notify->services($serviceSid) // same variable as defined above
    ->notifications->create([
        "toBinding" => $binding,
        "body" => $text
    ]);
?>
You might be able to send them from Twilio verified phone numbers, but in general that's not good practice, as your "From" number will get flagged as spam by phone companies.
Beyond that, you won't be able to change the "From" attribute, as Twilio doesn't want you fraudulently impersonating other people's phone numbers.
If you can't afford a short code, use a messaging service and buy a bunch of numbers to accommodate your volume.
Twilio developer evangelist here.
When you are using a messaging service, you can set up an alphanumeric sender ID as part of the Copilot features for the service. Open up your messaging service settings and add an alphanumeric sender there.

Is there any way to get messages we send/received by using phone number in twilio?

I would like my client to be able to check whether the end client has received a text and what reply he/she has sent. At the moment I always go through Twilio myself to see whether the client received the SMS. Is there any way to check this from Twilio?
Twilio developer evangelist here.
You can get both incoming messages to your Twilio numbers and reports on the status of messages after you send them from Twilio using webhooks.
When you send a message you can include a StatusCallback parameter. The parameter should be a URL in your application.
$client->messages
    ->create(
        $to,
        array(
            "from" => $from,
            "body" => "Let's grab lunch at Milliways tomorrow!",
            "statusCallback" => "https://example.com/messageStatus"
        )
    );
Twilio will send a POST request to the statusCallback URL each time your message status changes to one of the following: queued, failed, sent, delivered, or undelivered. It will include the original message's SID, so you can tie these webhooks back to the message you sent.
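On the receiving side, the callback arrives as a form-encoded POST body. The following Python sketch shows one way to tie each callback back to a message; it assumes the body carries MessageSid and MessageStatus fields, and the function names and the in-memory dictionary are purely illustrative (a real application would sit behind a web framework and persist the status somewhere).

```python
from urllib.parse import parse_qs


def parse_status_callback(body):
    """Extract (message SID, status) from a form-encoded callback body."""
    fields = parse_qs(body)
    return fields["MessageSid"][0], fields["MessageStatus"][0]


def record_status(statuses, body):
    """Store the latest status per message SID in the given dict."""
    sid, status = parse_status_callback(body)
    statuses[sid] = status
    return statuses
```

Calling record_status for every incoming callback leaves the dictionary holding the most recent status of each message, which your client can then look up by SID.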
Similarly, you can get these webhook notifications for incoming messages to your Twilio numbers. For this you need to set up the incoming webhook URL to the number in your Twilio console. Set it to a URL in your application and you will receive a webhook when someone sends a message to your Twilio number. Check out this quickstart guide on receiving messages to your Twilio number with PHP.
Let me know if that helps at all.
[edit]
Thanks for the comment where you made it clear that this is after the fact, not at the time of sending.
In this case, you can absolutely list the messages by the phone number that sent them. A message resource includes a Status attribute that lists the current message state in the Twilio system, anything from "accepted" and "queued" to "sending", "sent", "delivered", "undelivered" and "failed". You can see more about these statuses in the documentation.
To get the list of messages sent from a number you can use the following code:
use Twilio\Rest\Client;
// Your Account Sid and Auth Token from twilio.com/user/account
$sid = "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX";
$token = "your_auth_token";
$client = new Client($sid, $token);
// Loop over the list of messages and echo a property for each one
foreach ($client->messages->read(array("from" => $FROM_NUMBER)) as $message) {
    echo $message->status . " - " . $message->to . " = " . $message->body;
}
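If you would rather show your client a summary than a raw list, the same message resources can be grouped by status. This is a small Python sketch, assuming each message object exposes `status` and `to` attributes as in the loop above; the function name is hypothetical, and with the Python helper library the input would come from something like client.messages.list(from_=...).

```python
from collections import defaultdict


def delivery_report(messages):
    """Group recipient numbers by the current status of their message."""
    report = defaultdict(list)
    for message in messages:
        report[message.status].append(message.to)
    return dict(report)
```

The result maps each status ("delivered", "failed", and so on) to the list of recipients currently in that state.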
You can pull data for specific messages https://www.twilio.com/docs/api/messaging/message#delivery-related-errors
or you can simply pull all the logs

Twilio API for making outbound calls with a speech stream

I have a scenario where, say at 5:00 AM every morning, a server-side script or batch job wakes up, selects a phone number from a list based on an algorithm, places a call to that phone number and uses text-to-speech to deliver a customized message. I have two questions:
1. Which Twilio API can I use to achieve this? Bear in mind there is no app UI and all the code would be on the back end. Think of a Node-RED flow or a Python script that is made to run at a given time.
2. Instead of specifying the text in the TwiML, can I pass, say, an audio stream from Watson's Text to Speech to the appropriate Twilio API?
To do this, you would need to use the programmable voice API from Twilio. This lets you play audio files, text to speech, make and manipulate phone calls, etc. I have never used Watson Text-to-Speech, but, if it can output an audio file you can play that with Twilio TwiML.
Here's an example in Node.
npm install twilio
//require the Twilio module and create a REST client
var client = require('twilio')('ACCOUNT_SID', 'AUTH_TOKEN');
client.makeCall({
    to: '+16515556677', // Any number Twilio can call
    from: '+14506667788', // A number you bought from Twilio
    url: 'url/to/twiml/which/may/have/WatsonURL' // A URL that produces TwiML
}, function(err, responseData) {
    // executed when the call has been initiated.
    console.log(responseData.from); // outputs "+14506667788"
});
The TwiML could look like this:
<Response>
    <Play loop="1">https://api.twilio.com/cowbell.mp3</Play>
</Response>
This would play the cowbell sound from the Twilio API. Just a default sound. This could be easily generated to play a Watson sound file if you can get a URL for that.
You could do the same thing in Node, if you'd rather not build the XML manually.
var resp = new twilio.TwimlResponse();
resp.say('Welcome to Twilio!')
    .pause({ length: 3 })
    .say('Please let us know if we can help during your development.', {
        voice: 'woman',
        language: 'en-us'
    })
    .play('http://www.example.com/some_sound.mp3');
If you call toString() on this, it outputs formatted XML (TwiML):
console.log(resp.toString());
This outputs:
<Response>
    <Say>Welcome to Twilio!</Say>
    <Pause length="3"></Pause>
    <Say voice="woman" language="en-us">Please let us know if we can help during your development.</Say>
    <Play>http://www.example.com/some_sound.mp3</Play>
</Response>
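Since the question mentions a Python script, here is a minimal sketch of producing equivalent TwiML without any helper library, using only Python's standard xml.etree.ElementTree. The function name and its parameters are my own; the audio URL would be wherever your Watson output ends up being hosted.

```python
import xml.etree.ElementTree as ET


def build_twiml(message, audio_url=None):
    """Return a TwiML document that speaks `message`, then optionally plays audio."""
    response = ET.Element("Response")
    ET.SubElement(response, "Say").text = message
    if audio_url:
        ET.SubElement(response, "Play").text = audio_url
    # encoding="unicode" returns a str; special characters are escaped for us.
    return ET.tostring(response, encoding="unicode")
```

Whatever web endpoint you give Twilio as the call's `url` would return this string as its response body.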
Hopefully this clears it up for you.
Scott
