I am banging my head on this. Twilio Studio says it supports SSML using Amazon Polly voices on the say and gather widgets, https://www.twilio.com/docs/studio/widget-library/sayplay#ssml-support-for-polly-voices
I cannot make them work no matter what I try.
I tried using the examples from their docs, but nothing. What I currently have is this.
Twilio gather
I have also tried wrapping the whole block of text in valid SSML, using neural and non-neural voices, single quoting, and escaping. Nothing seems to work the way the docs say it will.
When I look in the call log, the converted TwiML has all of the SSML stripped out. It looks like this:
Twilio details
Any idea what I am doing wrong?
I figured this out in the end. The whole text block needed to be wrapped in a <speak> </speak> block, and the ampersands in the text needed to be removed.
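For anyone who builds the widget text programmatically, here is a minimal Python sketch of that fix. The function name `wrap_in_speak` is made up for illustration, and replacing each ampersand with the word "and" (rather than simply deleting it) is my own assumption, not something the Twilio docs prescribe.

```python
def wrap_in_speak(text: str) -> str:
    """Return text with ampersands replaced and wrapped in one <speak> block.

    Assumption: '&' is swapped for the word 'and' so the sentence still
    reads naturally once the ampersand is gone.
    """
    # Replace ampersands, then collapse any doubled whitespace left behind.
    cleaned = " ".join(text.replace("&", " and ").split())
    # Avoid double-wrapping text that already has the outer tags.
    if cleaned.startswith("<speak>") and cleaned.endswith("</speak>"):
        return cleaned
    return f"<speak>{cleaned}</speak>"


if __name__ == "__main__":
    print(wrap_in_speak('Press 1 for Sales & Support <break time="1s"/>'))
```

The result is what you would paste into the "Text to say" field of the Say/Play or Gather widget.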
Related
I am using a Twilio Studio flow to build an IVR and do not want to miss a single command from the customer. When the caller says a full sentence, the Gather Input widget works, but when the caller says a single word such as "sales", the widget does not detect anything and takes the no-input path. Can someone suggest how to use Gather Input in our flow so that it detects even a single word? I have set hints and the language. I have also tried speech models such as numbers and commands, but I am not sure how to use them.
Waiting for Answer.
Thanks
I am trying to detect single words from customers in Twilio Studio as well.
You're not going to be able to do both well, so you need to direct your user to say either something short or something long. I recommend you go with short for best results. Additionally, if you don't want to miss anything your customer said, you might want to record your calls and do post-call analysis to see if there was anything important there. Honestly, if this is your first time introducing speech to your customers, do this:
"Thank you for calling X, how may I help you?"
Customer says whatever.
"Let's try this a different way: for X press 1, for Y press 2."
This helps you better understand what your customers would normally ask for in their own words, and gives you a better idea of whether you really need to capture long sentences or short words.
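If you ever build this outside Studio, the two-step script above could look roughly like the following in raw TwiML, generated here with Python's standard library. This is only a sketch: the function names, the URLs and the menu labels are placeholders I made up, and the `<Gather>` attributes shown (`input`, `action`, `speechTimeout`, `numDigits`, `hints`) should be checked against the current TwiML docs.

```python
import xml.etree.ElementTree as ET


def open_prompt_twiml(company: str, menu_url: str) -> str:
    """Step 1: open question, speech only. menu_url is a hypothetical endpoint."""
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {"input": "speech", "action": menu_url, "speechTimeout": "auto"},
    )
    ET.SubElement(gather, "Say").text = (
        f"Thank you for calling {company}, how may I help you?"
    )
    # If the caller says nothing, execution falls through to the next verb.
    ET.SubElement(response, "Redirect").text = menu_url
    return ET.tostring(response, encoding="unicode")


def menu_twiml(options: dict, action_url: str) -> str:
    """Step 2: short menu that accepts a key press or a single spoken word."""
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {
            "input": "dtmf speech",
            "action": action_url,
            "numDigits": "1",
            # Hints are the short words we expect, e.g. "sales, support".
            "hints": ", ".join(options.values()),
        },
    )
    prompt = " ".join(f"For {label}, press {digit}." for digit, label in options.items())
    ET.SubElement(gather, "Say").text = "Let's try this a different way. " + prompt
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(open_prompt_twiml("X", "/menu"))
    print(menu_twiml({"1": "sales", "2": "support"}, "/handle-menu"))
```

Whatever the caller said in step 1 is posted to the `action` URL, so you can log it for later analysis before showing the menu.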
david
I have my Twilio flow configured correctly, but even though the app recognizes the word, it doesn't follow the path it should. I tried putting all possible variations of the word in the widget, but it made no difference.
I'm planning to transcribe a speech where the language is unknown, so I am trying to detect the language spoken automatically with multiple language codes given, however, I can't seem to find an option to actually find out which language the transcription will be in.
I've looked through the dev page of the speech-to-text api, but I can't seem to find a way to output the language code of the transcribed text.
Could anyone help me with this?
Thank you.
In general, the language code is returned with the results. For example, see the sample code here, which shows how to retrieve the language code from the results.
However, see the issue mentioned here. The language code does not always get returned when multiple languages are specified. As reported in the comments, this is an issue with the Google Speech API, which is reported here.
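As a rough illustration, here is a Python sketch that works on the JSON form of a recognize response. It assumes each entry in `results` may carry a `languageCode` field and that the request config accepts `alternativeLanguageCodes`; both are taken from my reading of the REST reference, so verify them for the API version you use. The helper names are made up, and a missing code is returned as `None` to reflect the issue described above.

```python
def build_config(primary: str, alternatives: list) -> dict:
    """Request config with one primary and several alternative languages."""
    return {
        "languageCode": primary,
        "alternativeLanguageCodes": list(alternatives),
    }


def detected_languages(response: dict) -> list:
    """Return the language code of each result, or None where it is absent."""
    return [result.get("languageCode") for result in response.get("results", [])]


if __name__ == "__main__":
    # Hand-written sample response, not real API output.
    sample = {
        "results": [
            {"alternatives": [{"transcript": "hola"}], "languageCode": "es-es"},
            {"alternatives": [{"transcript": "hello"}]},
        ]
    }
    print(build_config("en-US", ["es-ES", "fr-FR"]))
    print(detected_languages(sample))
```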
Is there an event or something that could be used to indicate what word is being spoken currently?
I can't find anything in the documentation but I want to double check.
I need that so, e.g., it's possible to move back X words.
Thank you
Polly can generate speech marks, i.e. the position and timing of each word. Using this information, you could certainly achieve what you have in mind.
To generate speech marks, simply call the SynthesizeSpeech API using the 'json' output format.
https://docs.aws.amazon.com/polly/latest/dg/using-speechmarks.html
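A minimal Python sketch of how the "move back X words" idea could work with word speech marks. Polly returns the marks as one JSON object per line, each with a `time` in milliseconds. The helper names are mine, `fetch_word_marks` assumes boto3 is installed and AWS credentials are configured, and the voice `Joanna` is just an example.

```python
import json
from bisect import bisect_right


def parse_word_marks(raw: str) -> list:
    """Parse line-delimited speech marks and keep only the word marks."""
    marks = [json.loads(line) for line in raw.splitlines() if line.strip()]
    return [m for m in marks if m.get("type") == "word"]


def word_index_at(marks: list, playback_ms: int) -> int:
    """Index of the word being spoken at playback_ms (marks must be non-empty)."""
    times = [m["time"] for m in marks]
    return max(bisect_right(times, playback_ms) - 1, 0)


def rewind_time(marks: list, playback_ms: int, words_back: int) -> int:
    """Start time (ms) of the word that is words_back before the current one."""
    idx = max(word_index_at(marks, playback_ms) - words_back, 0)
    return marks[idx]["time"]


def fetch_word_marks(text: str, voice_id: str = "Joanna") -> list:
    """Request word speech marks from Polly (needs boto3 and AWS credentials)."""
    import boto3

    polly = boto3.client("polly")
    resp = polly.synthesize_speech(
        Text=text, VoiceId=voice_id, OutputFormat="json", SpeechMarkTypes=["word"]
    )
    return parse_word_marks(resp["AudioStream"].read().decode("utf-8"))


if __name__ == "__main__":
    # Hand-written sample in the speech mark format, not real Polly output.
    raw = "\n".join([
        '{"time":6,"type":"word","start":0,"end":4,"value":"Mary"}',
        '{"time":373,"type":"word","start":5,"end":8,"value":"had"}',
        '{"time":604,"type":"word","start":9,"end":10,"value":"a"}',
        '{"time":643,"type":"word","start":11,"end":17,"value":"little"}',
    ])
    marks = parse_word_marks(raw)
    print(marks[word_index_at(marks, 700)]["value"], rewind_time(marks, 700, 2))
```

You would synthesize the audio in a separate call and seek your player to the returned time.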
I am looking for a better alternative to the 'Alice' voice provided by Twilio. I am pretty sure Twilio only provides two basic default voices along with 'Alice', a more robust version able to more effectively enunciate text. The only problem is that 'Alice' does not sound as natural as other voices used by other services known to be using Twilio. Does anyone have a suggestion as to how to access a better voice? The messages in the call flows will be somewhat dynamic so I don't think using recordings would be practical.
Thanks!
Alice is our most advanced voice that supports additional languages and locales.
Some things you can try to make Alice work better for you might include looking at the <Pause> verb for more deliberate separation between sentences.
If you'd like to get additional control as it sounds like you have experienced with other services using Twilio, you could consider prerecording the static parts and deliver them using <Play> and only using <Say> for your dynamic content, though I know this isn't ideal.
There are some additional hints here https://www.twilio.com/docs/api/twiml/say#hints.
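To make the <Play> plus <Say> suggestion concrete, here is a small Python sketch that builds the TwiML with the standard library. The function name, the audio URL and the sample text are placeholders, and the one-second pause is just an example value.

```python
import xml.etree.ElementTree as ET


def build_message(static_audio_url: str, dynamic_text: str, pause_seconds: int = 1) -> str:
    """Play a prerecorded intro, pause briefly, then speak the dynamic part."""
    response = ET.Element("Response")
    # Static part: a recording you host yourself (placeholder URL).
    ET.SubElement(response, "Play").text = static_audio_url
    # Deliberate gap between the recording and the synthesized speech.
    ET.SubElement(response, "Pause", {"length": str(pause_seconds)})
    # Dynamic part: text-to-speech with the Alice voice.
    say = ET.SubElement(response, "Say", {"voice": "alice", "language": "en-US"})
    say.text = dynamic_text
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_message("https://example.com/intro.mp3", "Your balance is 42 dollars."))
```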
I hope this helps!