I know Autopilot doesn't support Portuguese, but this is a major bummer.
My bot asks a "yes or no" question (sim ou não), and if the answer comes back without the accent (nao), it doesn't understand that it means 'não'.
I can't even add 'nao' as a synonym in the custom field type; it says 'FieldValue already in use'.
I asked Twilio support 9 days ago and haven't gotten a reply yet. What do you guys think? Is there a way around this? =/
Twilio developer evangelist here.
The Autopilot speech recognition engine currently only works for English, so transcribed speech will always come back as English: having "não" as a field value won't match queries like "nao", but it would help to add "nao" as a field value with "não" as its synonym.
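If the console keeps rejecting it, one option is to try creating the synonym through the Autopilot REST API instead. Here's a minimal sketch with the twilio-python helper, assuming "não" already exists as a value on your field type (the Assistant and FieldType SIDs below are placeholders):

# Sketch: add "nao" as a synonym of the existing "não" field value
# via the Autopilot REST API (SIDs and credentials are placeholders).
from twilio.rest import Client

client = Client("ACCOUNT_SID", "AUTH_TOKEN")

client.autopilot \
    .assistants("UAXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX") \
    .field_types("UBXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX") \
    .field_values \
    .create(language="en-US", value="nao", synonym_of="não")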
It also seems your assistant does not have samples with fields, which means that if you don't use the Collect action, those values will never be recognized, even for text input, because the model doesn't contain them. Using Collect would be useful in your case; otherwise, I think you should add samples with fields to the relevant tasks.
Let me know if this helps at all!
Related
I am using a Twilio Studio flow to build an IVR and don't want to miss a single command from the customer. When we say a full sentence, the Gather Input widget works, but when we say a single word like "sales", the widget doesn't detect any word and triggers "no input". Can someone suggest how to use the Gather Input widget in our flow so that it detects even a single word? I have set hints and the language, and I have also tried speech models like numbers and commands, but I am not sure how to use them.
Waiting for an answer.
Thanks
I am trying to detect single words from customers in Twilio Studio as well.
You're not going to be able to do both well; you need to direct your user to either say something short or say something long. I recommend you go with short for best results. Additionally, if you don't want to miss anything your customer said, you might want to record your calls and do post-call analysis to see if there was anything important there. Honestly, if this is your first time introducing speech to your customers, do this:
"Thank you for calling X, how may I help you?"
Customer says whatever.
"Let's try this a different way for X press 1, for Y press 2."
This helps you better understand what your customers would normally ask for in their own words, and then you'll have a better idea of whether you really need to capture long sentences or short words.
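If you're working in raw TwiML rather than the Studio widget, a sketch of that kind of combined digits-and-speech menu with the Twilio Python helper library could look like this (the action URL and menu wording are placeholders):

# Sketch: a Gather that accepts both key presses and speech for the
# "press 1 / press 2" fallback (action URL is a placeholder webhook).
from twilio.twiml.voice_response import VoiceResponse, Gather

response = VoiceResponse()
gather = Gather(
    input="dtmf speech",     # accept a key press or a spoken word
    num_digits=1,
    hints="sales, support",  # bias recognition toward the menu words
    action="/handle-menu",   # your webhook reads Digits or SpeechResult
    timeout=5,
)
gather.say("For sales press 1, for support press 2.")
response.append(gather)
print(str(response))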
david
I am working on an application that gathers a user's voice input for an IVR. The input we're capturing is a limited set of proper nouns, but even though we have added hints for all of the possible options, we very frequently get back unintelligible results, possibly because our users have accents from all parts of the world. I'm looking for a way to further improve the speech recognition results beyond just using hints. The available Google adaptive classes won't be useful, as there are none that match the type of input we're gathering. I see that Twilio recently added something called experimental_utterances that may help, but I'm finding little technical documentation on what it does or how to implement it.
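For context, our Gather is currently set up roughly like this with the Twilio Python helper library (the hint values and webhook URL are placeholders, and experimental_utterances is the model we're considering):

# Sketch of our current Gather configuration (hints and action URL
# are placeholders for the real proper nouns and webhook).
from twilio.twiml.voice_response import VoiceResponse, Gather

response = VoiceResponse()
gather = Gather(
    input="speech",
    hints="Ankara, Reykjavik, Montevideo",   # the limited set of proper nouns
    speech_model="experimental_utterances",  # newer model aimed at short phrases
    language="en-US",
    action="/handle-selection",
)
gather.say("Please say the name now.")
response.append(gather)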
Any guidance on how to improve our speech recognition results?
Google does a decent job of recognizing proper names, but not in real time, only asynchronously. I've not seen a PaaS tool that can do this in real time. I recommend you change your approach: maybe identify callers by ANI or account number, or have them record their name for manual transcription.
david
I'm planning to transcribe speech where the language is unknown, so I am trying to detect the spoken language automatically by passing multiple language codes. However, I can't seem to find an option to actually find out which language the transcription is in.
I've looked through the developer page of the Speech-to-Text API, but I can't seem to find a way to output the language code of the transcribed text.
Could anyone help me with this?
Thank you.
In general, the language code is returned with the results. For example, see the sample code here, which shows how to retrieve the language code from the results.
However, see the issue mentioned here: the language code does not always get returned when multiple languages are specified. As reported in the comments, this is an issue with the Google Speech API, which is tracked here.
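For reference, a minimal sketch of passing several candidate languages and reading back the detected one with the Python client (the bucket URI and language codes are placeholders):

# Sketch: request multiple candidate languages and read back the
# language code the API picked (URI and codes are placeholders).
from google.cloud import speech_v1p1beta1 as speech

client = speech.SpeechClient()

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",                          # primary guess
    alternative_language_codes=["es-ES", "pt-BR"],  # other candidates
)
audio = speech.RecognitionAudio(uri="gs://your-bucket/your-audio.wav")

response = client.recognize(config=config, audio=audio)
for result in response.results:
    # language_code reports which language this result was recognized in,
    # though (per the issue above) it is not always populated.
    print(result.language_code, result.alternatives[0].transcript)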
A simple search for "how alexa works" yielded no results, so here it is.
If you go through the documentation for utterances, the need to exhaustively list out all possible variations is ridiculous. For example, you need to list the following variations separately to support them (see the sketch after the list):
what's my horoscope
what is my horoscope
what my horoscope is
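For illustration, that part of the interaction model ends up looking roughly like this (written as a Python dict here; the real file is JSON in the Alexa developer console, and the intent name is made up):

# Roughly how the variations are spelled out as sample utterances for an
# intent (intent name is made up; the real interaction model is JSON).
horoscope_intent = {
    "name": "GetHoroscopeIntent",
    "samples": [
        "what's my horoscope",
        "what is my horoscope",
        "what my horoscope is",
    ],
}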
Maybe I didn't interpret the documentation correctly, but I'm just curious where exactly the machine learning algorithms come in for identifying intents and skills.
Any pointers to helpful resources would be welcome too.
Just pure pattern matching on the transcribed text. We are still in the 21st century ...
I am doing an NLP project that needs some data from Twitter.
I want to get tweets posted by "real people" rather than any kind of "official account", including celebrities, ads, institutions, media, etc., such as #CNN #TodayWeather #obama #DailySale #BestPrice #FashionTrend.
So, is there a good way to do this?
I have thought about this for a long time. Using Twitter's API, the returned JSON includes a key called "verified", which can be used to detect whether an account is that kind of "official account". However, today the blue verified tick is not only for celebrities; anyone can apply for it as long as they are a real person. So I think this solution would rule out a lot of valuable data.
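For reference, the "verified" check I tried looks roughly like this with tweepy, which just wraps the JSON I mentioned (credentials and the search query are placeholders):

# Sketch: filter out verified accounts via the "verified" key
# (credentials and query are placeholders).
import tweepy

auth = tweepy.OAuth1UserHandler(
    "CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET"
)
api = tweepy.API(auth)

tweets = api.search_tweets(q="weather", count=100)
personal_tweets = [
    t for t in tweets
    if not t.user.verified  # drops official accounts, but also verified real people
]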
I have also considered using a textual spam filter. They are quite good in most cases, but some accounts, such as #FT, post things that never sound like spammy ads, and yet they are still not what I want.
I'm asking for a better solution. It can be a long-term one, such as using NLP and neural networks to learn from labels. But, well, a quick solution would be very welcome.
THX