Using Services to create a Speech to Text Function with Vosk? - twilio

I wanted to get some additional opinions on a project.
I'm attempting to create a system that will gather the dialog from a call initiated by Twilio Studio and transcribe it. I then intend to push the transcribed dialog to my CRM.
I found a helpful article on speech to text with Vosk (https://www.twilio.com/blog/transcribe-phone-calls-text-real-time-twilio-vosk), but I think the article assumes that this will be done with an external application. So my question is: could this be developed with Twilio Services and leveraged within Studio?
If not, I could develop a web app to connect everything together, but I would rather have everything housed within Twilio, as I'm a novice developer and building this app externally sounds like an overly complex solution for this project.
Thanks in advance for your feedback!
Nothing attempted yet, still in the early research stage of this project.

As of now, Twilio does not offer its own Speech-to-Text engine. However, you can use media streams to forward the audio track to any engine out there on the internet.
This blog post uses Google's engine, for example.
The advantage of such a hosted service is that you need to worry about neither the model nor the server. If you want to go 100% serverless, you could run the entire thing on Twilio Serverless.
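To make the media streams idea a bit more concrete, here is a minimal Python sketch of what the receiving end has to do with each WebSocket message before handing audio to an engine such as Vosk. It assumes the message format as I understand it from the Media Streams documentation (JSON with an "event" field, and for "media" events a base64 payload of 8 kHz mu-law audio), so please verify that against the current docs. The function names are my own, and the WebSocket server and the recognizer itself are left out.

```python
import base64
import json
import struct


def ulaw_byte_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit PCM sample."""
    u = ~byte & 0xFF                      # mu-law bytes are stored inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def media_message_to_pcm(message: str) -> bytes:
    """Return 16-bit little-endian PCM for a "media" event, b"" otherwise.

    Assumed message shape: {"event": "media", "media": {"payload": "<base64>"}}
    """
    data = json.loads(message)
    if data.get("event") != "media":
        return b""                        # e.g. "connected", "start", "stop"
    ulaw = base64.b64decode(data["media"]["payload"])
    samples = [ulaw_byte_to_pcm16(b) for b in ulaw]
    return struct.pack("<%dh" % len(samples), *samples)
```

The returned bytes are what you would feed to the speech engine chunk by chunk. Check which sample rate your model expects, since the stream is 8 kHz.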

Related

Twilio Use Case

I am considering using Twilio as an extension in an existing application.
My use-case is this:
User clicks button in application
Using Twilio API, the application calls the user.
The user answers their phone
Twilio connects user to some phone number. (fetched from db)
It's a bit strange, but it is exactly my customer's request. Before I spend too much time down the rabbit hole, I thought I would ask the community: can I do this with Twilio APIs?
Twilio developer here!
This is definitely a common use case for Twilio. In fact, it's so common that we wrote up an in-depth tutorial showing you how to build an app like the one you described.
We've got it in PHP, Node, Python, and Ruby - here's the PHP version:
https://www.twilio.com/docs/howto/walkthrough/click-to-call/php/laravel
If you prefer to just reference the code, you can find it on GitHub too: https://github.com/TwilioDevEd/clicktocall-laravel
Yes. That's actually very simple in Twilio:
https://www.twilio.com/docs/api/rest/making-calls
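For the last step of the use case (connecting the user to a number fetched from the database), the usual approach is that the outbound call points at a URL in your application, and that URL returns TwiML containing a Dial verb. Here is a rough Python sketch of generating that TwiML with only the standard library. The function name and the phone number are placeholders of mine, and placing the initial call through the REST API is not shown.

```python
import xml.etree.ElementTree as ET


def connect_call_twiml(target_number: str) -> str:
    """Build TwiML that bridges the answered call to target_number."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    dial.text = target_number             # number looked up in your database
    return ET.tostring(response, encoding="unicode")


# Placeholder number, for illustration only.
print(connect_call_twiml("+15550100"))
```

Your web framework would return this string with an XML content type when Twilio requests the URL after the user answers.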

iOS How to allow users to register an account / How to make a database of users

I am in the planning stages of building an app for iPhone / iPad (yes, very early stages).
I am basically wondering how much work is involved in having a separate user registration process for an app, i.e. letting users register an account, log in with that account, and use the app.
Will this involve constructing / coding an entirely new database, or is there software available that automates this process?
Thanks in advance.
You could have a look at a service like StackMob.
This allows you to utilise server based services with no server-side implementation on your part.
The people at parse.com do a great job of helping developers set up a cloud database for many tasks that are common in iOS apps.
In particular, there is a section dedicated to user management (sign-up and sessions) that is well described here: Parse iOS guide
Finally, the service also offers some user interface help (look here), although it is probably better to personalize the UI by coding your own.
There are some ready-made implementations, but if your app is going to have custom code executed by the server, you'd be better off writing your own.
Use a server-side language (PHP, Perl, Ruby, Python, Java) to do the registration.
You'll probably need a REST service and/or JSON if you want to keep things easy (and if you are used to web app programming). Otherwise, you'll need to do XML parsing and other work. Use asi-http for the interactions between the server and the app; if you are on iOS 5.x, it already has a JSON parsing implementation.
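If you do write the server side yourself, the core of registration and login is small. Below is a hedged Python sketch of just that core: store a salted password hash at sign-up and compare hashes at login. The in-memory dict stands in for a real database table, the function names are mine, and the iteration count is only an illustrative value. The REST/JSON layer the app would talk to is not shown.

```python
import hashlib
import hmac
import os

USERS = {}            # username -> (salt, password_hash); stand-in for a database
ITERATIONS = 100_000  # illustrative value, tune for your own server


def _hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(username: str, password: str) -> bool:
    """Create the account; return False if the name is already taken."""
    if username in USERS:
        return False
    salt = os.urandom(16)
    USERS[username] = (salt, _hash_password(password, salt))
    return True


def login(username: str, password: str) -> bool:
    """Return True only if the user exists and the password matches."""
    record = USERS.get(username)
    if record is None:
        return False
    salt, expected = record
    return hmac.compare_digest(_hash_password(password, salt), expected)
```

The point of the design is that the plain password is never stored, only the salt and the derived hash.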

Creating a VoiceXML application

I have a few questions concerning how to create a VoiceXML application.
I found some nice tutorials, but there are still some questions:
- What's a good development environment? I wanted to use VS08; there should be a project type called "speech" under C#, but it doesn't appear. Do I have to install the speech server locally too in order to use this? (I would prefer some kind of visual workflow.)
- What's the ending? Is it .xml, .aspx, or .speax? I couldn't get that.
- How do I run the VoiceXML? It's on the speech server as an application; are there any further steps?
These questions are all over the map on the basics, but I'll try to provide some pointers:
What's a good development environment?
You will likely be building a web style application. So a VS08 ASP application is a reasonable starting point.
Do I have to install the speech server locally too in order to use this?
Yes. There are a variety of platforms that support VoiceXML. Nearly all are designed specifically for telephone calls (VoiceXML's main purpose). There are a few free implementations, but most are commercial. I believe the Opera web browser has some VoiceXML functionality; I've seen settings for it in its configuration, but I have no direct experience with it.
What's the ending? Is it .xml, .aspx, or .speax? I couldn't get that.
File endings usually aren't relevant, except maybe to tools. I don't believe Visual Studio provides any direct support for VoiceXML. Some browsers do care which MIME types are provided.
How do I run the VoiceXML? It's on the speech server as an application; any further steps?
Does this mean you are looking at the OCS/Lync product line? I believe the IVR in that suite supports VoiceXML as well as a few other APIs. The product should contain basic setup and configuration information. More information on Lync:
Microsoft Lync site
Wikipedia
One of the main goals of VoiceXML was to decouple the rendering of the voice application (on a speech server) from the voice application itself. This allows you to serve VoiceXML pages from any web server, anywhere, using any technology stack you want.
If you just want to learn VoiceXML in general, developer sites like Voxeo's Evolution allow you to render your voice applications on their voice hosting infrastructure. You configure your developer account to point to an initial VoiceXML page served from your external web server. In return, you get a phone number to call. When you call it, the hosting infrastructure fetches your initial VoiceXML page from your web server.
(I don't know offhand if Microsoft Lync hosting services are available yet.)
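To illustrate the "serve VoiceXML pages from any web server" point, here is a small Python sketch using only the standard library. It builds a minimal VoiceXML 2.1 page with one spoken prompt and exposes it through a WSGI app. The function names and the greeting are my own; the MIME type is application/voicexml+xml as far as I know, but check what your particular voice platform expects.

```python
from xml.sax.saxutils import escape

VXML_MIME_TYPE = "application/voicexml+xml"


def hello_vxml(greeting: str) -> str:
    """Return a minimal VoiceXML page that speaks the greeting."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">\n'
        "  <form>\n"
        "    <block>\n"
        f"      <prompt>{escape(greeting)}</prompt>\n"
        "    </block>\n"
        "  </form>\n"
        "</vxml>\n"
    )


def app(environ, start_response):
    """WSGI app that serves the page to whatever voice platform fetches it."""
    body = hello_vxml("Hello from my own web server.").encode("utf-8")
    headers = [("Content-Type", VXML_MIME_TYPE),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]
```

For local testing you could run it with wsgiref.simple_server.make_server("", 8000, app).serve_forever() and point the hosting platform at that URL.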

How can I get twitter running on my local server?

I want to put the Twitter service on my own server and customize it for my purposes. I have no idea how it works.
My goal is to communicate with my own Twitter server rather than the original Twitter server.
You should check out StatusNet. It is an open source microblogging platform. From their site, you can download the source and deploy it on your own server. Once you have it installed, you can customize it to your liking.
Twitter isn't an Open Source project - they don't provide their server code.
From my experience at another company deploying very widely distributed systems, the chances are there's a bucket-load of infrastructure you'd need to get running first - complete overkill for a single-server solution, but vital for a global service with many millions of users. In other words, even if Twitter did provide their code, it probably wouldn't be an appropriate solution for your situation.
The actual Twitter (twitter.com) service is proprietary; you can't run it yourself.
There are plenty of open source Twitter clones out there. The more general name is "microblogging". Pinax, for example, has basic microblogging. Try searching Google for 'open source microblogging' for other projects.
I don't believe the Twitter platform is freely available to the general public. If you want to make your own "Twitter server", you're going to have to clone the service yourself.
You can't run Twitter on your own server, but you can write your own application that talks to Twitter through Twitter's API.
It all depends on what you mean by "customizing" Twitter. There are many applications like Twitpic and TweetDeck that are built "on top of" Twitter. They add their own functionality while leaving Twitter to do the "heavy lifting".
For example, I have written a personal project for moderating a stream of tweets. This application runs on my local server, but it gets its data by querying Twitter's API.
There are two main advantages to extending rather than rebuilding Twitter:
It takes a lot less effort because you can reuse all the basic functions of Twitter
You can take advantage of Twitter's huge user base. Even if you succeeded in cloning Twitter, it would be far less interesting than the original because Twitter works by strength of numbers.
You could use WordPress with the Twitter developer add-in, then get an API key from Twitter. Their users can then use your site and vice versa, and apps for Twitter will also work with your site.
Wow. That's a highly ambitious request that you have there. Twitter isn't like WordPress; there's no .org version that can be downloaded and run locally. Twitter is a highly scalable service that is designed to run on large-scale servers.
Sorry to be the bearer of bad news to you on this.

How to provide your app with a network API

I am going to write a Ruby application that implements a video conversion workflow consisting of multiple audio and video encoding/processing steps.
The application interface has two core features:
queueing new videos
monitoring the progress for each video
The user can access these features using a website written in Ruby on Rails.
The challenge is this: I want to make the workflow app a self-sufficient application, not dependent on the existence of the web view.
To enable this separation I think that adding a network API to the workflow application is a good solution because this allows the workflow app to reside on a different server than the web server.
My question is: Which solution do you suggest for such a network API?
A few options are:
implement a simple TCP server and invent my own string based API
use some sort of REST API (I don't know if this is appropriate for this situation)
some sort of web-services solution (SOAP, XML-RPC)
another existing framework
Feel free to share your thoughts on this.
I would suggest two things:
First, use REST as your API. This allows you to write one core application with both a user interface and an API for outside applications to use.
Second, take a look at PandaStream. It's a Merb application that encodes videos from multiple formats into Flash. It has a REST API, and there's even a Rails plugin so you can integrate it with your application. It might be a good example codebase, or even a replacement for the one you're trying to build.
Hope my answer helped,
Mike
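As a rough illustration of how the two core features from the question (queueing a video, monitoring its progress) could map onto a REST-style API, here is a small sketch. It is in Python rather than Ruby purely for illustration, and the resource paths, field names, and in-memory job store are all assumptions of mine. A real version would sit behind an HTTP server and hand jobs to the encoding workflow.

```python
import itertools
import json

JOBS = {}                       # job id -> job record; stand-in for real storage
_next_id = itertools.count(1)


def handle(method: str, path: str, body: str = ""):
    """Route one request and return (status_code, json_text)."""
    parts = [p for p in path.split("/") if p]

    # POST /videos queues a new video for conversion.
    if method == "POST" and parts == ["videos"]:
        source = json.loads(body)["source"]
        job_id = next(_next_id)
        JOBS[job_id] = {"id": job_id, "source": source, "progress": 0}
        return 201, json.dumps(JOBS[job_id])

    # GET /videos/<id> reports the progress of one video.
    if method == "GET" and len(parts) == 2 and parts[0] == "videos" and parts[1].isdigit():
        job = JOBS.get(int(parts[1]))
        if job is None:
            return 404, json.dumps({"error": "no such video"})
        return 200, json.dumps(job)

    return 404, json.dumps({"error": "not found"})
```

The Rails site would then be just one client of this API, which is what keeps the workflow app independent of the web view.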
