How can I programmatically trigger an Alexa Skill intent?

I have a few smart devices that can all be controlled through Alexa (e.g. "Hey Alexa, turn on the fan to 50% speed"). Is there an API that I can use to programmatically trigger a specific Alexa Skill intent? I know there is a way to use text-to-speech to do so, but that feels very janky, and I would love to know if there is a native API for Alexa intents that can be programmatically triggered.

No, Alexa does not expose such an API. The whole point of Alexa and the Echo devices is to provide a voice interface which then calls an API to get things done. It sounds like you want an API to trigger a voice response system that then calls an API to get something done.

Related

How to send direct command from Google Home to custom smart device without app name?

I am trying to build a custom IoT device that will be controlled via a Google Home device and serve people with disabilities.
The device itself is a Tiva C LaunchPad that I program from scratch, meaning I have full control over it.
In my vision, the user will say something like: "Ok Google, press play button", and as a result the Google Home device will send a direct command of press_play_button to the IoT device, preferably via the local network.
I found the Google Actions SDK, along with the Local SDK extension, but if I understood correctly, I have to be in the app mode first ("OK Google, play {app_name}") before saying the action I want, which is inconvenient.
Is there any way to achieve this?
If not, I may give up on local network control and use some sort of webhook to send HTTP requests to my smart device, and in that case I wonder whether MQTT would be more suitable.
Thanks.
The Local SDK is an extension to the Smart Home API. If your device matches up with the device types and traits that the Smart Home API supports, then you can use that to control your device.
It has support for media players, so things like play/stop should be possible.
I have built a generic Smart Home control using MQTT to reach the device, but you have to provide an HTTP endpoint for the Google system to interface with. This takes a little thought, as you have to map MQTT's asynchronous approach onto HTTP's synchronous nature; a sketch of that mapping follows.
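A minimal sketch of that mapping, assuming a hypothetical MQTTClient interface (a real library such as CocoaMQTT would fill this role): the HTTP handler publishes the command, then suspends until the device replies on its response topic or a timeout fires.

```swift
import Foundation

// Hypothetical minimal MQTT client interface; a real library such as
// CocoaMQTT would play this role.
protocol MQTTClient {
    func publish(topic: String, payload: String)
    func subscribe(topic: String, handler: @escaping (String) -> Void)
}

// Bridges HTTP's synchronous request/response onto MQTT's asynchronous
// publish/subscribe: publish a command, then suspend until the device
// replies on its response topic or a timeout fires.
final class SmartHomeBridge {
    private let mqtt: MQTTClient
    init(mqtt: MQTTClient) { self.mqtt = mqtt }

    // Called from the HTTP endpoint that the Google fulfillment service hits.
    func execute(command: String, device: String) async throws -> String {
        try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<String, Error>) in
            let lock = NSLock()
            var finished = false
            // Resume the continuation exactly once, whichever event wins.
            let finishOnce: (Result<String, Error>) -> Void = { result in
                lock.lock()
                defer { lock.unlock() }
                guard !finished else { return }
                finished = true
                continuation.resume(with: result)
            }

            // Subscribe before publishing so the reply cannot race past us.
            mqtt.subscribe(topic: "devices/\(device)/response") { payload in
                finishOnce(.success(payload))
            }
            mqtt.publish(topic: "devices/\(device)/command", payload: command)

            // Fail the HTTP request if the device never answers.
            DispatchQueue.global().asyncAfter(deadline: .now() + 5) {
                finishOnce(.failure(URLError(.timedOut)))
            }
        }
    }
}
```

Subscribing before publishing matters here: if the device answers quickly, a late subscription would miss the reply and the HTTP request would hang until the timeout.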

Is it possible to stream voice to the Direct Line API from a native app on iOS using Swift or Objective-C?

I have a bot that leverages all the cool tech that comes with Bot Framework, e.g. LUIS, QnA Maker, Adaptive Cards, etc. The bot works well, and I can use WebChat to connect to it, ask questions, and get responses. However, I now need a native iOS (and eventually Android) app that can perform much like WebChat does, but I do not want to embed WebChat in a web control in the native app. I plan to have voice always on, leveraging something like Snowboy or Picovoice for a hotword to wake the app and send commands to the bot; users would ask things like "hey bot, what is the weather in Boston" and be presented with a result message or Adaptive Card.
Is voice streaming to the Direct Line API from Swift on iOS possible (I know most things are possible, so any pointers would be greatly appreciated)? Or am I approaching this from the wrong angle, and perhaps there is a better/easier way to achieve my goal?
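For context: classic Direct Line is a REST/WebSocket protocol that carries JSON activities rather than raw audio, so one approach is to run speech-to-text on the device and post the transcript as a message activity. A minimal sketch against the Direct Line v3 endpoints, with secret handling simplified for illustration (in production you would exchange the secret for a token server-side):

```swift
import Foundation

// Sketch: posting a recognized utterance to a bot via Direct Line v3.
struct DirectLineClient {
    let secret: String
    let base = URL(string: "https://directline.botframework.com/v3/directline")!

    // Start a conversation; returns the conversationId.
    func startConversation() async throws -> String {
        var request = URLRequest(url: base.appendingPathComponent("conversations"))
        request.httpMethod = "POST"
        request.setValue("Bearer \(secret)", forHTTPHeaderField: "Authorization")
        let (data, _) = try await URLSession.shared.data(for: request)
        let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
        guard let id = json?["conversationId"] as? String else {
            throw URLError(.badServerResponse)
        }
        return id
    }

    // Send one text activity, e.g. the output of on-device speech recognition.
    func send(text: String, conversationId: String) async throws {
        let url = base.appendingPathComponent("conversations/\(conversationId)/activities")
        var request = URLRequest(url: url)
        request.httpMethod = "POST"
        request.setValue("Bearer \(secret)", forHTTPHeaderField: "Authorization")
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        let activity: [String: Any] = [
            "type": "message",
            "from": ["id": "ios-user"],
            "text": text,
        ]
        request.httpBody = try JSONSerialization.data(withJSONObject: activity)
        _ = try await URLSession.shared.data(for: request)
    }
}
```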

Speech to Text using Twilio

We use Microsoft Bot Framework for our chatbots. We want to enable a voice channel for our bot. Is there a way to do this? Does Twilio have anything that can add speech capabilities to our bot? Our bots are exposed via WebChat components, Skype, Facebook Messenger, etc.
Twilio developer evangelist here.
There's no way within Bot Framework to add voice capabilities from Twilio; however, receiving calls works in a similar way. When someone calls your Twilio number, you receive a webhook which you can respond to with TwiML to tell Twilio what to do with the call.
To then perform things by voice action, you can <Record> the caller's response and set the transcribe parameter to true. You also need to set a transcribeCallback URL, as the transcription is done asynchronously. Once you receive that callback, the text of the transcription will be available as a parameter in the request. You could also perform the transcription yourself with a third-party service by taking the recording and sending it off.
Once you receive the transcription, you can make your decision as to the next step of the conversation and redirect the live call to the next step of your process using the REST API.
This is just a high level overview of how you might accomplish this. Let me know if it is of any help.
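To make that concrete, here is a minimal sketch of the TwiML such a webhook might return for an incoming call; the callback URL is a placeholder:

```swift
// Sketch of a webhook handler body: Twilio POSTs to your URL when a call
// arrives, and you answer with TwiML telling it to record and transcribe.
func incomingCallTwiML() -> String {
    """
    <?xml version="1.0" encoding="UTF-8"?>
    <Response>
      <Say>Please say your request after the beep.</Say>
      <Record maxLength="30"
              transcribe="true"
              transcribeCallback="https://example.com/twilio/transcription"/>
    </Response>
    """
}
```

When the transcription completes, Twilio POSTs the text to the callback as the TranscriptionText parameter; that is where you would hand it to the bot and use the REST API to redirect the live call.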
Voximal offers a product similar to Twilio's, but based on VoiceXML. The difference is that Voximal natively integrates most STT engines (Microsoft, Google, Watson, iSpeech) in the solution (you only need to set the key or the user/password to configure them). You use a builtin "text" grammar to transcribe. The processing is then very similar to Twilio's: you push the content to a chatbot engine (HTTP/XML/JSON), and you have a way to play the result with a TTS engine.
Have a look at the Parrot example (a script that repeats everything you say, using STT and TTS):
https://github.com/voximal/voicexml-examples/blob/master/parrot/parrot.vxml

How can I get Alexa working on my iOS app?

I have been checking out the Alexa Skills Kit for the past few days. I have also been poring over the documentation for both the Skills Kit and the Voice Service, but I am having a little hiccup trying to understand the flow. I have implemented one of Amazon's sample skills (the favourite colour sample) in the developer console and also wrote a sample Lambda function to handle the type of response that will be delivered. It's working in the test simulator, and what's left is basically getting Lambda working through my iOS app. However, I have the impression that I don't have to use the Voice Service. Am I wrong? I am quite confused; it would be awesome if anybody with more clarity could shed some light on the matter. If I get Lambda working, I think it will accept requests that are in a particular format. Where do I have to send the encoded audio to get a JSON response to send to the Skills Kit? To the Alexa Voice Service?
Also, I am authenticating my app using Cognito and DynamoDB. If I were to use the Alexa Voice Service, it is mentioned that the user will also have to log in to Amazon. So do I still have to work with the Login with Amazon SDK? Or is there a workaround?
Based on Amazon's documentation, there are two ways to interact with Alexa:

Companion app (mobile or website): you have a device (such as a smart speaker) that you want to add Alexa to, so you build in support for AVS. Now you need a way to authorize it and associate it with the user's account. This is the "companion app" approach: the companion app connects to your smart product and allows the user to log in, authorize the speaker to use Alexa, and connect it to their Amazon account.

AVS app (Android or iPhone): you don't have a device you need to authorize; instead, you want to speak to Alexa from within your Android/iPhone application.

It sounds like you want to implement the app through the companion method.

As far as the JSON goes, I am currently resolving that issue now (I will post an answer once I have it resolved). Basically, based on the documentation, you have to use AVFoundation to capture audio from the iPhone and send Alexa an HTTPS request carrying two message parts (one part with a JSON body and a second part with the captured audio as its body); see the sketch after the link below.
You can find a Swift example on GitHub of how to implement an iOS AVS client:
https://github.com/chintan1891/iOS-Alexa
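For reference, a hedged sketch of that request shape against the v20160207 AVS events endpoint; the endpoint host, event JSON, and part names follow the AVS documentation of that era and should be checked against the current docs:

```swift
import Foundation

// Sketch: one HTTPS request to AVS carrying two multipart parts --
// a JSON "metadata" part describing the Recognize event, and an
// "audio" part with the microphone capture (16 kHz mono PCM).
// Treat the endpoint and event shape as assumptions to verify.
func sendRecognizeEvent(accessToken: String, audioData: Data) async throws -> Data {
    let boundary = "avs-\(UUID().uuidString)"
    var request = URLRequest(url: URL(string: "https://avs-alexa-na.amazon.com/v20160207/events")!)
    request.httpMethod = "POST"
    request.setValue("Bearer \(accessToken)", forHTTPHeaderField: "Authorization")
    request.setValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")

    let metadata = """
    {"event": {"header": {"namespace": "SpeechRecognizer", "name": "Recognize",
     "messageId": "\(UUID().uuidString)", "dialogRequestId": "\(UUID().uuidString)"},
     "payload": {"profile": "NEAR_FIELD", "format": "AUDIO_L16_RATE_16000_CHANNELS_1"}}}
    """

    // Assemble the two parts by hand; URLSession has no multipart builder.
    var body = Data()
    body.append("--\(boundary)\r\n".data(using: .utf8)!)
    body.append("Content-Disposition: form-data; name=\"metadata\"\r\n".data(using: .utf8)!)
    body.append("Content-Type: application/json; charset=UTF-8\r\n\r\n".data(using: .utf8)!)
    body.append(metadata.data(using: .utf8)!)
    body.append("\r\n--\(boundary)\r\n".data(using: .utf8)!)
    body.append("Content-Disposition: form-data; name=\"audio\"\r\n".data(using: .utf8)!)
    body.append("Content-Type: application/octet-stream\r\n\r\n".data(using: .utf8)!)
    body.append(audioData)
    body.append("\r\n--\(boundary)--\r\n".data(using: .utf8)!)
    request.httpBody = body

    let (data, _) = try await URLSession.shared.data(for: request)
    return data // multipart response containing directives such as SpeechSynthesizer.Speak
}
```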

Is it possible to integrate the Alexa Skills Kit with an iOS mobile app to trigger an event?

For example, say I want to launch the camera within my iOS app to take a photo. Can I utilize ASK and the iPhone microphone to understand a user's speech command ("launch camera"), launch the camera, and trigger a function within iOS?
Short answer: No.
Long answer: Yes, but not really a good use case.
The Alexa Skills Kit is used to create an endpoint (skill) that the Alexa service can invoke to do something. The Alexa Skills Kit does not cover the acquisition and recognition of the speech. That all happens within the Alexa service.
The Alexa Voice Service is a newly released kit that lets you do the other end. You can use it to integrate anything with a microphone into the Alexa service. Using this kit, you can stream raw sound recorded from your microphone up to the service, and it will do the recognition and invocation of services.
So, in theory, you could use the Alexa Voice Service to enable your iPhone as a device that can send sound and invoke Alexa skills. You could use the Alexa Skills Kit to enable a web service that can be invoked by Alexa. You could then do some sort of push/pull notification between your web service and your phone to trigger whatever you want. That way you would be able to invoke your Alexa skill with your iPhone and have it do something on your iPhone.
That is probably not where you really want to go. The better way to implement that use case is to use whatever speech recognition and trigger options iOS gives you, rather than going through Alexa.
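For that native route, a minimal sketch using Apple's Speech framework (iOS 10+); the "launch camera" phrase check and the command callback are illustrative, and the app needs the NSMicrophoneUsageDescription and NSSpeechRecognitionUsageDescription Info.plist keys:

```swift
import AVFoundation
import Speech

// Sketch: on-device speech recognition with the Speech framework,
// dispatching a local action when a known phrase is heard.
final class VoiceCommandListener {
    private let audioEngine = AVAudioEngine()
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private var request: SFSpeechAudioBufferRecognitionRequest?
    private var task: SFSpeechRecognitionTask?

    func start(onCommand: @escaping (String) -> Void) {
        SFSpeechRecognizer.requestAuthorization { status in
            guard status == .authorized else { return }
            DispatchQueue.main.async { self.beginRecognition(onCommand: onCommand) }
        }
    }

    private func beginRecognition(onCommand: @escaping (String) -> Void) {
        try? AVAudioSession.sharedInstance().setCategory(.record, mode: .measurement)
        try? AVAudioSession.sharedInstance().setActive(true)

        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true
        self.request = request

        // Feed microphone buffers into the recognition request.
        let inputNode = audioEngine.inputNode
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try? audioEngine.start()

        task = recognizer?.recognitionTask(with: request) { result, _ in
            guard let text = result?.bestTranscription.formattedString.lowercased() else { return }
            if text.contains("launch camera") {
                onCommand("launch_camera") // e.g. present the camera UI here
            }
        }
    }
}
```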
