What are the endpoints for the new Microsoft speech service WebSocket APIs? - translation

I want to use the new MS Speech Translation API, but I am working with Go, so there is no SDK. I have a WebSockets implementation for the previous Translator Speech API, so raw WebSockets are not an issue.
The documentation states that it uses WebSockets, but I was unable to find the endpoints in the documentation. Does anyone know what the WS endpoints are, and their path/header parameters?
EDIT:
The documentation also says: "If you already have code that uses Bing Speech or Translator Speech via WebSockets, you can update it to use the Speech service. The WebSocket protocols are compatible, only the endpoints are different." But the new endpoints are missing.

After digging into the binaries of the client SDKs, I found the Speech Translation API endpoint to be wss://<REGION>.s2s.speech.microsoft.com/speech/translation/cognitiveservices/v1
Another problem is that the WebSocket protocol is NOT compatible, despite what the documentation says. The good news is that after some experimentation I found that the new Speech Translation WS API uses the same protocol as the old Bing Speech WS API, except for the URL query parameters. The Bing Speech API has a language parameter, while the Speech Translation preview API has from, to, voice and features. The from and to parameters work as expected; you can even send several languages in to (comma-separated), although the TTS is then missing. I have not tried voice. The features parameter appears to do nothing: partial results, timing info and TTS are always present.
The responses are also different, but similar to Bing Speech: they have headers, and there are several different JSON payloads. Just observe the raw strings.
As this is a preview API it can change at any time.
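For anyone else doing this from Go, here is a minimal sketch of opening that connection with github.com/gorilla/websocket. The endpoint and the from/to query parameters are the ones described above; the subscription-key and connection-ID headers are assumptions carried over from the old Bing Speech protocol, so verify them against a traffic capture of an official SDK.

```go
// Minimal sketch (not a verified implementation): open the translation
// WebSocket found above. The "from" and "to" query parameters match the
// experiments in this answer; the auth and connection-ID headers are
// assumptions based on the old Bing Speech protocol.
package main

import (
	"log"
	"net/http"
	"net/url"

	"github.com/gorilla/websocket"
)

func main() {
	u := url.URL{
		Scheme:   "wss",
		Host:     "westeurope.s2s.speech.microsoft.com", // <REGION>.s2s.speech.microsoft.com
		Path:     "/speech/translation/cognitiveservices/v1",
		RawQuery: "from=en-US&to=de-DE",
	}

	h := http.Header{}
	h.Set("Ocp-Apim-Subscription-Key", "YOUR_SPEECH_KEY")              // assumed auth header
	h.Set("X-ConnectionId", "7F4B2CEA0F7843E4B0A15D7C35A6E1AB")        // assumed: one GUID per connection, as in Bing Speech

	conn, _, err := websocket.DefaultDialer.Dial(u.String(), h)
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()

	// From here on, the framing follows the old Bing Speech WS protocol:
	// text messages with headers (path, x-requestid, content-type) and
	// binary messages carrying audio chunks.
	_, msg, err := conn.ReadMessage()
	if err != nil {
		log.Fatalf("read: %v", err)
	}
	log.Printf("first message: %s", msg)
}
```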

There haven't been substantial changes to the WebSocket protocol, so the old documentation should be reasonably accurate.
The Microsoft Cognitive Services Speech SDK doesn't support Go yet; it is on the roadmap, but will not happen this calendar year.
thx
Wolfgang

Related

Gcloud, ruby on rails, speech to text

I am trying to use Google's new speech to text api: https://cloud.google.com/speech/docs/rest-tutorial . They currently have python and node.js examples.
Unfortunately, my application is RoR. I was looking through https://github.com/GoogleCloudPlatform/gcloud-ruby , which is a gem that interacts with Google cloud services (but not Speech). I was hoping that I could use the two together to come up with a working solution, but my knowledge of how to use APIs is limited.
Enough background, my questions are:
Does anyone know if Google is going to put out a Ruby version of the speech to text api? If yes, is there a timeline?
If I am impatient, how would I go about using their current APIs? By this I mean, is there a good resource for learning how to use generic APIs?
The gcloud-ruby gem now supports google-cloud-speech.
To address your other questions: there are no language-specific versions of the APIs themselves. They are all HTTP APIs (either REST or gRPC), so they can be used from anything that can make HTTP requests. It can be tricky to use them directly, though, because of things like how authentication is handled, which is why client libraries exist for different languages.
If you want to learn more about how to use the REST APIs directly, first take a look at the doc 'Using OAuth 2.0 for Web Server Applications', which has examples for Ruby and raw HTTP/REST, to find out how to authenticate manually.
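To illustrate that the REST API really can be called from anything with an HTTP client, here is a rough sketch in Go (the same shape applies in Ruby, since the API itself is language-agnostic). The speech:recognize endpoint, request fields, and API-key auth are assumptions based on the v1 REST reference, so check them against the current docs and prefer OAuth/service-account auth in production.

```go
// Rough sketch of calling the Speech-to-Text REST API directly over HTTP.
// Endpoint and request fields follow the v1 "speech:recognize" method;
// verify against the current documentation before relying on them.
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
)

func main() {
	audio, err := ioutil.ReadFile("hello.wav") // e.g. 16 kHz LINEAR16 mono
	if err != nil {
		log.Fatal(err)
	}

	reqBody := map[string]interface{}{
		"config": map[string]interface{}{
			"encoding":        "LINEAR16",
			"sampleRateHertz": 16000,
			"languageCode":    "en-US",
		},
		"audio": map[string]string{
			"content": base64.StdEncoding.EncodeToString(audio),
		},
	}
	payload, _ := json.Marshal(reqBody)

	url := "https://speech.googleapis.com/v1/speech:recognize?key=YOUR_API_KEY" // placeholder key
	resp, err := http.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, _ := ioutil.ReadAll(resp.Body)
	fmt.Println(string(body)) // JSON containing the recognized transcript(s)
}
```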

iOS App and youtube client

Some time ago I looked for information about integrating YouTube into an iOS application.
Now I need to do this again, so I started searching on Google.
After a short time I am already confused.
Can I use this iOS YouTube sample,
or do I have to use the YouTube Data API (v3)?
And what about this?
Short answer:
The API refers to the HTTP interface for consuming Google's functionality.
One can use these APIs by issuing HTTP requests directly, according to the
specification of the API, or by using one of the client libraries. The client libraries are a layer on top of HTTP that issue the HTTP requests and parse the responses. They give a simpler interface for invoking the API (e.g. using standard function calls in the given programming language rather than building HTTP requests) and they also simplify a lot of the complex parts such as authentication, refreshing tokens, etc.
Long answer:
An application programming interface or API is the "contract" between a provider of some functionality and the consumer of some functionality that allows both the provider and consumer of that functionality to interoperate without knowledge of the underlying implementation of the other party. This "contract" includes such things as the number and types of the inputs, the names of the inputs (if it is required to invoke the functionality), any constraints on the inputs, the expected outputs, any constraints on the outputs, failure modes, etc.
Google provides a number of HTTP-based APIs for accessing functionality from its services. Its services implement these APIs, which are consumed by issuing HTTP requests and reading the HTTP responses. HTTP is a convenient protocol to implement, because every device and language can speak HTTP; however, it is not always the most convenient to use as a developer. In many cases, the inputs and outputs you want are objects, not HTTP requests and HTTP responses. And, in many cases, matching function signatures in the language of your choosing and type-checking of inputs is more convenient than memorizing the HTTP request paths or manually serializing/deserializing your objects to HTTP requests or content sent within the request. That is where the client libraries come in. Whereas the HTTP APIs are implemented on Google's servers, the client libraries are libraries that developers include in their application and are distributed to the devices on which those applications run. The client libraries issue the HTTP requests and interpret the responses, and provide a more convenient programming language-specific wrapper, for a variety of different programming languages.
The Data API link that you provided documents the HTTP-based API, whereas the sample application uses the client library (which invokes the HTTP-based API under the hood). The last link you provided, the Cloud Endpoints for iOS one, is unrelated to what you are trying to do; it documents a mechanism called Cloud Endpoints, a feature of App Engine, that allows developers to create their own HTTP APIs using Google's infrastructure and to auto-generate client libraries that wrap these HTTP APIs (much as Google auto-generates the client libraries for its own HTTP APIs).
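To make the "issuing HTTP requests directly" case concrete, here is a small sketch (in Go, purely for illustration, since the wire format is the same from any language, including from an iOS app) that calls the Data API v3 search.list method over plain HTTP. The endpoint and field names are taken from the v3 reference and should be checked against the current docs.

```go
// Sketch of consuming the YouTube Data API (v3) directly over HTTP,
// without a client library. Field names follow the v3 search.list
// reference; verify against the current documentation.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
)

type searchResponse struct {
	Items []struct {
		ID struct {
			VideoID string `json:"videoId"`
		} `json:"id"`
		Snippet struct {
			Title string `json:"title"`
		} `json:"snippet"`
	} `json:"items"`
}

func main() {
	q := url.Values{}
	q.Set("part", "snippet")
	q.Set("q", "golang tutorial")
	q.Set("maxResults", "5")
	q.Set("key", "YOUR_API_KEY") // placeholder API key

	resp, err := http.Get("https://www.googleapis.com/youtube/v3/search?" + q.Encode())
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var sr searchResponse
	if err := json.NewDecoder(resp.Body).Decode(&sr); err != nil {
		log.Fatal(err)
	}
	for _, item := range sr.Items {
		fmt.Printf("%s  https://www.youtube.com/watch?v=%s\n", item.Snippet.Title, item.ID.VideoID)
	}
}
```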
Here's a sample app you can use to get started with the YouTube APIs on iOS.
There is also a helper library for playing YouTube videos in iOS.

Google Maps Geolocation API over http

I'm implementing the Google Maps Geolocation API as explained here. https://developers.google.com/maps/documentation/business/geolocation/
The problem is that my server is implemented in Delphi, and using HTTPS there is quite complex. Is there a way to use this API over plain HTTP?
I know there are some security issues, but this feature in particular will only exist in a cache subsystem where no sensitive data is ever sent or stored.

What to consider first when designing a meta-search engine using Erlang, Mnesia and Yaws?

Can someone explain to me what to consider first when designing a meta-search engine using Erlang, Mnesia and the Yaws web server? This engine should have SMS capability but I am still wondering how I am going to incorporate this feature...
For the meta-search engine, you need the REST or Ajax APIs from Google, Yahoo and Bing. Below I am providing examples which you may use from your back-end HTTP-capable library or your front-end JavaScript. I personally use mochiweb and yaws appmods.
For example: Google has an Ajax search API which works like this:
http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=computers
Hitting that URL will give you a JSON Object which contains several search responses. In this case, the search term is "computers"
Yahoo has what it calls the BOSS APIs. An example of a Yahoo REST search query using BOSS is below:
For an XML result:
http://boss.yahooapis.com/ysearch/web/v1/animals?appid=APPID&format=xml&start=1&count=3
For a JSON result:
http://boss.yahooapis.com/ysearch/web/v1/animals?appid=APPID&format=json&start=1&count=3
Look carefully at the whole HTTP GET query and you will notice something they call an APPID. You get this when you register with them here. I cannot give you my APPID; you will have to get your own, paste it in there, and you will be good to go. Yahoo also has something more powerful called YQL. In the above query, the search term is: "animals"
Bing has an API for you as well, but again you will need an APPID:
http://api.bing.net/json.aspx?AppId=APPID&Query=love&Sources=Web&Version=2.0&Market=en-us&Web.Count=10
Above, the search term is: "love"
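As a sketch of what the back-end side of one of these calls looks like, here is a small Go example that fetches the Google Ajax Search URL quoted above and decodes the JSON generically. Note that these particular APIs (Ajax Search, BOSS v1, the old Bing API) have since been deprecated or retired, so treat the URLs as an illustration of the pattern rather than as working endpoints.

```go
// Sketch: fetch a JSON search API and decode the result generically.
// The URL below is the Google Ajax Search API quoted above; it (and the
// Yahoo BOSS / Bing URLs) may no longer work, so this shows the pattern only.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
)

func searchJSON(base string, params url.Values) (map[string]interface{}, error) {
	resp, err := http.Get(base + "?" + params.Encode())
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var result map[string]interface{}
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return nil, err
	}
	return result, nil
}

func main() {
	params := url.Values{}
	params.Set("v", "1.0")
	params.Set("q", "computers")

	result, err := searchJSON("http://ajax.googleapis.com/ajax/services/search/web", params)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", result)
}
```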
About the Meta Search Engine
You have a web page where people enter search queries. You use JavaScript (JSONP); JSONP can be implemented with any of your favourite JavaScript frameworks, e.g. jQuery, Ext JS, Dojo, Prototype, etc.
Then you have to parse the XML or JSON responses from the three sources (Google, Yahoo and Bing) and build an appropriate display so your users can navigate the results.
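For the server-side variant (e.g. behind a yaws appmod), the shape is a concurrent fan-out to the three engines followed by a merge of the results. The sketch below shows that shape in Go purely for illustration; the URLs and APPID values are placeholders taken from the examples above, and the same structure would be written in Erlang in the answer's own stack.

```go
// Sketch of the fan-out/merge shape of a meta-search back end: query the
// three engines concurrently and collect whatever JSON each returns.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"sync"
)

func fetchJSON(name, url string, out chan<- map[string]interface{}, wg *sync.WaitGroup) {
	defer wg.Done()
	resp, err := http.Get(url)
	if err != nil {
		out <- map[string]interface{}{"source": name, "error": err.Error()}
		return
	}
	defer resp.Body.Close()

	var body interface{}
	_ = json.NewDecoder(resp.Body).Decode(&body)
	out <- map[string]interface{}{"source": name, "result": body}
}

func main() {
	// Placeholder URLs standing in for the Google / Yahoo / Bing queries above.
	sources := map[string]string{
		"google": "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=computers",
		"yahoo":  "http://boss.yahooapis.com/ysearch/web/v1/computers?appid=APPID&format=json",
		"bing":   "http://api.bing.net/json.aspx?AppId=APPID&Query=computers&Sources=Web&Version=2.0",
	}

	out := make(chan map[string]interface{}, len(sources))
	var wg sync.WaitGroup
	for name, u := range sources {
		wg.Add(1)
		go fetchJSON(name, u, out, &wg)
	}
	wg.Wait()
	close(out)

	for r := range out {
		fmt.Printf("%v\n", r) // merge/rank/display as needed
	}
}
```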
About the SMS part
SMS capability is attained by using an SMS gateway. There are several open and closed source SMS gateways. The most powerful of them all is the one built on Erlang/OTP called OSERL, but to test it you need a direct connection with an SMSC at one of your local service providers: you need a port on their SMSC, a user name and a password. Another one that is better for development purposes is NowSMS, because it has capabilities for USSD, modem internet communication, SMSC connectivity, HTTP 1.1 and HTTP 1.0, and configuration of two-way SMS messaging between a web app and the SMS gateway. Go to their site, grab the trial version, follow the documentation, and then configure two-way messaging between your web app and the gateway. Since NowSMS is not free, you can also try Kannel; it is open source, but you will need help from the community to set it up on your Unix or Linux box.
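For the web-app-to-gateway direction, gateways such as Kannel expose a simple HTTP "sendsms" interface. Below is a rough sketch of calling it; the host, credentials, and phone number are placeholders, and while 13013 and /cgi-bin/sendsms are Kannel's documented defaults, your own gateway configuration may differ, so check the Kannel user guide.

```go
// Rough sketch: sending an outbound SMS through Kannel's HTTP sendsms
// interface from a web application. Host, port and credentials are
// placeholders — adjust to your gateway configuration.
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"net/url"
)

func sendSMS(to, text string) error {
	params := url.Values{}
	params.Set("username", "smsuser") // placeholder
	params.Set("password", "smspass") // placeholder
	params.Set("to", to)
	params.Set("text", text)

	resp, err := http.Get("http://localhost:13013/cgi-bin/sendsms?" + params.Encode())
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	body, _ := ioutil.ReadAll(resp.Body)
	fmt.Printf("gateway reply: %s\n", body) // e.g. an "accepted for delivery" status line
	return nil
}

func main() {
	if err := sendSMS("+256700000000", "hello from the meta-search engine"); err != nil {
		log.Fatal(err)
	}
}
```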
More on incorporating SMS capability in Web Applications can be found:
Here
I also once asked a question on Stack Overflow related to developing a powerful search engine using Erlang, Mnesia and the Yaws web server, and I got plenty of good answers and responses.
Please CLICK ME!
I hope this helps, though I am not sure about the SMS part.

Use Google Gears geolocation from a python app

I'd like to use the Google geolocation API in my app, written in Python. My problem is that Google provides a JSON interface (easily useable from Python) but from http://code.google.com/p/gears/wiki/GeolocationAPI I see that the API "is published to allow developers to provide their own network location server for use through the Gears API. Google's network location server is only to be used through the Gears API. See section 5.3 of the Gears Terms of Service at [address]."
It is a very strange situation: there is a perfectly good JSON interface, but I cannot use it directly; I have to go through Google Gears instead. How can I do that from a Python app?
For example, I see that the geolocation service provided by Firefox calls the JSON API directly. Why is Firefox able to do that?
Thanks,
Alessio Palmero Aprosio
Google has deprecated Gears entirely, as the geolocation feature is now standard in modern browsers (for certain values of "standard").
The pylocation module may provide the information you need. It can output the geolocation data in text, JSON, or XML.
