I have been playing around with Cloud Speech API and noticed that it returns punctuation for English but not for Japanese when enableAutomaticPunctuation is set to true.
Does anybody know which languages Google Cloud Speech's automatic punctuation supports?
Speech-to-Text can provide punctuation in transcription text for the 'en-US' language only.
EDIT, May 2020: Speech-to-Text now supports automatic punctuation in more languages.
Update: as of May 2020, several languages support automatic punctuation, including Japanese. Google publishes a full list of supported languages, with the features supported for each, here.
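For illustration, here is how the JSON body of a recognize request asking for automatic punctuation might be built. This is a sketch: the field names are taken from the public v1 REST docs, and the bucket URI is a placeholder.

```python
import json

def build_request(language_code: str, audio_uri: str) -> dict:
    """Build a recognize request body that asks for automatic punctuation."""
    return {
        "config": {
            "languageCode": language_code,
            # The flag from the question; whether punctuation actually comes
            # back depends on the language's feature support.
            "enableAutomaticPunctuation": True,
        },
        "audio": {"uri": audio_uri},  # placeholder gs:// URI
    }

body = build_request("ja-JP", "gs://my-bucket/sample.flac")
print(json.dumps(body, indent=2))
```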
Related
I'm trying to build a software that will identify the language being spoken.
My plan is to use Google's Cloud Speech-to-Text to transcribe the speech, then put the transcription through the Cloud Translation API to detect its language.
However, since Speech-to-Text requires a language code to be set before transcribing, I was planning to run it multiple times with different language codes and compare the "confidence" values; the most confident transcription would then be passed to the Cloud Translation API.
Would this be the ideal approach, or are there other options?
You could check the "Detecting language spoken automatically" page in the Google Cloud Speech documentation.
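The multiple-pass idea from the question can be sketched as follows. Note that `recognize` here is a hypothetical stand-in that returns canned results, not a real Speech-to-Text call; a real implementation would issue one API request per candidate language code.

```python
def recognize(audio: bytes, language_code: str) -> dict:
    # Stand-in for a Speech-to-Text call: returns canned top alternatives
    # so the selection logic below can be demonstrated.
    fake_results = {
        "en-US": {"transcript": "hello world", "confidence": 0.62},
        "ja-JP": {"transcript": "こんにちは世界", "confidence": 0.91},
        "fr-FR": {"transcript": "bonjour", "confidence": 0.48},
    }
    return fake_results[language_code]

def best_transcription(audio: bytes, languages: list[str]) -> tuple[str, dict]:
    """Run recognition once per language; keep the highest-confidence result."""
    results = {lang: recognize(audio, lang) for lang in languages}
    best = max(results, key=lambda lang: results[lang]["confidence"])
    return best, results[best]

lang, result = best_transcription(b"", ["en-US", "ja-JP", "fr-FR"])
print(lang, result["transcript"])
```

One caveat with this design: confidence scores from runs with different language codes are not strictly comparable, so the winner is only a heuristic.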
I'm using AVSpeechSynthesizer, but it supports only a few languages, and not Vietnamese. Is there any way around this?
I would prefer an engine that works offline and is cheap, but anything will do if there is no other way.
Is there an option to automatically detect the spoken language using Google Cloud Platform Machine Learning's Speech API?
https://cloud.google.com/speech/docs/languages lists the supported languages; the user needs to set this parameter manually to perform speech-to-text.
Thanks, Mahesh
As of last month, Google has added support for detecting the spoken language to its Speech-to-Text API: Google Cloud Speech v1p1beta1.
It's a bit limited, though: you have to provide a list of up to 3 probable language codes, and it is said to be supported only for the voice command and voice search modes. It's useful if you have a clue which other languages may be in your audio.
From their docs:
"alternative_language_codes[]: string
Optional. A list of up to 3 additional BCP-47 language tags, listing possible alternative languages of the supplied audio. See Language Support for a list of the currently supported language codes. If alternative languages are listed, the recognition result will contain recognition in the most likely language detected, including the main language_code. The recognition result will include the language tag of the language detected in the audio. NOTE: This feature is only supported for Voice Command and Voice Search use cases, and performance may vary for other use cases (e.g., phone call transcription)."
Requests to Google Cloud Speech API require the following configuration parameters: encoding, sampleRateHertz and languageCode.
https://cloud.google.com/speech/reference/rest/v1/RecognitionConfig
Thus, the Google Cloud Speech API cannot automatically detect the language used; the languageCode parameter configures the service to recognize speech in that specific language.
If you had in mind a parallel with Google Cloud Translation API, where the input language is automatically detected, please consider that automatically detecting the language used in an audio file requires much more bandwidth, storage space and processing power than in a text file. Also, Google Cloud Speech API offers Streaming Speech Recognition, a real-time speech-to-text service, where the languageCode parameter is especially required.
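For reference, the three required parameters from the linked RecognitionConfig might look like this in a request config; the specific values here (FLAC, 16 kHz, en-US) are illustrative assumptions.

```python
# Minimal sketch of the required RecognitionConfig parameters.
config = {
    "encoding": "FLAC",          # audio codec of the supplied file
    "sampleRateHertz": 16000,    # sample rate of the audio
    "languageCode": "en-US",     # must be chosen up front; no auto-detection in v1
}
```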
I would like to convert an English word to its phonetic transcription. Text conversion only. For example, if I have the words "hello how are you", this would be converted to /ˈheˈləʊ haʊ ə jʊ/. Does anyone know of an API? Please share any demo links.
Try WikSpeak on SourceForge:
https://sourceforge.net/projects/wikspeak/
WikSpeak is a tool that allows non-native English speakers to analyze the correlation between the pronunciation and spelling of English words. The program is a simple and fast graphical interface that can retrieve the phonetic transcription (IPA) and the pronunciation of any English word, while avoiding the annoying process of browsing dictionaries. Its most outstanding features are the time saved in determining the pronunciation of English words and the ease of understanding phonetic transcription.
WikSpeak is a highly recommended tool for anyone from the beginner to the advanced non-native English speaker.
I read a few papers about machine translation but did not understand them well.
The language models (in Google Translate) use phonetics and machine learning, as best as I can tell.
My question, then: is it possible to convert an Arabic word that is phonetically spelled in English into the user's intended Arabic word?
For instance, the word 'Hadith' is an English phonetic spelling of the Arabic word 'حديث'. Can I programmatically go from 'Hadith' to the Arabic?
Thanks to the Wiki article, I found there's an entire field of work in this area: transliteration. There was a Google API for this, but it was deprecated in 2011 and moved to the Google Input Tools service.
The simplest answer is Buckwalter transliteration, but at first glance a 1:1 mapping doesn't seem like a good enough idea.
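To show what the 1:1 scheme looks like: in Buckwalter transliteration each Arabic letter maps to exactly one Latin symbol, so حديث is spelled "Hdyv", not "Hadith". The sketch below implements only a small subset of the table for illustration; it does not handle the vowel-filled English phonetic spellings from the question, which is exactly why a plain 1:1 mapping falls short.

```python
# Partial Buckwalter table (Latin symbol -> Arabic letter); illustrative only.
BUCKWALTER = {
    "A": "\u0627",  # ا alif
    "b": "\u0628",  # ب baa
    "t": "\u062A",  # ت taa
    "v": "\u062B",  # ث thaa
    "H": "\u062D",  # ح hah
    "d": "\u062F",  # د dal
    "y": "\u064A",  # ي yaa
}

def to_arabic(buckwalter_text: str) -> str:
    """Map each Buckwalter symbol to its Arabic letter, one to one."""
    return "".join(BUCKWALTER.get(ch, ch) for ch in buckwalter_text)

print(to_arabic("Hdyv"))  # → حديث
```

Going from "Hadith" to حديث instead requires dropping or inferring the English vowels, which is a genuine transliteration/modeling problem rather than a table lookup.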
I am going to see whether there's a way to call Google Input Tools, even at the CLI level, because their online demo works very well.