I'm using on-device speech-to-text in my app, and I was hoping to use contextualStrings to help it recognize some more niche vocabulary. However, it never recognizes any of those words. I am following all of the guidelines outlined here, but the words are never recognized while requiresOnDeviceRecognition is set to true. They do get recognized when I set requiresOnDeviceRecognition to false, however. Is the speech-to-text engine simply better with it off, or does contextualStrings require an internet connection to work? I couldn't find any documentation saying it needs one.
Example:
...
private let contextualStrings = ["hipaa", "cologuard"]
recognitionRequest.contextualStrings = contextualStrings
"Hipaa" always becomes "hyppa" for some reason (which isn't even a word in the system vocabulary?)
Interestingly, if I say "HIPAA violation", HIPAA actually does get recognized by the on-device speech-to-text engine. This seems to be the only time it ever works.
I was investigating various Speech Recognition strategies and I liked the idea of grammars as defined in the Web Speech spec. It seems that if you can tell the speech recognition service that you expect “Yes” or “No”, the service could more reliably recognize a “Yes” as “Yes”, a “No” as “No”, and hopefully also be able to say “it didn’t sound like either of those!”.
However, in SFSpeechRecognitionRequest, I only see taskHint with values from SFSpeechRecognitionTaskHint of confirmation, dictation, search, and unspecified.
I also see SFSpeechRecognitionRequest.contextualStrings, but it seems to be for a different purpose. I.e., I think I should put brands/trademark type things in there. Putting “Yes” and “No” in wouldn’t make those words any more likely to be selected because they already exist in the system dictionary (this is an assumption I’m making based on the little the documentation says).
Is there a way with the API to do something more like grammars or, even more simply, just provide a list of expected phrases so that the speech recognition is more likely to come up with a result I expect instead of similar-sounding gibberish/homophones? Does contextualStrings perhaps increase the likelihood that the system chooses one of those strings instead of just expanding the system dictionary? Or maybe I’m taking the wrong approach and am supposed to enforce the grammar on my own, enumerating over SFSpeechRecognitionResult.transcriptions until I find one matching an expected word?
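For what it's worth, that last option (post-filtering candidate transcriptions against a list of expected phrases) can be sketched independently of the iOS API. In this Python sketch the candidate list simply stands in for SFSpeechRecognitionResult.transcriptions, and the function name and threshold are made up for illustration:

```python
# Rank a recognizer's candidate transcriptions against a list of expected
# phrases and accept only a sufficiently close match; otherwise report
# "didn't sound like either of those" by returning None.
from difflib import SequenceMatcher

def match_expected(candidates, expected, threshold=0.8):
    """Return the expected phrase best matched by any candidate, or None."""
    best_phrase, best_score = None, 0.0
    for cand in candidates:
        for phrase in expected:
            score = SequenceMatcher(None, cand.lower(), phrase.lower()).ratio()
            if score > best_score:
                best_phrase, best_score = phrase, score
    return best_phrase if best_score >= threshold else None

# Candidates as a recognizer might return them, best hypothesis first.
print(match_expected(["yes", "yeah s"], ["Yes", "No"]))  # -> Yes
print(match_expected(["banana"], ["Yes", "No"]))         # -> None
```

Using a fuzzy ratio rather than exact equality lets near-misses still map onto an expected phrase, while genuinely unrelated output is rejected instead of being forced into the grammar.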
Unfortunately, I can’t test these APIs myself; I am merely researching the viability of writing a native iOS app and do not have the necessary development environment.
How would I setup an AirPlay video & audio receiver for iOS (and then save the stream as a video file)?
I know that this goes against Apple's guidelines, this is not intended for AppStore distribution. I am fine using private APIs.
Note: I am using Pythonista (with objc_util), so, if possible, answers written in Python will be very helpful, although Swift/Objective-C is still greatly appreciated.
I assume you have this idea after the recent incredible (but short-lived) Vidyo app went on the App Store. I managed to snag a copy before it was taken down, but recreating this effect in Pythonista is certainly desirable.
You could start with the unofficial AirPlay specification, which describes how the AirPlay protocol works. Specifically, you want the section on Screen Mirroring. From this, you may be able to put together an AirPlay interface.
I don't think objc_util will be necessary for this; Python provides some pretty low-level networking modules.
From reading the spec, you'll need to set up a server. Flask probably can't handle it; it's likely too high-level. It looks like the AirPlay streaming traffic doesn't even stay within the realm of valid HTTP requests.
I suspect you'll have a lot of trouble with this. The AirPlay spec (especially screen mirroring) is pretty complicated. You'll need to let your server receive an H.264-encoded video livestream (the same format Apple uses to livestream its events), and you'll also need to set up a system for synchronizing your video based on data sent through a separate stream. On top of all this, you'll need to provide some endpoints that return data about your server.
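To give a feel for the raw-socket work involved, here is a minimal Python sketch of the receiving side. The 128-byte header size, the field offsets, the port number, and the payload-type value are all assumptions taken from my reading of the unofficial spec; verify every one of them there before relying on this:

```python
# Minimal sketch of an AirPlay-mirroring-style receiver: the stream is a
# sequence of binary packets, each with a fixed-size header carrying a
# little-endian payload length. This is not HTTP, so we read straight
# from the socket rather than using an HTTP framework.
import socket
import struct

HEADER_SIZE = 128   # assumed fixed header size (check the unofficial spec)
MIRROR_PORT = 7100  # commonly cited mirroring port (assumption)

def read_exact(conn, n):
    """Read exactly n bytes from a socket, or raise on disconnect."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-packet")
        buf += chunk
    return buf

def parse_header(header):
    """Pull payload size and type out of a packet header (assumed layout)."""
    payload_size = struct.unpack_from("<I", header, 0)[0]
    payload_type = struct.unpack_from("<H", header, 4)[0]
    return payload_size, payload_type

def serve_once():
    """Accept one sender and dump its video payloads to a raw H.264 file."""
    with socket.create_server(("", MIRROR_PORT)) as srv:
        conn, addr = srv.accept()
        with conn, open("capture.h264", "wb") as out:
            while True:
                header = read_exact(conn, HEADER_SIZE)
                size, ptype = parse_header(header)
                payload = read_exact(conn, size)
                if ptype == 0:  # assumed: type 0 = video bitstream
                    out.write(payload)
```

The point is simply that you read fixed-size binary headers straight off a socket instead of parsing HTTP requests, which is why something like Flask is the wrong tool here.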
I suppose it's entirely possible that Vidyo found some private APIs that make this easier. I don't see any clear reason why iOS would implement an AirPlay server somewhere, but it's not outside the realm of possibility. If this exists, I'm not aware of it. You'll have to do more research.
Good luck ;)
All,
Apologies in advance - this question might be too open-ended for SO.
Anyway... A friend of mine (an engineer and entrepreneur) is in the process of building a high-tech piece of lab equipment. He's asked me about the feasibility of building an iPhone/iPad/iPod application that would allow users to control the device via Bluetooth, so I'm helping him gather some information. I'm hoping to get a few pointers on how to get started. Specifically:
Would this require a native app, or could this be accomplished with HTML5 (with or without something like PhoneGap?)
Can you point me to a good primer on Bluetooth networking? Everything I've found assumes a VERY high level of pre-existing knowledge.
What are the basics of how something like this is accomplished? Is there a single, established protocol for how one device "controls" another, or is Bluetooth more like SSL, just a pipe that allows you to convey any type of message?
I realize this question is incredibly broad and detailed - so I'm not really looking for specifics. But obvious Google searches don't turn up much, and I'm otherwise having a hard time finding a good starting point.
Thanks in advance.
You can communicate via Bluetooth in two ways. One is using the Bluetooth Low Energy capabilities of iOS 5 and newer iPhones/iPads:
https://developer.apple.com/library/ios/#documentation/CoreBluetooth/Reference/CoreBluetooth_Framework/_index.html#//apple_ref/doc/uid/TP40011295
Unfortunately the documentation is sparse and will require some hacking away. If you choose this route I would consider starting here and learning as much as you can about how the protocols work before hacking into the framework:
http://developer.bluetooth.org/gatt/services/Pages/ServicesHome.aspx
The limitation of this route is that it might not be best for sending a lot of data. I have only built things that sent simple commands, which it works great for.
The other option is the External Accessory framework. This will require you to get an MFi license from Apple (not fun). You will also need to pay royalties. But it will do what you want. You won't need to concern yourself much with the underlying protocols if you use this; the framework provides a friendly API for processing streams.
http://developer.apple.com/library/ios/#documentation/ExternalAccessory/Reference/ExternalAccessoryFrameworkReference/_index.html
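Either way, Bluetooth here is essentially a pipe; the command protocol between the app and the instrument is yours to define. A common choice is a small length-prefixed frame, sketched below in Python so it stays iOS-agnostic (the opcode value and the temperature command are made up for illustration):

```python
# A tiny length-prefixed command framing for a raw byte stream:
# 1-byte opcode, 2-byte big-endian payload length, then the payload.
import struct

def encode_command(opcode, payload=b""):
    """Frame a command for writing to the output stream."""
    return struct.pack(">BH", opcode, len(payload)) + payload

def decode_command(frame):
    """Inverse of encode_command; returns (opcode, payload)."""
    opcode, length = struct.unpack_from(">BH", frame, 0)
    payload = frame[3:3 + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return opcode, payload

# Hypothetical instrument command: set temperature in tenths of a degree.
SET_TEMP = 0x01
frame = encode_command(SET_TEMP, struct.pack(">H", 370))  # 37.0 C
assert decode_command(frame) == (SET_TEMP, struct.pack(">H", 370))
```

With External Accessory, the bytes produced by such an encoder are simply what you write to the accessory's output stream; the accessory firmware parses the same framing on its end.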
Is a BlackBerry feasible for writing code with?
Additionally: are there some programming languages which are especially easy or hard to type on its keyboard?
I don't know BB well, but I'm trying to evaluate it for writing code while commuting. AFAIK, it doesn't have a "full-featured" keyboard, and there are other phones with a more complete one. But I'd also like to be able to write text single-handedly, and I believe that's possible on a BB (?).
Note: that's not a question about writing code for a BlackBerry, but rather on one.
(If that's not a good site for such a question, please let me know where could I pose it. I've let myself put it here based on the "matters that are unique to the programming profession" entry in the FAQ.)
No. Consider a netbook if you need something cheap you can write code with on-the-go. I've tried too, but mobile phones (no matter how smart they are) can't be used for writing code (unless you're willing to spend an hour writing a "Hello, World!" app).
So, I wrote a quick little app for the iPhone that takes in an HTTP URL and plays the .mp4 video located at that URL. It does more than that, of course, but that's the meat of it. Naturally, I wanted to have it on more than just a single mobile platform, so I decided to target BlackBerry next.
However, I'm running into a lot of problems with the BlackBerry environment. First of all, I learn that I can only download 256 KB files! I learn how to set that variable in my MDS simulator, and I learn that this is NOT a production solution, because any end users will have to have their BES or MDS admin change the setting there. Then, I find a video of less than 2 MB I can practice with. Going to it in the browser prompts me to save the video rather than playing it in the browser like I expected. After saving the video, it refuses to play, saying it's the wrong format.
So. I can't find a reference to whether BlackBerry can stream with HTTP. I've heard it can use RTSP, though, and heard some rumors that it can't use HTTP, which would really suck. I also can't find a reference to what format BlackBerry uses, although I can find a million programs that will convert one file to the 'BlackBerry' format.
Surely SOMEONE must have tried to stream video with the BlackBerry before. How did they go about doing so? Is it just a hopeless pipedream? Will I have to go with RTSP?
Sorry for the lack of a concrete question. I'm just really lost, and I hate how so many tutorials or forum posts seem to assume I know the capabilities of the Blackberry.
Edit: I finally found out that the .3gp format, which I'd never heard of, is what BlackBerry uses. Still have no idea how to stream videos off the web, though. I found "How To - Play video within a BlackBerry smartphone application", which seemed useful, but the code doesn't work if you give it a URL, even though it claims it does.
While you are correct that the tutorial claims the code will load any valid URL, the API documentation for javax.microedition.media.Manager.createPlayer specifies "A locator string in URI syntax that describes the media content", which may not, in fact, be the same as any valid URL. Luckily, createPlayer will also take an InputStream and a String specifying the content type. So you should be able to open the URL as documented in the API for HttpConnection, grab the content type string, and open the input stream to create the player.
I will admit that I haven't done that, but it would be my next step.
BTW remember to run your HttpConnection fetch on a thread separate from the application event thread.
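The suggested sequence (open the URL, read its Content-Type, then hand the still-open body stream to the player) looks like this in Python terms; the actual BlackBerry code would use HttpConnection and Manager.createPlayer instead, and the data: URL below is just a stand-in for a real media URL:

```python
# Illustration of the fetch-then-stream idea: grab the content type from
# the HTTP response, then pass the open body stream on to the player.
# On BlackBerry this maps to HttpConnection.getType() plus
# Manager.createPlayer(InputStream, String).
import urllib.request

def open_media(url):
    """Return (content_type, body_stream) for a media URL."""
    resp = urllib.request.urlopen(url)  # on BlackBerry: off the event thread
    return resp.headers.get_content_type(), resp

# A data: URL stands in for a real http:// media URL in this sketch.
ctype, stream = open_media("data:video/3gpp;base64,AAAA")
print(ctype)  # -> video/3gpp
# BlackBerry equivalent: Manager.createPlayer(stream, ctype)
```

The content type string is exactly what createPlayer's second argument wants, so this sidesteps the "locator string in URI syntax" restriction entirely.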