I started recognition using:
_micClient = [SpeechRecognitionServiceFactory createMicrophoneClient:SpeechRecognitionMode_ShortPhrase withLanguage:locale withKey:API_KEY withProtocol:(self)];
Everything worked as intended.
But the second time, using the same call with another locale, recognition still happens in the first language.
E.g., the app launches and starts recognition with "hi-IN":
Application Name: com.XXXX.XXXX/1.0.1
STS: https://api.cognitive.microsoft.com/sts/v1.0/issueToken
Refreshing token /sts/v1.0/issueToken
Initializing Audio Services
Initializing Speech Services
No application id provided to controller
GetIdentityPropertyValue 3
Useragent Value iOS Assistant (iOS; 11.2.6;Mobile;ProcessName/AppName=com.XXXX.XXXX/1.0.1;DeviceType=Near;SpeechClient=1.0.161216)
Url: 'https://websockets.platform.bing.com/ws/speech/recognize'
Locale: 'hi-IN'
Application Id: ''
Version: 4.0.150429
UserAuthorizationToken:
ServerLoggingLevel: 1
Initiating websocket connection. m_connection=0x0 host=websockets.platform.bing.com port=443
Auth token status: 200
Authorization token hr 0 'Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzY29wZSI6Imh0dHBzOi8vc3BlZWNoLnBsYXRmb3JtLmJpbmcuY29tIiwic3Vic2NyaXB0aW9uLWlkIjoiMGZhNGQ5NmZjODc5NDA1ZmIyZDc3ZGVmY2NiOTc0MzUiLCJwcm9kdWN0LWlkIjoiQmluZy5TcGVlY2guUzAiLCJjb2duaXRpdmUtc2VydmljZXMtZW5kcG9pbnQiOiJodHRwczovL2FwaS5jb2duaXRpdmUubWljcm9zb2Z0LmNvbS9pbnRlcm5hbC92MS4wLyIsImF6dXJlLXJlc291cmNlLWlkIjoiL3N1YnNjcmlwdGlvbnMvZjJmNWJmMGYtZTRlOC00NDY1LTg4ZDQtYmMyMGFiYTNmMTIzL3Jlc291cmNlR3JvdXBzL1NwZWVjaFJlY29nbml0aW9uL3Byb3ZpZGVycy9NaWNyb3NvZnQuQ29nbml0aXZlU2VydmljZXMvYWNjb3VudHMvU3BhcmtsaW5nU3BlZWNoIiwiaXNzIjoidXJuOm1zLmNvZ25pdGl2ZXNlcnZpY2VzIiwiYXVkIjoidXJuOm1zLnNwZWVjaCIsImV4cCI6MTUyMjkyODc1OH0.PTBvhZ18q__-PCJRtWLr-KkQ99yt4c-mnrd2kdyOn1c'
Successfully initialized client connection
Create ImpressionId: fff94b5814ae9a097f0d749c137069d9
Create ImpressionId: 01eb6b249fc1d90e37ba61a1a2d64fe9
Reset
Create ImpressionId: e69685c047daf66ef0887614b2a35fc4
ImpressionId: b53b312c6dfd13609e5b1cf2952f0af6
Adding requestId: 'cadbb055d5d4ef5c669d210a5fed2bf7' for 'text/cu.client.context'
Subscribing request [cadbb055d5d4ef5c669d210a5fed2bf7]
Audio stream created
Adding requestId: 'e9012ec9fe3d9ee9e8a075e6274eda06' for 'audio/x-wav'
Subscribing request [e9012ec9fe3d9ee9e8a075e6274eda06]
Audio Stream Created
Creating transcoder 2
Upgrade request returned with HTTP status code: 101
Web socket handshake completed
CU Client connected
ConnectionStateChanged
Microphone permissions: 0
Sent first chunk of audio stream, requestId='e9012ec9-fe3d-9ee9-e8a0-75e6274eda06'
Speech recording started
Speech recording started
OnDataAvailable: 81 => type 1
Received message: 'audio.stream.response'
Response request id: 'e9012ec9-fe3d-9ee9-e8a0-75e6274eda06'
Response impression: 'b53b312c6dfd13609e5b1cf2952f0af6'
LanguageGeneration OK
Partial : आप
OnDataAvailable: 81 => type 1
Received message: 'audio.stream.response'
Response request id: 'e9012ec9-fe3d-9ee9-e8a0-75e6274eda06'
Response impression: 'b53b312c6dfd13609e5b1cf2952f0af6'
LanguageGeneration OK
Partial : आपके
OnDataAvailable: 81 => type 1
Received message: 'audio.stream.response'
Response request id: 'e9012ec9-fe3d-9ee9-e8a0-75e6274eda06'
Response impression: 'b53b312c6dfd13609e5b1cf2952f0af6'
LanguageGeneration OK
Partial : आप किस
OnDataAvailable: 81 => type 1
Received message: 'audio.stream.response'
Response request id: 'e9012ec9-fe3d-9ee9-e8a0-75e6274eda06'
Response impression: 'b53b312c6dfd13609e5b1cf2952f0af6'
LanguageGeneration OK
Partial : आप कैसे
OnDataAvailable: 01 => type 1
Received message: 'audio.stream.response'
Response request id: 'e9012ec9-fe3d-9ee9-e8a0-75e6274eda06'
Response impression: 'b53b312c6dfd13609e5b1cf2952f0af6'
LanguageGeneration OK
Sending audio stream endpoint, requestId='e9012ec9-fe3d-9ee9-e8a0-75e6274eda06'
Sent audio stream endpoint, requestId='e9012ec9-fe3d-9ee9-e8a0-75e6274eda06'
signaling OnAudioEvent(AUDIO_EVENT_RECORD_STOP)
The app then initialises a new microphone client with "en-US".
Now when recognition starts:
Create ImpressionId: 0eed72b0b8019f0d7647b4d5d1adc8c6
Reset
Canceling request [cadbb055d5d4ef5c669d210a5fed2bf7]
Canceling request [e9012ec9fe3d9ee9e8a075e6274eda06]
Create ImpressionId: ff9306c014eba5a9da0fa5979269bced
ImpressionId: 04bfd4c2fce0e631c6b6d9f3d16877f2
Adding requestId: 'b9148688143a0a9526df6bd9e31110d1' for 'text/cu.client.context'
Subscribing request [b9148688143a0a9526df6bd9e31110d1]
Audio stream created
Adding requestId: '158b5857d7f60759687076b3bfa9d2bc' for 'audio/x-wav'
Subscribing request [158b5857d7f60759687076b3bfa9d2bc]
Audio Stream Created
Creating transcoder 2
Microphone permissions: 0
Speech recording started
Speech recording started
Sent first chunk of audio stream, requestId='158b5857-d7f6-0759-6870-76b3bfa9d2bc'
Sending audio stream endpoint, requestId='158b5857-d7f6-0759-6870-76b3bfa9d2bc'
Sent audio stream endpoint, requestId='158b5857-d7f6-0759-6870-76b3bfa9d2bc'
signaling OnAudioEvent(AUDIO_EVENT_RECORD_STOP)
Speech recording stopped
Speech recording stopped
OnDataAvailable: 81 => type 1
Received message: 'audio.stream.response'
Response request id: '158b5857-d7f6-0759-6870-76b3bfa9d2bc'
Response impression: '04bfd4c2fce0e631c6b6d9f3d16877f2'
LanguageGeneration OK
Partial : तो
OnDataAvailable: 81 => type 1
Received message: 'audio.stream.response'
Response request id: '158b5857-d7f6-0759-6870-76b3bfa9d2bc'
Response impression: '04bfd4c2fce0e631c6b6d9f3d16877f2'
LanguageGeneration OK
originating error 0x80070057
ERROR: No Reco
originating error 0x80070057
I couldn't find the locale in the log the second time, and note that the partial responses are still in "hi-IN". Is there any way to clear the old language configuration?
The websocket connection must be closed between one utterance and the next if you wish to change the language. Simply waiting 3 minutes between utterances with no activity will close the connection. Calling AudioStop() should also close the connection. If you already tried calling AudioStop() and it did not work, we will ensure this is fixed in upcoming versions of the released API.
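A minimal sketch of that teardown in Swift, assuming the Objective-C factory bridges roughly as shown; endMicAndRecognition and startMicAndRecognition are the microphone client's public start/stop calls in the iOS sample, and micClient and API_KEY come from the question's own setup:

func restartRecognition(withLocale locale: String) {
    // Stop the current utterance; this should also close the websocket.
    micClient?.endMicAndRecognition()
    micClient = nil

    // Create a fresh client so the new locale is sent on a new connection.
    micClient = SpeechRecognitionServiceFactory.createMicrophoneClient(
        SpeechRecognitionMode_ShortPhrase,
        withLanguage: locale,
        withKey: API_KEY,
        withProtocol: self)
    micClient?.startMicAndRecognition()
}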
Related
Before requesting audio data, AVPlayer requests byte range 0-1 from FFmpeg.
FFmpeg gives a 200 response, but AVPlayer requires a 206 response.
This causes the request to fail, and the audio can't be played.
Expected behavior:
Tracks play when streaming through FFmpeg.
Current behavior:
When trying to stream with FFmpeg we get "Operation Stopped".
Sample FFmpeg command:
ffmpeg -i "/path/to/audio/track.mp3" -vn -strict -2 -acodec pcm_u8 -f wav -listen 1 -seekable 1 http://localhost:8090/restream.wav
Player Log:
Error Domain=AVFoundationErrorDomain Code=-11850 "Operation Stopped" UserInfo={NSLocalizedFailureReason=The server is not correctly configured., NSLocalizedDescription=Operation Stopped, NSUnderlyingError=0x600003bcc4b0 {Error Domain=NSOSStatusErrorDomain Code=-12939 "(null)"}}
!av_interleaved_write_frame(): Broken pipe
!Connection to tcp://localhost:8090 failed: Connection refused
!Connection to tcp://localhost:8090 failed: Connection refused
!Connection to tcp://localhost:8090 failed: Connection refused
!Error writing trailer of http://localhost:8090/restream.wav: Broken pipe
This error is defined by Apple as:
+"The HTTP server sending the media resource is not configured as expected. This might mean that the server does not support byte range requests."
And summarised nicely in this StackOverflow post:
When AVPlayerItem receives a video URL, it does the following:
1. Sends an HTTP request with Range: bytes=0-1.
2. If the response code is 206 and 1 byte of data is returned, it proceeds to step 3; if not, an AVErrorServerIncorrectlyConfigured error occurs.
3. Continues sending HTTP requests to download segments of the whole duration; the response code for the video data must also be 206.
In my situation, when the range [0-1] request was sent, the server side gave me a 200 OK response, so the error occurred.
Network Log:
GET /file.wav HTTP/1.1
Host: localhost:1234
X-Playback-Session-Id: F72F1139-6F4C-4A22-B334-407672045A86
Range: bytes=0-1
Accept: */*
User-Agent: AppleCoreMedia/1.0.0.18C61 (iPhone; U; CPU OS 14_3 like Mac OS X; en_us)
Accept-Language: en-us
Accept-Encoding: identity
Connection: keep-alive
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Transfer-Encoding: chunked
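For comparison, a server that satisfies AVPlayer would answer the same probe with a 206 and an explicit byte range, along these lines (header values illustrative, assuming a WAV resource of known total length):

HTTP/1.1 206 Partial Content
Content-Type: audio/wav
Accept-Ranges: bytes
Content-Range: bytes 0-1/178400
Content-Length: 2

The chunked Transfer-Encoding in the log above is the tell: FFmpeg's HTTP listener does not know the total length up front, so it cannot produce the Content-Range that a 206 requires.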
Reproduce using this sample app:
This can also be reproduced with standard FFmpeg by pointing the player at a local or remote FFmpeg URL.
Can we solve this by making changes to FFmpeg or AVPlayer?
I asked this question on the FFmpeg mailing list, and it seems it's not possible:
If I understand your request correctly, you are not asking for a change of a return type (you could probably do that yourself) but for the implementation of byte range requests. I don't think this is possible at all with FFmpeg.
It's also not possible to get AVPlayer to ignore the byte range request:
it is an Apple standard that media providers need to support HTTP 1.1 with the Range header (check out the iTunes Store guidelines for podcasts, for example), so I wouldn't expect it anytime soon
SO question: Is there a way to stop the AVPlayer sending a Range HTTP header field
I get a timeout exception just after HttpClientFactory's "Start processing HTTP request POST" log message.
When exactly does this message occur? Is it logged before or after the server is called?
info: System.Net.Http.HttpClient.MyClient.LogicalHandler[100]
Start processing HTTP request GET https://api.github.com/repos/aspnet/docs/branches
Critical : System.Net.Http.HttpRequestException : Connection timed out
I am trying to build the iOS client side of Alexa Voice Services. I am stuck at the networking layer.
Interaction with the Alexa server requires creating mainly two streams over a single connection. After creating the connection with the server, you open a downchannel stream, which stays in a half-closed state. The downchannel is used by the server to send directives such as notifications and alarms (e.g., if you ask Alexa to set an alarm for 5 minutes from now, you will receive a directive on this channel to play the alarm after 5 minutes). The downchannel remains open as long as the session with Alexa is active. Another stream is started whenever the user begins an audio session with Alexa; this stream is used to send audio chunks until the end directive is received, and it is then closed. More details here.
I am trying to implement this using streamTask(withHostName:port:) of URLSession. For a simple HTTP request this works: I send the request bytes, read the response, and just need to parse the header and body as per the standard.
let request = "GET / HTTP/1.1 \r\nHost: localhost\r\n\r\n"
streamTask = session.streamTask(withHostName: "localhost", port: 8080)
let getRequest = request.data(using: .utf8)!
streamTask.write(getRequest, timeout: 60) {
error in
debugPrint("Error is \(error?.localizedDescription) )")
if error == nil {
self.streamTask.readData(ofMinLength: 4096, maxLength: 4096, timeout: 20) {
data, bool, error in
if let extractedData = data {
let dataString = String(data: extractedData, encoding: .utf8)
debugPrint("Data received is \(dataString!)")
}
debugPrint("Bool = \(bool) error = \(error?.localizedDescription)")
}
}
}
streamTask.resume()
The data I read back is:
Data received is HTTP/1.1 200 \r\nDate: Tue, 26 Mar 2019 06:08:33 GMT\r\nServer: Apache/2.4.38 (Unix)\r\nContent-Length: 226\r\nConnection: close\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>it works</title>\n</head><body>\n<h1>it workst</h1>\n<p>it works.<br />\n</p>\n</body></html>\n"
But when I try to create an HTTP/2 stream task for the request below, I can read nothing.
:method = GET
:scheme = https
:path = /{{API version}}/directives
authorization = Bearer {{YOUR_ACCESS_TOKEN}}
1. Is it because HTTP/2 is a binary protocol instead of plain text, with its headers compressed using HPACK?
2. If yes, will I need to implement header compression, frame creation, and writing to the stream as per the specification myself, or is there some configuration in URLSessionConfiguration that will do this for me even if I specify plain headers as in a URLRequest?
3. Can you suggest some libraries that would help me achieve this?
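One option worth noting: URLSession's ordinary data tasks negotiate HTTP/2 via ALPN and perform the HPACK compression and framing internally, deriving :method, :scheme, and :path from the URLRequest, so nothing has to be hand-rolled over a raw stream task. A minimal sketch of opening the downchannel this way; the host and API version path are taken from the AVS documentation, and the delegate wiring is an assumption:

import Foundation

final class DownchannelClient: NSObject, URLSessionDataDelegate {
    private var session: URLSession!

    func openDownchannel(accessToken: String) {
        let config = URLSessionConfiguration.default
        config.timeoutIntervalForRequest = .infinity  // the downchannel is long-lived
        session = URLSession(configuration: config, delegate: self, delegateQueue: nil)

        var request = URLRequest(url: URL(string: "https://avs-alexa-na.amazon.com/v20160207/directives")!)
        request.httpMethod = "GET"
        request.setValue("Bearer \(accessToken)", forHTTPHeaderField: "authorization")
        session.dataTask(with: request).resume()
    }

    // Directive payloads arrive incrementally on the half-closed stream.
    func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data) {
        print("Received \(data.count) bytes of directive data")
    }
}

The audio-session stream can then be a separate upload task on the same session, which URLSession multiplexes over the existing HTTP/2 connection.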
I was trying to explore Microsoft's Bing Speech Recognition API for iOS (https://github.com/Microsoft/Cognitive-Speech-STT-iOS). I followed all the steps in the README. The app runs and seems to detect speech from the microphone and send it to the Microsoft server, but an error appears in the logs and the button disables itself without showing any text in the app.
Here are the logs from the console. Please help.
Application Name: com.Microsoft.SpeechRecognitionServerExample/1.0.1
Refreshing token /sts/v1.0/issueToken
Initializing Audio Services
Initializing Speech Services
No application id provided to controller
GetIdentityPropertyValue 3
Useragent Value iOS Assistant (iOS; 10.0.1;Mobile;ProcessName/AppName=com.Microsoft.SpeechRecognitionServerExample/1.0.1;DeviceType=Near;SpeechClient=1.0.160824)
Url: 'https://websockets.platform.bing.com/ws/speech/recognize'
Locale: 'en-us'
Application Id: ''
Version: 4.0.150429
UserAuthorizationToken:
ServerLoggingLevel: 1
Initiating websocket connection. m_connection=0x0 host=websockets.platform.bing.com port=443
Auth token status: 200
Authorization token hr 0 'Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzY29wZSI6Imh0dHBzOi8vc3BlZWNoLnBsYXRmb3JtLmJpbmcuY29tIiwic3Vic2NyaXB0aW9uLWlkIjoiZmNlYThiNTE3NGZmNDdkODk4ZWZiN2ZjYzNjZjM4MzMiLCJwcm9kdWN0LWlkIjoiQmluZy5TcGVlY2guUHJldmlldyIsImNvZ25pdGl2ZS1zZXJ2aWNlcy1lbmRwb2ludCI6Imh0dHBzOi8vYXBpLmNvZ25pdGl2ZS5taWNyb3NvZnQuY29tL2ludGVybmFsL3YxLjAvIiwiYXp1cmUtcmVzb3VyY2UtaWQiOiIiLCJpc3MiOiJ1cm46bXMuY29nbml0aXZlc2VydmljZXMiLCJhdWQiOiJ1cm46bXMuc3BlZWNoIiwiZXhwIjoxNDc1NDg3MDQ1fQ.4LF0gyhXU0T1iwDahlYlanKJ_wVjOOLhLyalFeDqIzA'
Successfully initialized client connection
Create ImpressionId: f43586374f8d8e455e48b090f2aaa5cd
Create ImpressionId: f627a603daa0cd4a16f59d9581118cdd
Reset
Create ImpressionId: dad2c09e82b8eaaacec3b20890de9bb8
ImpressionId: a9776c0ba8dc079c7d9658e6746a46ea
Adding requestId: 'bf17eb3310b2e39944d85dfe3d2868eb' for 'text/cu.client.context'
Subscribing request [bf17eb3310b2e39944d85dfe3d2868eb]
Waiting for connection/send completion.
Audio stream created
Adding requestId: 'dd3d699f39139326bedbb1dafd8816fb' for 'audio/x-wav'
Subscribing request [dd3d699f39139326bedbb1dafd8816fb]
Audio Stream Created
Creating transcoder 2
Microphone permissions: 0
Upgrade request returned with HTTP status code: 101.
Web socket handshake completed
CU Client connected
ConnectionStateChanged
Sent first chunk of audio stream, requestId='dd3d699f-3913-9326-bedb-b1dafd8816fb'
Received message: 'audio.stream.response'
Response request id: 'dd3d699f-3913-9326-bedb-b1dafd8816fb'
Response impression: 'a9776c0b-a8dc-079c-7d96-58e6746a46ea'
LanguageGeneration OK
Received message: 'audio.stream.response'
Response request id: 'dd3d699f-3913-9326-bedb-b1dafd8816fb'
Response impression: 'a9776c0b-a8dc-079c-7d96-58e6746a46ea'
LanguageGeneration OK
Received message: 'audio.stream.response'
Response request id: 'dd3d699f-3913-9326-bedb-b1dafd8816fb'
Response impression: 'a9776c0b-a8dc-079c-7d96-58e6746a46ea'
LanguageGeneration OK
Received message: 'audio.stream.response'
Response request id: 'dd3d699f-3913-9326-bedb-b1dafd8816fb'
Response impression: 'a9776c0ba8dc079c7d9658e6746a46ea'
Response Conversation: 'ab7082f3c4de8feb0d6210e8ec07dcb3'
LanguageGeneration OK
Sending audio stream endpoint, requestId='dd3d699f-3913-9326-bedb-b1dafd8816fb'
Sent audio stream endpoint, requestId='dd3d699f-3913-9326-bedb-b1dafd8816fb'
signaling OnAudioEvent(AUDIO_EVENT_RECORD_STOP)
originating error 0x80070057
Client UPL: 32880000 ticks
originating error 0x8000ffff
originating error 0x80070057
originating error 0x80004005
originating error 0x80004005
Failed to 'hresult', HR=80004005, WebSocket connection failed
No messages to retry, closing.
Closing web socket channel
CU Client connection dropped
ConnectionStateChanged
WebSocket closed unexpectedly, status: 0
Web socket channel already closed.
I am currently working on an iOS application in which I have to upload multiple images to Box, after the authentication process. When I select more than 5 images at a time, it always uploads 5 images and the remaining ones stop uploading. Error messages appear in the log like:
[BoxAPIOperation connectionDidFinishLoading:]: BoxAPIOperation <BoxAPIMultipartToJSONOperation: 0xa64eb20> POST https://upload.box.com/api/2.1/files/content did finsh loading
2014-08-25 12:53:36.484 Pbm[13499:4903] -[BoxAPIOperation finish]: BoxAPIOperation <BoxAPIMultipartToJSONOperation: 0xa64eb20> POST https://upload.box.com/api/2.1/files/content finished with state 3
2014-08-25 12:53:36.485 Pbm[13499:4903] -[BoxAPIOperation connection:didFailWithError:]: BoxAPIOperation <BoxAPIMultipartToJSONOperation: 0xa642d00> POST https://upload.box.com/api/2.1/files/content did fail with error Error Domain=NSURLErrorDomain Code=-1021 "request body stream exhausted" UserInfo=0xa334970 {NSErrorFailingURLStringKey=https://upload.box.com/api/2.1/files/content, NSErrorFailingURLKey=https://upload.box.com/api/2.1/files/content, NSLocalizedDescription=request body stream exhausted, NSUnderlyingError=0xa32fb70 "request body stream exhausted"}
2014-08-25 12:53:36.485 Pbm[13499:4903] response : (null)
2014-08-25 12:53:36.485 Pbm[13499:4903] error : Error Domain=NSURLErrorDomain Code=-1021 "request body stream exhausted" UserInfo=0xa334970 {NSErrorFailingURLStringKey=https://upload.box.com/api/2.1/files/content, NSErrorFailingURLKey=https://upload.box.com/api/2.1/files/content, NSLocalizedDescription=request body stream exhausted, NSUnderlyingError=0xa32fb70 "request body stream exhausted"}
I don't understand the cause of the "request body stream exhausted" error. Please help me out if anybody has implemented uploading to Box.
Thanks in advance.
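A minimal sketch of one common mitigation for NSURLErrorDomain -1021, assuming the failures come from too many multipart bodies being streamed at once: serialize the uploads so only one request body is in flight. uploadImageToBox and selectedImages are hypothetical stand-ins for your existing Box SDK call and picker results, not Box SDK APIs:

import Foundation

// Serialize uploads: one in-flight multipart body at a time.
let uploadQueue = OperationQueue()
uploadQueue.maxConcurrentOperationCount = 1

for image in selectedImages {
    uploadQueue.addOperation {
        // Block this background operation until the async upload completes,
        // so the next upload only starts after this one finishes.
        let done = DispatchSemaphore(value: 0)
        uploadImageToBox(image) { error in
            if let error = error {
                print("Upload failed: \(error)")
            }
            done.signal()
        }
        done.wait()
    }
}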