Other causes for DirectShow "no combination of intermediate filters could be found" errors? - delphi

I have a Delphi 6 application that uses the DSPACK DirectShow component library. Currently I am getting the error "no combination of intermediate filters could be found" when I attempt to connect the Capture pin on an audio capture device to the Input pin of another filter. I believe I am setting the media formats correctly. I have an error trap, and in that trap I explicitly query both pins for the exact media format they are set to, in case there is an incongruity. When I do this, both pins come back with the exact same WAV format:
format tag: 1
number of channels: 1
bits per sample: 16
sample rate: 8000
That matches what I set both filters to, yet I am getting an error that (as far as I know) usually indicates a format incompatibility. Has anyone run into this error before and know what I might be doing wrong, or what other kinds of tests/inspections I can do?
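Roughly what my error trap does, sketched here in C++ (my real code goes through the DSPACK wrappers, but it boils down to the same interfaces; the capture pin exposes IAMStreamConfig):

#include <dshow.h>   // link against strmiids.lib for the interface IIDs
#include <cstdio>

void DumpPinFormat(IPin *pPin)
{
    IAMStreamConfig *pCfg = NULL;
    if (SUCCEEDED(pPin->QueryInterface(IID_IAMStreamConfig, (void **)&pCfg)))
    {
        AM_MEDIA_TYPE *pmt = NULL;
        if (SUCCEEDED(pCfg->GetFormat(&pmt)) && pmt->pbFormat != NULL)
        {
            // The WAVEFORMATEX block carries the four values listed above.
            const WAVEFORMATEX *pwf = (const WAVEFORMATEX *)pmt->pbFormat;
            printf("tag=%u channels=%u bits=%u rate=%lu\n",
                   pwf->wFormatTag, pwf->nChannels,
                   pwf->wBitsPerSample, pwf->nSamplesPerSec);
            CoTaskMemFree(pmt->pbFormat);
            CoTaskMemFree(pmt);
        }
        pCfg->Release();
    }
}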

It turns out the error was being caused by the media format I was returning from my push source audio filter. I had the wrong sub-type, which was not compatible with the other filters in my filter graph (such as the Capture filter), and that was what triggered the "no combination of intermediate filters could be found" error from DirectShow. See the "UPDATE" note in my thread on media formats for full details:
Correct Media Type settings for a DirectShow filter that delivers Wav audio data?
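For reference, here is the shape of the media type my push source should have been offering, sketched in C++ (the DSPACK/Delphi record fields map onto this one-for-one); the subtype assignment is the line that was wrong in my filter:

#include <dshow.h>
#include <mmreg.h>

void FillPcmMediaType(AM_MEDIA_TYPE *pmt)
{
    // 8000 Hz, mono, 16-bit PCM: the format both pins reported above.
    WAVEFORMATEX *pwf = (WAVEFORMATEX *)CoTaskMemAlloc(sizeof(WAVEFORMATEX));
    pwf->wFormatTag      = WAVE_FORMAT_PCM;   // format tag 1
    pwf->nChannels       = 1;
    pwf->nSamplesPerSec  = 8000;
    pwf->wBitsPerSample  = 16;
    pwf->nBlockAlign     = (pwf->nChannels * pwf->wBitsPerSample) / 8;
    pwf->nAvgBytesPerSec = pwf->nSamplesPerSec * pwf->nBlockAlign;
    pwf->cbSize          = 0;

    pmt->majortype            = MEDIATYPE_Audio;
    pmt->subtype              = MEDIASUBTYPE_PCM;   // the field I had wrong
    pmt->formattype           = FORMAT_WaveFormatEx;
    pmt->bFixedSizeSamples    = TRUE;
    pmt->bTemporalCompression = FALSE;
    pmt->lSampleSize          = pwf->nBlockAlign;
    pmt->pUnk                 = NULL;
    pmt->cbFormat             = sizeof(WAVEFORMATEX);
    pmt->pbFormat             = (BYTE *)pwf;
}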

Related

Can't get stats while using sout with libvlc

I want to display a stream and monitor the video stats of the displayed stream while recording it using libvlc. When I use sout + duplicate to record the stream while displaying it, I can only get the demux_bitrate stat from the displayed stream using the libvlc_media_get_stats function. I am looking to get decoded_video, displayed_pictures, etc. as well.
I've tried using the duplicate module to make this happen, but I can't seem to make it work; I'm not sure whether what I want to do is supported. The code below is tweaked from the https://wiki.videolan.org/Documentation:Modules/display/ example for transcoding a stream while displaying the original version.
:sout=#duplicate{dst='transcode{vcodec=h264}:std{access=file,mux=ts,dst=c:\junk\test.mp4}',dst=display}
The stream displays and the file is generated, but the only valid stat is demux_bitrate, which seems like the stat that would be accessible from the non-display stream rather than the displayed version.
Display and save with Transcoding
:sout=#duplicate{dst=display,dst="transcode{vcodec=h264}:standard{access=file,mux=mp4,dst=c:\junk\test.mp4}"}
Display and save without Transcoding
:sout=#duplicate{dst=display,dst=standard{access=file,mux=mp4,dst=c:\junk\test.mp4}}
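If it helps to rule out the API side, this is a minimal C++ sketch of driving the same sout chain and polling the stats through the libvlc C API (the input path is a placeholder and error handling is omitted):

#include <vlc/vlc.h>
#include <chrono>
#include <cstdio>
#include <thread>

int main()
{
    libvlc_instance_t *vlc = libvlc_new(0, NULL);
    libvlc_media_t *media =
        libvlc_media_new_path(vlc, "c:\\junk\\input.ts");   // placeholder input
    libvlc_media_add_option(media,
        ":sout=#duplicate{dst=display,"
        "dst=standard{access=file,mux=mp4,dst=c:\\junk\\test.mp4}}");

    libvlc_media_player_t *player = libvlc_media_player_new_from_media(media);
    libvlc_media_player_play(player);

    // Poll the stats a few times while the stream plays.
    for (int i = 0; i < 10; ++i)
    {
        std::this_thread::sleep_for(std::chrono::seconds(1));
        libvlc_media_stats_t stats;
        if (libvlc_media_get_stats(media, &stats))
            printf("demux_bitrate=%f decoded_video=%d displayed_pictures=%d\n",
                   stats.f_demux_bitrate, stats.i_decoded_video,
                   stats.i_displayed_pictures);
    }

    libvlc_media_player_stop(player);
    libvlc_media_player_release(player);
    libvlc_media_release(media);
    libvlc_release(vlc);
    return 0;
}

If decoded_video and displayed_pictures stay at zero here as well, that points at the duplicate chain not exposing the display branch's counters rather than at how the stats are being read.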

Google cloud speech very inaccurate and misses words on clean audio

I am using Google Cloud Speech through Python and finding that many transcriptions are inaccurate and miss several words. This is a simple script I'm using to return a transcript of an audio file, in this case 'out307.wav':
import io

from google.cloud import speech

client = speech.SpeechClient()

# Read the raw audio bytes.
with io.open('out307.wav', 'rb') as audio_file:
    content = audio_file.read()

audio = speech.types.RecognitionAudio(content=content)
config = speech.types.RecognitionConfig(
    enable_word_time_offsets=True,
    language_code='en-US',
    audio_channel_count=1)

# Request a synchronous transcription and print the top alternative.
response = client.recognize(config, audio)
for result in response.results:
    alternative = result.alternatives[0]
    print(u'Transcript: {}'.format(alternative.transcript))
This returns the following transcript:
to do this the tensions and suspicions except
This is very far off from what the actual audio says (I've uploaded it at https://vocaroo.com/i/s1zdZ0SOH1Ki). The audio is a .wav and very clear, with no background noise. This is worse than average: in some cases it will get the transcription fully correct on a 10-second audio file, or it may miss just a couple of words. Is there anything I can do to improve results?
This is weird. I tried your audio file with your code and got the same result, but if I change the language_code to "en-UK" I am able to get the full response.
I work for Google Cloud and I have created a public issue for you here; you can track updates there.

Value of type 'AVCaptureFileOutput' has no member 'delegate'

The documentation https://developer.apple.com/reference/avfoundation/avcapturefileoutput indicates a delegate property exists for AVCaptureFileOutput.
But the following code
let vfo = AVCaptureFileOutput()
vfo.delegate = self
gives the error "Value of type 'AVCaptureFileOutput' has no member 'delegate'".
I am looking to use an AVCaptureFileOutputDelegate with an AVCaptureMovieFileOutput instance.
Any pointers would be helpful.
Follow the link to the delegate property on the page you quoted (or look at the #ifs around it in the header file), and you'll notice that property is for macOS only, not iOS. Thus, when you're in a project targeting iOS, that property doesn't exist.
iOS doesn't let you both receive sample buffers during capture and record to a file with the same session -- you can have an AVCaptureVideoDataOutput or an AVCaptureMovieFileOutput, but not both. If you just want delegate callbacks about movie file capture progress, use startRecording(toOutputFileURL:recordingDelegate:) and adopt AVCaptureFileOutputRecordingDelegate instead. If you want sample buffers, use AVCaptureVideoDataOutput to receive them and AVAssetWriter for lower-level file output.
Thank you for the pointer to AVAssetWriter. I was able to find the RosyWriter sample at https://developer.apple.com/library/content/samplecode/RosyWriter/Introduction/Intro.html. Modifying captureOutput:didOutputSampleBuffer: to capture the audio averagePowerLevel did the trick: I get a recorded movie and simultaneous audio levels.
But is there a more stripped-down example of its use? My attempts to strip out the renderers, which do the video manipulation, have only broken the sample.

What to do with NIL response for legacy filter 'Wav Dest' in Delphi 6 DSPACK program?

I am trying to create a Delphi 6 program with DSPACK that records audio from the PC input devices (Windows XP) and then writes the captured audio to a MS format WAV file. The problem I am having is that I am getting NIL back when I try to get the legacy filter named 'WAV Dest':
CapEnum.SelectGUIDCategory(CLSID_LegacyAmFilterCategory);
filWaveDest.BaseFilter.Moniker := CapEnum.GetMoniker(CapEnum.FilterIndexOfFriendlyName('WAV Dest'));
filWaveDest.BaseFilter.Moniker contains NIL after these calls. How can I correct this? Obviously the subsequent code that attempts to write the WAV data captured using filWaveDest fails.
Wav Dest is not a standard DirectShow filter. It is an example filter in the SDK. Either build the object or download a copy of the DLL someone else has built.
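Once the sample DLL is built and registered (regsvr32 wavdest.dll), you can also skip the friendly-name lookup and create the filter by its CLSID. A hedged C++ sketch; the GUID below is a placeholder, so copy the real DEFINE_GUID line from wavdest.cpp in the SDK sample:

#include <initguid.h>
#include <dshow.h>

// Placeholder: replace with the DEFINE_GUID(CLSID_WavDest, ...) from the
// SDK sample's wavdest.cpp.
DEFINE_GUID(CLSID_WavDest,
    0x00000000, 0x0000, 0x0000,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00);

HRESULT AddWavDest(IGraphBuilder *pGraph, IBaseFilter **ppWavDest)
{
    // Create the registered filter directly and drop it into the graph.
    HRESULT hr = CoCreateInstance(CLSID_WavDest, NULL, CLSCTX_INPROC_SERVER,
                                  IID_IBaseFilter, (void **)ppWavDest);
    if (SUCCEEDED(hr))
        hr = pGraph->AddFilter(*ppWavDest, L"WAV Dest");
    return hr;
}

If the DLL is not registered, CoCreateInstance fails here, and the moniker lookup returns NIL for the same reason, which matches what you are seeing.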

iOS Audio Units: When is usage of AUGraphs necessary?

I'm totally new to iOS programming (I'm more an Android guy..) and have to build an application dealing with audio DSP. (I know it's not the easiest way to approach iOS dev ;) )
The app needs to be able to accept input both from:
1- built-in microphone
2- iPod library
Then filters may be applied to the input sound, and the result is to be output to:
1- Speaker
2- Record to a file
My question is the following: is an AUGraph necessary in order to, for example, apply multiple filters to the input, or can these different effects be applied by processing the samples with different render callbacks?
If I go with an AUGraph, do I need: 1 Audio Unit for each input, 1 Audio Unit for the output, and 1 Audio Unit for each effect/filter?
And finally, if I don't, can I have just 1 Audio Unit and reconfigure it to select the source/destination?
Many thanks for your answers! I'm getting lost with this stuff...
You may indeed use render callbacks if you wish, but the built-in Audio Units are great (and there are things coming that I can't discuss here yet under NDA; I've said too much. If you have access to the iOS 5 SDK, I recommend you have a look).
You can implement the behavior you want without using an AUGraph; however, using one is recommended, as it takes care of a lot of things under the hood and saves you time and effort.
Using AUGraph
From the Audio Unit Hosting Guide (iOS Developer Library):
The AUGraph type adds thread safety to the audio unit story: It enables you to reconfigure a processing chain on the fly. For example, you could safely insert an equalizer, or even swap in a different render callback function for a mixer input, while audio is playing. In fact, the AUGraph type provides the only API in iOS for performing this sort of dynamic reconfiguration in an audio app.
Choosing A Design Pattern (iOS Developer Library) goes into some detail on how you would choose to implement your Audio Unit environment, from setting up the audio session and graph to configuring/adding units and writing callbacks.
As for which Audio Units you would want in the graph, in addition to what you already stated, you will want to have a MultiChannel Mixer Unit (see Using Specific Audio Units (iOS Developer Library)) to mix your two audio inputs and then hook up the mixer to the Output unit.
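A minimal sketch of that graph (mixer feeding the remote I/O unit), with error checking omitted and the input/callback side left out:

#include <AudioToolbox/AudioToolbox.h>

void BuildGraph(AUGraph *outGraph)
{
    NewAUGraph(outGraph);

    // Describe the two nodes: a multichannel mixer and the remote I/O unit.
    AudioComponentDescription mixerDesc = {};
    mixerDesc.componentType         = kAudioUnitType_Mixer;
    mixerDesc.componentSubType      = kAudioUnitSubType_MultiChannelMixer;
    mixerDesc.componentManufacturer = kAudioUnitManufacturer_Apple;

    AudioComponentDescription ioDesc = {};
    ioDesc.componentType         = kAudioUnitType_Output;
    ioDesc.componentSubType      = kAudioUnitSubType_RemoteIO;   // iOS output
    ioDesc.componentManufacturer = kAudioUnitManufacturer_Apple;

    AUNode mixerNode, ioNode;
    AUGraphAddNode(*outGraph, &mixerDesc, &mixerNode);
    AUGraphAddNode(*outGraph, &ioDesc, &ioNode);
    AUGraphOpen(*outGraph);

    // Mixer output bus 0 -> I/O unit input element 0 (the speaker side).
    AUGraphConnectNodeInput(*outGraph, mixerNode, 0, ioNode, 0);

    AUGraphInitialize(*outGraph);
    AUGraphStart(*outGraph);
}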
Direct Connection
Alternatively, if you want to wire things up directly without using AUGraph, the following sample shows how to hook Audio Units together yourself (from Constructing Audio Unit Apps (iOS Developer Library)).
You can, alternatively, establish and break connections between audio units directly by using the audio unit property mechanism. To do so, use the AudioUnitSetProperty function along with the kAudioUnitProperty_MakeConnection property, as shown in Listing 2-6. This approach requires that you define an AudioUnitConnection structure for each connection to serve as its property value.
/* Listing 2-6 */
AudioUnitElement mixerUnitOutputBus  = 0;
AudioUnitElement ioUnitOutputElement = 0;

AudioUnitConnection mixerOutToIoUnitIn;
mixerOutToIoUnitIn.sourceAudioUnit    = mixerUnitInstance;
mixerOutToIoUnitIn.sourceOutputNumber = mixerUnitOutputBus;
mixerOutToIoUnitIn.destInputNumber    = ioUnitOutputElement;

AudioUnitSetProperty (
    ioUnitInstance,                     // connection destination
    kAudioUnitProperty_MakeConnection,  // property key
    kAudioUnitScope_Input,              // destination scope
    ioUnitOutputElement,                // destination element
    &mixerOutToIoUnitIn,                // connection definition
    sizeof (mixerOutToIoUnitIn)
);
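And if you go the render-callback route instead, the callback itself might look like this sketch: pull the samples from an upstream unit, then process them in place (the gain here is a stand-in for whatever DSP you need, and gSourceUnit is an assumed upstream unit such as the mic input):

#include <AudioToolbox/AudioToolbox.h>

static AudioUnit gSourceUnit;   // assumed upstream unit (mic or file player)

static OSStatus FilterCallback(void                       *inRefCon,
                               AudioUnitRenderActionFlags *ioActionFlags,
                               const AudioTimeStamp       *inTimeStamp,
                               UInt32                      inBusNumber,
                               UInt32                      inNumberFrames,
                               AudioBufferList            *ioData)
{
    // Pull the input samples from the upstream unit into ioData...
    OSStatus err = AudioUnitRender(gSourceUnit, ioActionFlags, inTimeStamp,
                                   inBusNumber, inNumberFrames, ioData);
    if (err != noErr) return err;

    // ...then apply the effect in place (simple gain as a DSP stand-in).
    Float32 *samples = (Float32 *)ioData->mBuffers[0].mData;
    for (UInt32 i = 0; i < inNumberFrames; ++i)
        samples[i] *= 0.5f;
    return noErr;
}

// Attach it with kAudioUnitProperty_SetRenderCallback:
// AURenderCallbackStruct cb = { FilterCallback, NULL };
// AudioUnitSetProperty(ioUnitInstance, kAudioUnitProperty_SetRenderCallback,
//                      kAudioUnitScope_Input, 0, &cb, sizeof(cb));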
