Distinguishing between multiple cameras using OpenCV - opencv

I'm trying to run a USB camera (webcam) and a Leap Motion 'camera' under OpenCV.
code:
cv::VideoCapture st0 = cv::VideoCapture();
cv::VideoCapture st1 = cv::VideoCapture();
bool isOpen0 = st0.open(0);
bool isOpen1 = st1.open(1);
The problem is that each time the program runs, the webcam and the Leap Motion get different indexes, and I could not find a place where I can get info on the cameras, such as a description string, serial number, manufacturer, or any other parameter that would help me distinguish between the two.
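The VideoCapture API does not readily expose a description string, serial number, or manufacturer, so one crude workaround is to probe each index for properties that happen to differ between the two devices. Below is a minimal Python sketch of that idea, assuming the webcam and the Leap Motion report different frame sizes or frame rates; the index range and the choice of properties are illustrative only.
# Hypothetical workaround: probe each capture index and compare properties
# (frame size, FPS) that may differ between the webcam and the Leap Motion.
import cv2

def probe_cameras(max_index=4):
    """Return a dict mapping index -> (width, height, fps) for each camera that opens."""
    found = {}
    for i in range(max_index):
        cap = cv2.VideoCapture(i)
        if cap.isOpened():
            width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
            height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
            fps = cap.get(cv2.CAP_PROP_FPS)
            found[i] = (width, height, fps)
        cap.release()
    return found

for index, (w, h, fps) in probe_cameras().items():
    print(f"camera {index}: {w:.0f}x{h:.0f} @ {fps:.1f} fps")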

Related

Raspberry Pi Pico saves data in multiple txt files when connected to an external power source

I have a temperature sensor connected to a Raspberry Pi Pico and a main.py which records the temperature data into a .txt file. When I connect the Pico to a PC via the USB port, it starts recording and saves the data in one .txt file. However, when I connect it to an external power supply (an AC/DC adapter with a 5 V output), the data is saved in multiple .txt files of different sizes (named logfile_0001.txt, logfile_0002.txt, ... as expected).
main.py starts running whenever the Pico is connected to the external power supply, and I can confirm that by the LED, which I programmed to blink when it starts collecting data, after writing the first three header lines.
The weird thing is, I never observe the LED blinking more than the initial blink right when it is connected to the external power, even though it should blink whenever it writes the three header lines. Each separate .txt file does contain those three lines as headers. I am very confused as to how a file can have the three lines without the LED blinking, too.
Here is the code, after setting up the I2C buses with the machine module in MicroPython:
#naming the file
num = 0
for file in os.ilistdir():
    file_name = file[0]
    if not file_name.startswith('logfile'):
        continue
    temp = int(file_name[8:13])
    if temp > num:
        num = temp
num += 1
file_name = f'logfile_{num:05d}.txt'
with open(file_name, 'w') as f:
    f.write('Humidity and temperature data taken using T9602 Humidity & Temperature Sensor, on Raspberry Pi Pico\n')
    f.write(str(time.localtime(start_time)) + '\n')
    f.write('unixTime Humidity1(%) Temperature1(C) Humidity2(%) Temperature2(C)\n')
#LED blinks when connected
if True:
    led_onboard.value(1)
    for i in range(3):
        led_onboard.value(0)
        utime.sleep(0.3)
        led_onboard.value(1)
        utime.sleep(0.3)
    led_onboard.value(0)
while True:
    #turn on LED when recording data
    led_onboard.value(1)
    #communicate with two i2c devices (sensors)
    i2c.writeto(address[0], b'1')
    i2c2.writeto(address2[0], b'1')
    #calculating temperatures based on the data read
    RH1, TH1 = data_calc(i2c.readfrom(address[0], 4))
    RH2, TH2 = data_calc(i2c2.readfrom(address2[0], 4))  # second sensor, read the same way
    with open(file_name, 'a') as f:
        f.write(f'{time.time()} {RH1} {TH1} {RH2} {TH2}\n')
    print(f'{time.localtime(time.time())[:-2]} {RH1} {TH1} {RH2} {TH2}')
    time.sleep(1)
If anyone has any insights, I would greatly appreciate it! Thank you so much.
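A new logfile is only named and created when main.py starts from the top, so each extra file suggests the board is restarting on external power (for example from a brief power dip). As a purely diagnostic sketch, not part of the question's code, one could log why the board booted each time main.py starts, assuming MicroPython's machine.reset_cause() is available on the Pico:
# Hypothetical diagnostic: append the reset cause to a separate file at boot.
# Repeated entries in boot_log.txt would indicate the Pico is resetting.
import machine

with open('boot_log.txt', 'a') as log:
    log.write('reset_cause={}\n'.format(machine.reset_cause()))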

polyline.encode strange format

I am working with Google's polyline format. I would like to give a set of coordinates and generate the correct polyline, and vice versa. In particular, in the end I would like to URL-encode the result (the polyline).
When I insert a polyline like:
code = '%28%28akntGkozv%40kcCka%40us%40y%7BDfvAm%7BBnuCj_Aus%40fzG%29%29'
I use the polyline package: https://pypi.org/project/polyline/, and first I decode the polyline in order to see the coordinates:
coordinates = polyline.decode(code)
print(coordinates)
>> [(3e-05, -0.0001), (-0.0001, -7e-05), (-0.0002, -0.0002), (45.46221, 35.36626), (45.4621, 35.36617), (45.48328, 35.39727), (45.48317, 35.39718), (45.5172, 35.39707), (45.51711, 35.39816), (45.51723, 35.39814), (45.5172, 35.38418), (45.51823, 35.3843), (45.51821, 35.38428), (45.49413, 35.37398), (45.52816, 35.37387), (45.52807, 35.32855), (45.5281, 35.32845), (45.52823, 35.32848), (45.52813, 35.32861)]
Everything here seems fine. The problem comes when I try to encode the coordinates back into the polyline (which is my ultimate goal, since in the end I would like to give some coordinates and obtain the corresponding polyline):
new_code = polyline.encode(coordinates)
print(new_code)
>> ERXERXakntGkozvETPkcCkaETPusETPyEWBDfvAmEWBBnuCj_AusETPfzGERYERY
This is slightly different from the original, and if I put it back in the URL it doesn't work!
So my questions here are:
What kind of encoding is new_code? I have tried to percent-encode it with urllib.parse.quote(new_code), but the result is exactly the same; maybe I need to specify some particular encoding style, but I didn't find anything.
The polyline that I used is a square inside the city of Milan (so only 4, at most 5, points are required to identify this area), but polyline.decode gives me back a list of 19 points with coordinates that are not even close to the city of Milan. Why?
OK, so basically all of my problems came from the fact that the string I was considering, %28%28akntGkozv%40kcCka%40us%40y%7BDfvAm%7BBnuCj_Aus%40fzG%29%29,
contains %28%28 and %29%29, which are not part of the polyline but are simply the two (( and )) inserted by the particular URL of the site I was using. A simple replace and a URL-decode return the correct polyline:
code = '%28%28akntGkozv%40kcCka%40us%40y%7BDfvAm%7BBnuCj_Aus%40fzG%29%29'
code = code.replace('%28', '').replace('%29', '')
code = urllib.parse.unquote(code)
print(code)
>> irotG_hzv@woBmE}i@yjE`oBwkDf|ChRhMzeG}~BxcB
Which in fact, if passed to polyline.decode, returns exactly the coordinates that I used:
coordinates = polyline.decode(code)
print(coordinates)
>> [(45.46869, 9.15088), (45.48673, 9.15191), (45.4936, 9.18452), (45.47567, 9.21216), (45.45051, 9.20907), (45.44822, 9.16701), (45.46869, 9.15088)]
These are exactly 7 points (I have now changed the shape, so it's a hexagon instead of a square), and they lie exactly inside the city of Milan.
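Putting the pieces together, the full round trip from the URL parameter back to a URL-safe polyline looks roughly like the sketch below. This is a minimal illustration assuming the same polyline package and Python's standard urllib; the input string is the one from the question, and the equality check may be off due to rounding at the default precision.
# Sketch of the round trip: strip the literal (( and )), URL-decode,
# then decode/encode with the polyline package and percent-encode for a URL.
import urllib.parse
import polyline

raw = '%28%28akntGkozv%40kcCka%40us%40y%7BDfvAm%7BBnuCj_Aus%40fzG%29%29'
cleaned = raw.replace('%28', '').replace('%29', '')   # drop the (( and )) wrappers
poly = urllib.parse.unquote(cleaned)                  # undo the percent-encoding

coords = polyline.decode(poly)              # list of (lat, lon) tuples
round_trip = polyline.encode(coords)        # re-encode the coordinates
url_param = urllib.parse.quote(round_trip)  # percent-encoded again for use in a URL

print(coords)
print(round_trip == poly)   # may differ slightly due to coordinate rounding
print(url_param)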

AVAudioEngine reconcile/sync input/output timestamps on macOS/iOS

I'm attempting to sync recorded audio (from an AVAudioEngine inputNode) to an audio file that was playing during the recording process. The result should be like multitrack recording where each subsequent new track is synced with the previous tracks that were playing at the time of recording.
Because sampleTime differs between the AVAudioEngine's output and input nodes, I use hostTime to determine the offset of the original audio and the input buffers.
On iOS, I would assume that I'd have to use AVAudioSession's various latency properties (inputLatency, outputLatency, ioBufferDuration) to reconcile the tracks as well as the host time offset, but I haven't figured out the magic combination to make them work. The same goes for the various AVAudioEngine and Node properties like latency and presentationLatency.
On macOS, AVAudioSession doesn't exist (outside of Catalyst), meaning I don't have access to those numbers. Meanwhile, the latency/presentationLatency properties on the AVAudioNodes report 0.0 in most circumstances. On macOS, I do have access to AudioObjectGetPropertyData and can ask the system about kAudioDevicePropertyLatency, kAudioDevicePropertyBufferSize, kAudioDevicePropertySafetyOffset, etc., but am again at a bit of a loss as to what the formula is to reconcile all of these.
I have a sample project at https://github.com/jnpdx/AudioEngineLoopbackLatencyTest that runs a simple loopback test (on macOS, iOS, or Mac Catalyst) and shows the result. On my Mac, the offset between tracks is ~720 samples. On others' Macs, I've seen as much as 1500 samples offset.
On my iPhone, I can get it close to sample-perfect by using AVAudioSession's outputLatency + inputLatency. However, the same formula leaves things misaligned on my iPad.
What's the magic formula for syncing the input and output timestamps on each platform? I know it may be different on each, which is fine, and I know I won't get 100% accuracy, but I would like to get as close as possible before going through my own calibration process.
Here's a sample of my current code (full sync logic can be found at https://github.com/jnpdx/AudioEngineLoopbackLatencyTest/blob/main/AudioEngineLoopbackLatencyTest/AudioManager.swift):
//Schedule playback of original audio during initial playback
let delay = 0.33 * state.secondsToTicks
let audioTime = AVAudioTime(hostTime: mach_absolute_time() + UInt64(delay))
state.audioBuffersScheduledAtHost = audioTime.hostTime
...
//in the inputNode's inputTap, store the first timestamp
audioEngine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (pcmBuffer, timestamp) in
    if self.state.inputNodeTapBeganAtHost == 0 {
        self.state.inputNodeTapBeganAtHost = timestamp.hostTime
    }
}
...
//after playback, attempt to reconcile/sync the timestamps recorded above
let timestampToSyncTo = state.audioBuffersScheduledAtHost
let inputNodeHostTimeDiff = Int64(state.inputNodeTapBeganAtHost) - Int64(timestampToSyncTo)
let inputNodeDiffInSamples = Double(inputNodeHostTimeDiff) / state.secondsToTicks * inputFileBuffer.format.sampleRate //secondsToTicks is calculated using mach_timebase_info
//play the original metronome audio at sample position 0 and try to sync everything else up to it
let originalAudioTime = AVAudioTime(sampleTime: 0, atRate: renderingEngine.mainMixerNode.outputFormat(forBus: 0).sampleRate)
originalAudioPlayerNode.scheduleBuffer(metronomeFileBuffer, at: originalAudioTime, options: []) {
    print("Played original audio")
}
//play the tap of the input node at its determined sync time -- this _does not_ appear to line up in the result file
let inputAudioTime = AVAudioTime(sampleTime: AVAudioFramePosition(inputNodeDiffInSamples), atRate: renderingEngine.mainMixerNode.outputFormat(forBus: 0).sampleRate)
recordedInputNodePlayer.scheduleBuffer(inputFileBuffer, at: inputAudioTime, options: []) {
    print("Input buffer played")
}
When running the sample app, here's the result I get (result screenshot not reproduced here; see the linked project).
This answer is applicable to native macOS only.
General Latency Determination
Output
In the general case the output latency for a stream on a device is determined by the sum of the following properties:
kAudioDevicePropertySafetyOffset
kAudioStreamPropertyLatency
kAudioDevicePropertyLatency
kAudioDevicePropertyBufferFrameSize
The device safety offset, stream, and device latency values should be retrieved for kAudioObjectPropertyScopeOutput.
On my Mac for the audio device MacBook Pro Speakers at 44.1 kHz this equates to 71 + 424 + 11 + 512 = 1018 frames.
Input
Similarly, the input latency is determined by the sum of the following properties:
kAudioDevicePropertySafetyOffset
kAudioStreamPropertyLatency
kAudioDevicePropertyLatency
kAudioDevicePropertyBufferFrameSize
The device safety offset, stream, and device latency values should be retrieved for kAudioObjectPropertyScopeInput.
On my Mac for the audio device MacBook Pro Microphone at 44.1 kHz this equates to 114 + 2404 + 40 + 512 = 3070 frames.
AVAudioEngine
How the information above relates to AVAudioEngine is not immediately clear. Internally AVAudioEngine creates a private aggregate device and Core Audio essentially handles latency compensation for aggregate devices automatically.
During experimentation for this answer I've found that some (most?) audio devices don't report latency correctly. At least that is how it seems, which makes accurate latency determination nigh impossible.
I was able to get fairly accurate synchronization with my Mac's built-in audio using the following adjustments:
// Some non-zero value to get AVAudioEngine running
let startDelay = 0.1
// The original audio file start time
let originalStartingFrame: AVAudioFramePosition = AVAudioFramePosition(playerNode.outputFormat(forBus: 0).sampleRate * startDelay)
// The output tap's first sample is delivered to the device after the buffer is filled once
// A number of zero samples equal to the buffer size is produced initially
let outputStartingFrame: AVAudioFramePosition = Int64(state.outputBufferSizeFrames)
// The first output sample makes its way back into the input tap after accounting for all the latencies
let inputStartingFrame: AVAudioFramePosition = outputStartingFrame - Int64(state.outputLatency + state.outputStreamLatency + state.outputSafetyOffset + state.inputSafetyOffset + state.inputLatency + state.inputStreamLatency)
On my Mac the values reported by the AVAudioEngine aggregate device were:
// Output:
// kAudioDevicePropertySafetyOffset: 144
// kAudioDevicePropertyLatency: 11
// kAudioStreamPropertyLatency: 424
// kAudioDevicePropertyBufferFrameSize: 512
// Input:
// kAudioDevicePropertySafetyOffset: 154
// kAudioDevicePropertyLatency: 0
// kAudioStreamPropertyLatency: 2404
// kAudioDevicePropertyBufferFrameSize: 512
which equated to the following offsets:
originalStartingFrame = 4410
outputStartingFrame = 512
inputStartingFrame = -2625
I may not be able to answer your question, but I believe there is a property not mentioned in your question that does report additional latency information.
I've only worked at the HAL/AUHAL layers (never AVAudioEngine), but in discussions about computing the overall latencies, some audio device/stream properties come up: kAudioDevicePropertyLatency and kAudioStreamPropertyLatency.
Poking around a bit, I see those properties mentioned in the documentation for AVAudioIONode's presentationLatency property (https://developer.apple.com/documentation/avfoundation/avaudioionode/1385631-presentationlatency). I expect that the hardware latency reported by the driver will show up there. (I suspect that the standard latency property reports the latency for an input sample to appear in the output of a "normal" node, and that the IO case is special.)
It's not in the context of AVAudioEngine, but here's one message from the CoreAudio mailing list that talks a bit about using the low level properties that may provide some additional background: https://lists.apple.com/archives/coreaudio-api/2017/Jul/msg00035.html

Simple Babymonitor with Bass.DLL

I am trying to program a simple baby monitor for Windows (personal use).
The baby monitor should just detect the dB level of the microphone and trigger at a certain volume.
After some research, I found the Bass.dll library and came across its function BASS_ChannelGetLevel, which is great but seems to have limitations and doesn't fit my needs (the peak is returned as a DWORD value).
In the examples I found a livespec example which is "almost" what I need. The example uses BASS_ChannelGetData, but I don't quite know how to handle the returned array...
I want to keep it as simple as possible: detect the volume from the microphone as dB or any other value (e.g. a value from 0 to MAXINT).
How can this be done with the Bass.dll library?
BASS_ChannelGetLevel returns a value that is capped at 0 dB (the return value is 32768 in this case). If you adjust your source level (lower the microphone level in the sound card settings), then it will work just fine.
Another way, if you want an uncapped value, is to use the BASS_ChannelGetLevelEx function instead: it returns floating-point levels, where 1 is the maximum (0 dB) value corresponding to BASS_ChannelGetLevel's 32767, but it can exceed 1 to detect sound levels above 0 dB, which may be what you need.
I also suggest that you monitor the sound level for a while: trigger only if a certain level persists for at least 2-3 seconds (this way you will exclude false alarms).
Here is how you obtain the dB level given an input stream handle (streamHandle):
var peak = (double)Bass.BASS_ChannelGetLevel(streamHandle);
var decibels = 20 * Math.Log10(peak / Int32.MaxValue);
Alternatively, you can use the following to get the RMS (average) peak. To get the RMS value, you have to pass a sample length into BASS_ChannelGetLevel. I'm using 20 milliseconds here, but you can play with the value to see what works best for your needs.
var decibels = 0.0;
var channelCount = 2; //Assuming two channels
var sampleLengthMS = 20f;
var rmsLevels = new float[channelCount];
var rmsObtained = Bass.BASS_ChannelGetLevel(streamHandle, rmsLevels, sampleLengthMS / 1000f, BASSLevel.BASS_LEVEL_RMS);
if (rmsObtained)
    decibels = 20 * Math.Log10(rmsLevels[0]); //using the first channel (index 0), but you can get both if needed
else
    Console.WriteLine(Bass.BASS_ErrorGetCode());
Hope this helps.

cvConvertScale error

I am new to Python and OpenCV, so if this is simple I apologize in advance.
I am trying to follow the depth map code at http://altruisticrobot.tistory.com/219.
In the code below ConvertScale raises the following error:
src.size == dst.size && src.channels() == dst.channels()
I have spent a couple of days and can't figure out why.
Any pointers would be greatly appreciated.
Thanks in advance.
im_l = cv.LoadImage('C:\Python27\project\captureL6324.png',cv.CV_LOAD_IMAGE_GRAYSCALE)
im_r = cv.LoadImage('C:\Python27\project\captureR6324.png',cv.CV_LOAD_IMAGE_GRAYSCALE)
imsize = cv.GetSize(im_l)
disp = cv.CreateImage(imsize,cv.IPL_DEPTH_16S,1)# to receive Disparity
#run stereo Correspondence-- returns a single channel 16 bit signed disparity
disparity = cv.FindStereoCorrespondenceBM(im_r,im_l,disp,cv.CreateStereoBMState())
#convert to a real disparity by dividing it by 16-- first create variable to hold the converted scale
real_disparity = cv.CreateImage(imsize,cv.IPL_DEPTH_16S,1)
cv.ConvertScale(disparity, real_disparity, -16,0)
#get the point cloud
depth = cv.CreateImage(cv.GetSize(im_l),cv.IPL_DEPTH_32F,3)
cv.ReprojectIMageTo3D(real_disparity,depth, ReprojectMatrix)
Your problem is that FindStereoCorrespondenceBM saves its result in the third parameter, but for further computation you are using the return value, which is None. This leads to an error, since a matrix of a certain size and type is expected.
So just change
cv.ConvertScale(disparity, real_disparity, -16,0)
to
cv.ConvertScale(disp, real_disparity, -16,0)
To run the whole script I also changed the last line to
ReprojectMatrix = cv.CreateMat(4,4,cv.CV_32FC1);
cv.ReprojectImageTo3D(real_disparity,depth, ReprojectMatrix)
This script runs without error for me. But I have not checked whether it gives the correct result.
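For reference, here is the same idea in the modern cv2 API (the legacy cv module used above has since been removed from OpenCV). This is a minimal sketch, not a drop-in replacement for the original script: the file names come from the question, the block-matching parameters are illustrative, and the identity matrix stands in for a real reprojection matrix Q from cv2.stereoRectify.
# Block-matching disparity with cv2, scaled to real disparity, then reprojected to 3D.
import cv2
import numpy as np

im_l = cv2.imread(r'C:\Python27\project\captureL6324.png', cv2.IMREAD_GRAYSCALE)
im_r = cv2.imread(r'C:\Python27\project\captureR6324.png', cv2.IMREAD_GRAYSCALE)

# StereoBM returns a 16-bit fixed-point disparity map (actual disparity * 16)
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=21)
disp16 = stereo.compute(im_l, im_r)

# convert to a real disparity by dividing by 16
real_disparity = disp16.astype(np.float32) / 16.0

# placeholder reprojection matrix; use the Q matrix from cv2.stereoRectify in practice
Q = np.eye(4, dtype=np.float32)
depth = cv2.reprojectImageTo3D(real_disparity, Q)
print(depth.shape)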
