I am using OpenCV and OpenVINO. When a face is detected, I draw it with cv2.rectangle and send its coordinates to drive servo and stepper motors. The problem is that when multiple people are in frame, multiple sets of coordinates get sent at once, which makes the motors go crazy. I want to send coordinates only for the first person bounded by a box. Any help would be appreciated. Thank you.
Generally, code runs line by line. You'll need to create a proper function for each scenario so the data can be handled and processed correctly. In short, you'll need to implement error handling and data handling (probably more than that, depending on your software/hardware design). If you are trying to run multiple threads of execution at the same time, it is better to use multithreading.
Besides, you are using two types of motors. Simply consuming all incoming data is inefficient and prone to dropped data. You'll need to be clear about what the servo motor and stepper motor tasks are, how they relate to the coordinates, who triggers what, what to do (task X) if something fails or a step in the sequence is missed, etc.
For example, the sequence for Data A should produce Result A, but it is halted halfway because Data B entered the buffer, interfering with Result A and at the same time corrupting Result B, which was expected to follow. (This is what is happening in your program.)
It's good to review and design your whole process by creating a coding flowchart (a diagram that represents the algorithm). It will give you a clear idea of what should happen at each step, so you can then design a proper handler for each situation.
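As a very rough illustration of that separation in Python (a sketch with made-up names; your actual detection and motor-driver calls go where indicated), a single consumer thread that owns the motors avoids exactly the Data A / Data B interference described above:

import queue

commands = queue.Queue(maxsize=1)  # holds only the most recent target

def vision_loop(get_face_center):
    # Producer: publish the newest target, discarding stale ones.
    while True:
        center = get_face_center()     # your detection code supplies this
        if center is None:
            continue                   # no face this frame
        try:
            commands.get_nowait()      # drop an unconsumed older target
        except queue.Empty:
            pass
        commands.put(center)

def motor_loop(move_servo, move_stepper):
    # Consumer: the only code that ever touches the motors.
    while True:
        x, y = commands.get()          # blocks until a fresh target arrives
        move_servo(x)                  # e.g. pan toward the face
        move_stepper(y)                # e.g. tilt toward the face

Run each loop in its own threading.Thread; because only motor_loop ever commands the motors, they can never receive two conflicting coordinate sets at once.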
Can you share more insight into your (pseudo-)code, please?
It sounds easy: you trigger a face-detection inference request and you get back a list/vector with all detected faces (a region of interest for each detected face), including false positives, so some consistency checks are needed to filter those out.
If you are interested in the first detected face only, then it could be as simple as processing just the first returned result from that list/vector.
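For example (a minimal Python sketch; it assumes your inference request already gives you a list of (xmin, ymin, xmax, ymax, confidence) boxes in pixel coordinates, and send_to_motors is a hypothetical stand-in for your own servo/stepper code):

import cv2

CONF_THRESHOLD = 0.5  # consistency check: drop low-confidence boxes

def handle_single_face(frame, detections, send_to_motors):
    faces = [d for d in detections if d[4] >= CONF_THRESHOLD]
    if not faces:
        return  # no face this frame; motors hold their position
    # Pick ONE face. The largest box is usually more stable than "first
    # in the list", because detector output order can change per frame.
    xmin, ymin, xmax, ymax, _ = max(
        faces, key=lambda d: (d[2] - d[0]) * (d[3] - d[1])
    )
    xmin, ymin, xmax, ymax = map(int, (xmin, ymin, xmax, ymax))
    cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
    # Exactly one coordinate pair goes to the motors: the box centre.
    send_to_motors((xmin + xmax) // 2, (ymin + ymax) // 2)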
However, you will see that the order of results can change: when two faces A and B were detected, the next run could still return both faces, but with B first and then A.
You could add object-tracking on top of face-detection to make sure you always process the same face.
(But even that could fail sometimes)
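A very naive form of such tracking (a sketch, not tied to any particular library: keep the centre of the face you chose last frame and pick the new detection closest to it):

import math

class SingleFaceTracker:
    # Stick with the detection closest to the previously chosen face.
    # Not robust to faces crossing paths or to occlusion.
    def __init__(self, max_jump=150):
        self.last_center = None
        self.max_jump = max_jump  # pixels; beyond this, assume the face was lost

    def update(self, boxes):
        # boxes: list of (xmin, ymin, xmax, ymax); returns the chosen box or None
        if not boxes:
            self.last_center = None
            return None
        centers = [((x0 + x1) / 2, (y0 + y1) / 2) for x0, y0, x1, y1 in boxes]
        if self.last_center is None:
            idx = 0  # no history yet: just take the first face
        else:
            dists = [math.dist(c, self.last_center) for c in centers]
            idx = dists.index(min(dists))
            if dists[idx] > self.max_jump:
                idx = 0  # target lost; re-acquire
        self.last_center = centers[idx]
        return boxes[idx]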
What I'm doing is GPGPU on WebGL, and I don't know whether the access pattern I'll be describing applies to general graphics and gaming programs as well. In our code we frequently come across data that needs to be summarized or reduced per output texel. A very simple example is matrix multiplication, where for every output texel you return a value that is the dot product of a row of one input and a column of the other input.
This has been the sore point of our performance, not so much because of the computation but because of the repeated data access. So I've been trying to find a pattern of reads, or a data layout, that would speed this operation up, and I've been completely unsuccessful.
I will describe some assumptions and some schemes below. The sample code for all of these is at https://github.com/jeffsaremi/webgl-experiments
Unfortunately, due to size I wasn't able to use the 'snippet' feature of Stack Overflow. NOTE: all examples write to the console, not the HTML page.
Base matmul implementation. Example: [2,3]x[3,4]->[2,4]. In its simplest form this produces two textures, one (w:3, h:2) and one (w:4, h:3). For each output texel I read along the X axis of the left texture but along the Y axis of the right texture. (see webgl-matmul.html)
Assuming that the GPU accesses data similarly to a CPU, that is, block by block, reading along the width of the texture should hit the cache fairly often.
For this, I lay out both textures so that I am only doing dot products of corresponding rows (along the texture width). Example: [2,3]x[4,3]->[2,4]. Note that the data for the right texture is now transposed, so that for each output texel I do a dot product of one row from the left and one row from the right. (see webgl-matmul-shared-alongX.html)
To check that the above assumption actually holds, I also created a negative test. In this test I read along the Y axis of both the left and the right texture, which should give the worst possible performance. The data is pre-transposed so that the results still make sense. Example: [3,2]x[3,4]->[2,4]. (see webgl-matmul-shared-alongY.html)
So I ran these (and I hope you can run them as well, to compare) and I found no evidence for or against the existence of such caching behavior. You need to run each example a few times to get consistent results for comparison.
Then I came across this paper http://fileadmin.cs.lth.se/cs/Personal/Michael_Doggett/pubs/doggett12-tc.pdf which, in short, claims that the GPU caches data in blocks (or tiles, as I call them).
Based on this promising lead I created a version of matmul (or dot product) that uses 2x2 blocks for its calculation. Of course, I first had to rearrange my inputs into that layout. The cost of the rearrangement is not included in my comparison; let's say I could do it once and run my matmul many times afterwards. Even this scheme contributed nothing to performance, if it didn't actively take something away. (see webgl-dotprod-tiled.html)
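For reference, here is roughly what that 2x2-tile rearrangement looks like on the CPU side (a NumPy sketch of the layout transform only; the matching shader-side index math is the other half of the work, and all names here are mine):

import numpy as np

def to_tiles_2x2(mat):
    # Rearrange a (2m x 2n) matrix so each 2x2 block is contiguous in memory,
    # tiles in row-major order, each tile flattened as [a00, a01, a10, a11].
    m, n = mat.shape
    assert m % 2 == 0 and n % 2 == 0
    return mat.reshape(m // 2, 2, n // 2, 2).swapaxes(1, 2).reshape(-1)

A = np.arange(1, 13).reshape(2, 6)
print(to_tiles_2x2(A))  # tiles: [1 2 7 8], [3 4 9 10], [5 6 11 12]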
At this point I am completely out of ideas, and any hints would be appreciated.
thanks
I have a channel where I want to stop animations from happening when running on a slower device like the Roku Express, but keep them on a faster device like the Roku Premiere. I'm just not sure of the best way to go about it.
I wanted to filter by the amount of available RAM, but I couldn't find an API that reports available system RAM that I could call from my code.
I could filter by model name, but then I would need to keep an updated list of model names, which I'd prefer not to do.
Any help/insight appreciated.
Re graphics capabilities, try roDeviceInfo.getGraphicsPlatform() - if it returns "opengl", that's the high-performing engine that can do arbitrary rotations, vs "directfb", which is more limited.
Re CPU, you can run a mini benchmark at the start of your program, something like:
ti = createObject("roTimeSpan")   ' roTimeSpan starts counting the moment it is created
s = ""
for i = 1 to 1000
    s = s + right((i^3).toStr(), 2)   ' busywork: arithmetic plus string building
end for
time = ti.totalMilliSeconds()   ' elapsed ms; the larger the number, the slower the device
Have you considered using Animation.optional=true?
It won't stop them from happening on Roku Express (since it is a Littlefield) but it will "skip animations on lower performing Roku devices (Paolo, Giga, Jackson, Tyler, and Sugarland)".
Animation also contains an undocumented field called "willBeSkipped" which will be true on slower devices when "optional" is set to true.
I had a similar problem with animations. Unfortunately, you must filter by model name; I didn't find another way.
You can store the list of devices in a database, so it would be easier for you to maintain.
You can set the optional field on the Animation node to true, which is supposed to take care of that. However, I have set this field to true before and it does not seem to have an effect. I'm sure they'll get around to fixing it eventually.
The efficiency of the animations also depends on how many Animation nodes you have. You should only need one Animation node to handle all of the animations for a particular component; add an interpolator for each individual type of animation you want to occur (e.g. scaling, rotating, color-shifting, translating).
I've set up a bunch of sliders to manipulate the values of various GPUImageFilter instances fed by a GPUImagePicture. My current chain order looks like this:
self.gpuImagePicture = [[GPUImagePicture alloc] initWithImage:self.image];
[self.gpuImagePicture addTarget:self.toneCurveFilter];
[self.toneCurveFilter addTarget:self.exposureFilter];
[self.exposureFilter addTarget:self.constrastFilter];
[self.constrastFilter addTarget:self.saturationFilter];
[self.saturationFilter addTarget:self.highlightShadowFilter];
[self.highlightShadowFilter addTarget:self.whiteBalanceFilter];
[self.whiteBalanceFilter addTarget:self.gpuImageView];
[self.whiteBalanceFilter setInputRotation:[self gpuImageRotationModeForImage:self.image] atIndex:0];
[self.gpuImagePicture processImage];
When I remove the tone curve filter, everything runs smoothly. If I use the tone curve filter alone, I have no issues either. But with the full chain above, processing slows down tremendously.
Does the order of the filter chain matter when it comes to memory management and processing, or did adding the tone curve filter to the rest of the chain just push this setup over the edge?
EDIT:
I've realized it might be worth mentioning how the sliders change the filter values. If the exposure slider is moved, for example, it triggers this code:
[self.exposureFilter setExposure:sender.value];
[self.gpuImagePicture processImage];
Sometimes filter order doesn't matter, but it usually does. Some color-adjustment operations can be rearranged without changing the output, but most filter calculations will produce slightly to significantly different results if you reorder them. Think of it like arithmetic where you change the order of operations or move some parentheses around.
Now, when it comes to performance or memory usage, order usually doesn't matter. Branching operations are the only case where this comes into play (a filter that targets multiple outputs, which at some point are blended or combined back into one output). You don't have that here, though.
You do have a number of steps in the above, and there is overhead in every filter you chain. However, I'm surprised you even see a difference in performance, because the bottleneck in the above should be the creation of the GPUImagePicture. Instantiating one of those is far slower than any filter you'd perform on it, due to the pass through Core Graphics needed to re-render and upload the picture as a texture.
If you are reusing your toneCurveFilter or the others, make sure they are fully disconnected from all of their previous targets (-removeAllTargets does this) before using -addTarget: again. It's possible that you're switching out pictures while leaving all the filters connected to their previous targets, meaning that each new picture keeps adding targets. That will lead to tremendous slowdown.
I'd bet something like the above is what's slowing you down, but when in doubt, fire up Time Profiler and/or the OpenGL profiler and see where you're really spending all your time.
I'm pretty much a noob when it comes to this kind of thing, so if you guys could either help me or point me to a place to learn what I need to know, I would greatly appreciate it.
Basically, my problem is that I am using the libpruio library to continuously sample analog values from the board. Two things are going wrong here.
The first is that whenever the BB is sampling, the voltage on the wire hooked up to the AIN pin rises. I've observed this by hooking an oscilloscope to the same wire the pin is sampling: whenever the BB starts sampling, the entire signal (just a sound wave from an amplified mic) is shifted up by 0.8-0.9 volts. This is also reflected in the values I get from the BB, which are around 30,000 (when they should be 0). Hooking the pin up to ground gets me 0, which is correct, and hooking it up to 1.8 volts gets me something like 65520, which is also correct. So maybe it has something to do with the signal being weak?
The second issue is that even though I am receiving values at a rate of 500-900 kHz, the actual sample rate seems to be around 11 kHz. What I mean is that I only get a new value every 88 µs, and in between, the values I read stay the same until the next 88 µs passes and a new value arrives. These periods correspond to the voltage shift I mentioned in the previous paragraph, so what I actually see on the oscilloscope whenever I sample with the BB is a sawtooth wave at that same 11 kHz.
In conclusion: whenever the BB samples, it first raises the voltage at the pin by 0.9 volts, takes a sample of that raised voltage, and the voltage decays over the next 88 µs, all the while the BB spits back the sample it took at the beginning of the period. I do not want this. I want it to leave the voltage essentially unaffected, and to take a new sample every time the code runs.
As for the code I'm using, it's basically a slightly modified version of the IO_Input example from the libpruio library, with the values stored in an array for later use instead of being printed immediately.
If you guys need any more information, I will gladly post it here, but for now I'm wondering if it is something super obvious that I'm missing.
"Hooking the pin up to ground gets me 0, which is correct, and hooking it up to 1.8 volts gets me something like 65520, which is also correct. So maybe it has something to do with the signal being weak?"
The BBB and libpruio seem to work OK. Check your wiring.
Regarding the sampling rate: the io_input example uses IO mode. If you need accurate timing for the samples, use MM mode or RB mode.
Your goal isn't very clear, so I cannot give detailed advice. (Some code would also help in understanding what you're trying to do.)
BR
I have a stream of numbers (integers, for the sake of discussion) being sampled off an analog input (an A/D converter attached to a potentiometer). I am curious how I would recognize a pattern in the numbers in real time.
That is to say, if someone quickly twiddles the pot all the way up and back down, how do I recognize that, versus turning it only half way? And what if they turn it up and down three times in a row? How can I convert these actions into distinct "events"? This seems especially tricky to me since the time window over which each of these events occurs will be somewhat variable.
I can think of a few quick, hacky ways to do this, but nothing I'm confident in. I am also curious how one would expand this out to multiple different inputs (e.g. input from a spectrograph). Does that change things dramatically? I'm not even sure what topic area I should be googling.
If you know what you are looking for, correlate the input signal against a replica of what you expect: basically, implement a matched filter. If you want to see when the input stream looks like -127, -63, 0, 63, 127, implement a direct-form FIR filter with those values as the coefficients, then look for a maximum on the output. The output of a filter with those coefficients is maximized when the data in the filter is -127, -63, 0, 63, 127.
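A minimal sketch of that idea in NumPy (names and numbers are made up; stream stands in for your sampled input):

import numpy as np

TEMPLATE = np.array([-127.0, -63.0, 0.0, 63.0, 127.0])  # the shape we expect

def matched_filter(signal, template):
    # Correlate the stream against the template; peaks mark likely matches.
    t = template / np.linalg.norm(template)  # normalize for a scale-free threshold
    return np.correlate(signal, t, mode="valid")

# Toy stream: noise with one embedded ramp event at sample 100
rng = np.random.default_rng(0)
stream = rng.normal(0.0, 5.0, 200)
stream[100:105] += TEMPLATE

out = matched_filter(stream, TEMPLATE)
print("event near sample", np.argmax(out))  # prints a value near 100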
Google "Matched Filter Detection" or or "detection theory" maybe even "Feature detection"
If you don't know exactly what you are looking for, or if what you are looking for is variable, it gets more complicated. You would then try to implement a filter whose output gives you information about what is going on. The example above would show a spike on the output when that input sequence occurred. If you then saw that spike occurring at regular intervals, you would guess that the input event was happening with regular frequency.
If you made your filter 0, 63, 127, 63, 0, which correlates with turning the knob all the way up and then back down again, and on your output you saw the aforementioned spike but with a lower maximum amplitude and spread over a wider time, that might tell you the knob was turned all the way up and back down, but either slower or faster than the speed for which the filter was designed to give its maximum response.
To combat this you might run three of these filters in parallel: one designed for a slow knob turn, one for a medium-speed turn, and one for a fast turn. Looking at the three outputs then gives you three different correlations that together paint a better picture of what is occurring.
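Continuing the NumPy sketch above (again with invented names), such a bank might look like this, with the same up-and-down template stretched to three durations:

import numpy as np

def make_template(n):
    # Up-and-back-down ramp of length n (a slow turn = a long template).
    half = np.linspace(0.0, 127.0, n // 2 + 1)
    return np.concatenate([half, half[-2::-1]])

BANK = {"fast": make_template(5), "medium": make_template(11), "slow": make_template(21)}

def classify_turn(stream):
    # Return the speed label whose template correlates best with the stream.
    scores = {}
    for name, tpl in BANK.items():
        t = tpl / np.linalg.norm(tpl)
        scores[name] = np.correlate(stream, t, mode="valid").max()
    return max(scores, key=scores.get)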
Did you consider taking the running difference of the signal (i.e., differentiating it)?
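For what it's worth, that is nearly a one-liner in NumPy (a sketch; the threshold is invented and would need tuning to your A/D range):

import numpy as np

def movement_events(samples, threshold=10):
    # The running difference approximates the derivative; a large
    # magnitude means the knob is moving quickly at that sample.
    velocity = np.diff(samples)
    return np.nonzero(np.abs(velocity) > threshold)[0]  # indices of fast movement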