What is being stored and where when you use cv2.VideoCapture()?

If I were to turn on my camera with cv2.VideoCapture(0), without any other code to read or save frames, is anything being temporarily stored (e.g. on the heap)? If I do not release the capture and let this process run continuously, will memory use build up until the system runs out of memory?
Context:
I am currently using a Raspberry Pi camera and am thinking about using cv2.VideoCapture() followed by read() to replace PiCamera().capture() for the purpose of speed and efficiency. In other words, I am considering taking a frame from a video stream to use in place of a photo. As I considered the pros and cons, I came across this question. Any comments and/or advice are appreciated.
Note that I am also new to Stack Overflow, so I apologize in advance for anything unconventional in this post.
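
For reference, the pattern described above (open the capture, read one frame in place of a photo, then release) might look roughly like the sketch below. The helper names grab_single_frame and grab_from_camera are made up for illustration; only isOpened(), read() and release() are actual cv2.VideoCapture methods. The release in the finally block returns the device even if the read fails.

```python
def grab_single_frame(capture):
    """Read one frame from an opened capture object, then always release it."""
    try:
        if not capture.isOpened():
            raise RuntimeError("camera could not be opened")
        ok, frame = capture.read()  # read() returns (success flag, image)
        if not ok:
            raise RuntimeError("camera returned no frame")
        return frame
    finally:
        # Hand the device back whether or not the read succeeded.
        capture.release()


def grab_from_camera(index=0):
    """Hypothetical convenience wrapper; needs OpenCV and an attached camera."""
    import cv2  # imported lazily so grab_single_frame has no hard dependency
    return grab_single_frame(cv2.VideoCapture(index))
```

On the Pi this would be called as frame = grab_from_camera(0), assuming the camera is exposed as video device 0.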

Related

Using Vulkan output in electron

I want to use Electron as a debug overlay for a Vulkan render engine I'm building. Since I have a lot of requirements for this debug tool, writing one in-engine myself would take far too long. I would like to use Electron instead of Qt or similar, since I feel it is a lot more powerful and flexible with less effort (once it is working).
The problem is that I somehow have to get either my render output into Electron or Electron's output into my engine. As far as I can tell, the easiest solution would be to copy the data back to the CPU and then transfer it. But that would be extremely slow and cost a lot of bandwidth, so I was wondering if there is a better solution.
I have two ideas to make it work, but I didn't find any way to implement them, or even anyone talking about them.
The first would be to have Electron configured to run on the GPU, somehow get the handle for the output texture, and import it into my render engine using Vulkan external memory. However, as I have no experience with Chromium and there doesn't seem to be anyone else who has done this, I don't think it would work out too well.
The second idea is to do the opposite: use a canvas element with WebGL and, again using Vulkan external memory, copy the output of my engine to a texture and display it. I have full control over the draw process here, so I think it would be a lot simpler and more stable. However, again, I found no way of setting up a WebGL texture handle as an external memory object.
Is there any better way of doing this or some help on how to implement it?

Getting FPS and frame-time info from a GPU

I am a mathematician, not a programmer. I have a notion of the basics of programming and am a fairly advanced power user on both Linux and Windows.
I know some C and some Python, but not much.
I would like to make an overlay that, when I start a game, gets information such as frame time and FPS from AMD and NVIDIA GPUs. I am quite certain that the way current benchmarks compare two GPUs is flawed: brief moments and scenes that bump up the FPS momentarily (but are totally irrelevant to user experience) result in a higher average FPS number and mislead the market, either unintentionally or intentionally. (For example, in one game whose name I can't remember, probably COD, there was a highly tessellated entity on the map that wasn't even visible to the player, which led AMD GPUs to seemingly underperform when roaming through that area, lowering the average FPS count.)
I have an idea of how to calculate GPU performance in theory, but I don't know how to harvest the data from the GPU. Could you refer me to API manuals or references that would help me make such an overlay?
I would like to study as little as possible (by that I mean I would like to learn only what I absolutely have to in order to get the job done; I don't intend to become a coder).
Thank you in advance.

This is generally what the Vulkan layer system is for: it lets you intercept API commands and inject your own. But it is nontrivial to code yourself. Here are some pre-existing open-source options for you:
To get to timing info and draw your custom overlay you can use (and modify) a tool like OCAT. It supports Direct3D 11, Direct3D 12, and Vulkan apps.
To just get the timing (and other interesting info) as CSV you can use a command-line tool like PresentMon. Should work in D3D, and I have been using it with Vulkan apps too and it seems to accept them.
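
Once you have per-frame timings in a CSV, the statistics the question is after (something more robust than average FPS) can be computed with a few lines of Python. The following is only a sketch: the function name frame_time_stats is made up, and the column name "MsBetweenPresents" is an assumption about the CSV header that may differ between tool versions, so check the header of your own file.

```python
import csv
import io
import math
import statistics


def frame_time_stats(csv_text, column="MsBetweenPresents"):
    """Summarise frame times (in milliseconds) read from CSV text.

    `column` is an assumed header name; pass the one your capture tool writes.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    times = sorted(float(row[column]) for row in reader if row.get(column))
    if not times:
        raise ValueError("no frame times found in column %r" % column)
    # Nearest-rank 99th percentile: the frame time that 99% of frames stay under.
    rank = max(0, math.ceil(0.99 * len(times)) - 1)
    return {
        "average_fps": 1000.0 * len(times) / sum(times),
        "median_ms": statistics.median(times),
        "p99_ms": times[rank],
    }
```

The median and the 99th-percentile frame time are much less sensitive to short bursts of very high FPS than the plain average is.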

iOS CGImageRef Pixel Shader

I am working on an image processing app for iOS, and one of the stages of my application is a vector-based image posterization/color detection.
Now, I've written code that can determine the posterized color per pixel, but going through each and every pixel in an image would, I imagine, be quite hard on the processor of an iOS device. As such, I was wondering if it is possible to use the graphics processor instead;
I'd like to create a sort of "pixel shader" which uses OpenGL-ES, or some other rendering technology to process and posterize the image quickly. I have no idea where to start (I've written simple shaders for Unity3D, but never done the underlying programming for them).
Can anyone point me in the correct direction?

I'm going to come at this sideways and suggest you try out Brad Larson's GPUImage framework, which describes itself as "a BSD-licensed iOS library that lets you apply GPU-accelerated filters and other effects to images, live camera video, and movies". I haven't used it and assume you'll need to do some GL reading to add your own filtering but it'll handle so much of the boilerplate stuff and provides so many prepackaged filters that it's definitely worth looking into. It doesn't sound like you're otherwise particularly interested in OpenGL so there's no real reason to look into it.
I will add the sole consideration that under iOS 4 I found it often faster to do work on the CPU (using GCD to distribute it amongst cores) than on the GPU where I needed to be able to read the results back at the end for any sort of serial access. That's because OpenGL is generally designed so that you upload an image and then it converts it into whatever format it wants and if you want to read it back then it converts it back to the one format you expect to receive it in and copies it to where you want it. So what you save on the GPU you pay for because the GL driver has to shunt and rearrange memory. As of iOS 5 Apple have introduced a special mechanism that effectively gives you direct CPU access to OpenGL's texture store so that's probably not a concern any more.

Open Source way to real-time image processing OCR application? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have an application in mind that I want to produce. We have wall-mounted schedule boards that are divided into small rectangles using black lines on a white background. Magnetic name tags are placed into a particular partition to indicate that this person is to work in that cell. This system works very well for communication among people, but I would like an automatic way of saving this schedule information into a database.
I am envisioning a system where a camera is set in a fixed position, focused on the schedule board. Periodically the camera will take a picture of the board. I want to write some code to decipher which name tags are in which area. This would require some OCR or symbol recognition. There are big numbers on each name tag that I will use to identify the person whose name tag it is.
I naturally go to Python when tackling a new programming problem. I found this post -> python image recognition which looks like a good place to start (with PIL and numpy).
Do you know a good way to do this?
Update: I have tried SimpleCV and it seems good for now.

This is actually a pretty hard problem, even though it looks quite simple. But you can make it a lot easier by doing some stuff to your image to make this manageable. I have the following suggestions:
Try to make it so that your camera is looking straight at the board with a reasonable lens so that there is minimal distortion of the image on the edges, and no perspective distortion.
Given that you'll be shooting the occasional image for analysis I think performance is in no way an issue, so shoot high-resolution images, with a flash or with a long exposure time (because everything you're shooting is stationary) to get the best possible picture quality.
If the number of different tags you expect is not too large you might find it easier to just try to match reference images of these tags in your image through template matching rather than going for full OCR of numbers. This is a lot easier to get working if your image is good enough. The python opencv interface is very complete.
High Performance Mark has a good comment to your question about including barcodes on the tags. I would add the option of QR codes, but that is just the same thing. Both are easy to detect and there are good libraries to help you read them.
If you decide you do need OCR, you should look into available OCR packages and not try to roll your own. Try pytesser for the tesseract engine or the OCRopus python interface.

Since you mentioned that you would like to use Python for this problem, perhaps you could take a look at SimpleCV. It provides an easy way to grab the image from the camera and do basic image processing.

I strongly agree with jilles de witt that OCR would be an extremely hard image analysis task to develop from scratch. Code reading would be a better option, but that also will be difficult to program and will require sophisticated or somewhat challenging imaging as others have noted. However, for this app you really do not need to implement OCR or formal bar codes, QR or other 2d codes.
Since your application is constrained to a limited number of targets, perhaps you could make your own simple code. For example, you could place 0 to 4 big dots in a 2x2 array after each person's name. Each of the four positions either has a dot or not, so this simple code distinguishes 2^4 = 16 tags, and the features will be much easier to image, extract and decode than formal codes. Add a locator line if the code position is not consistent.
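
As a sketch of how such a 2x2 dot code could map to tag numbers, treat each cell as one bit. The function names and the bit order (top-left cell is the lowest bit) are arbitrary choices made for this example; detecting the dots in the image is a separate step not shown here.

```python
def encode_tag(tag_id):
    """Turn a tag number 0..15 into a 2x2 grid of 0/1 (1 means a dot)."""
    if not 0 <= tag_id < 16:
        raise ValueError("a 2x2 dot grid can only encode 0..15")
    bits = [(tag_id >> i) & 1 for i in range(4)]
    return [bits[0:2], bits[2:4]]


def decode_tag(grid):
    """Turn a 2x2 grid of 0/1 back into the tag number."""
    flat = [cell for row in grid for cell in row]
    return sum(bit << i for i, bit in enumerate(flat))
```

For example, tag 5 becomes [[1, 0], [1, 0]], i.e. dots in the left column only.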

Automated Webcam Application / Hardware Problems

I am starting to develop an automated webcam application. The goal is to automatically take pictures, do some image processing and then upload the results to a FTP site. All of these tasks seem simple.
However, I am having a hard time finding a decent camera. I don't want to use a simple webcam or HD webcam because the image quality of still frames isn't very good.
I'm also having a hard time finding an affordable digital camera supporting USB snapshot or control.
My second concern is the development itself. I'm not quite sure which programming language to use. I have experience with AS3, Processing, Java and some simple C++ and Open CV.
Do you have a clue?

Regarding the camera: there are pretty good webcams available, some with HD quality. Look at the Logitech cameras (I tested their API and it is quite good); an HD camera retails for $99, which is very cheap. If you are looking for something better, I would go with Nikon, as they also have a pretty good API for C#/C++. You can get a basic SLR with a simple 28mm lens for $500. Don't use a PowerShot, as Canon stopped supporting their API. Whatever camera you decide to buy, make sure a proper API is available, maintained, and free.
Regarding development, I would go with C#/Java, as they are easier than C++. There are quite a lot of image processing libraries for C#/Java; just make sure that the camera comes with an API that fits your chosen language.
Good luck.

Generally (from experience), most USB cameras that show up as an imaging device in Windows can be used with JAI [Java Advanced Imaging]. Additionally [on the .NET/C++ side], the same cameras can be used through DirectShow as a capture device. Java/C# will make development easier, but expect to lose some performance [even with the best of optimizations]. Also, you can only run up to the speed of the camera and of the data line from the camera to the computer [USB 1.0 will seriously limit the frame rate].

First, get the image into RAM:
If you are using CHDK, I suggest copying the image from camera memory to RAM with one of the scripting languages CHDK supports. The CHDK forum (http://chdk.setepontos.com/index.php) can help with this.
If that is difficult, you can continuously copy the images to the hard disk and load them into RAM from there. (You need to delete the images as you go, because large files accumulate on the hard disk in a short period of time!)
This sounds like a 'brute force' approach, but it will get your work going while you research the correct approach.
Then perform the image processing:
Once the image is in RAM, you can apply your image processing algorithms as usual, e.g. using the OpenCV library.
Hope this helps.
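
A rough sketch of the disk-based approach, assuming the camera software drops finished JPEG files into a folder: load each file into RAM, hand it to your processing step, then delete it so the disk does not fill up. The name drain_images, the process callback, and the "*.jpg" pattern are placeholders invented for this example.

```python
from pathlib import Path


def drain_images(folder, process, pattern="*.jpg"):
    """Load every matching file into RAM, process it, then delete it.

    `process` is any callable taking the raw file bytes. Returns the number
    of files handled.
    """
    handled = 0
    for path in sorted(Path(folder).glob(pattern)):
        data = path.read_bytes()  # the image is now in RAM
        process(data)
        path.unlink()             # remove it so files do not pile up on disk
        handled += 1
    return handled
```

Calling this on a timer is enough for a first version; in practice you would also want to make sure the camera has finished writing a file before reading it.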
