Foreground-extraction
I am extracting a person from the background using cv2.grabCut, but sometimes background pixels are misclassified as foreground, so the extraction is not perfect. I have attached the resulting image. How can I improve this extraction?
To improve the extraction you need to experiment with the iterCount and mode parameters.
For instance:
I have the following image:
If I apply the example code:
Can I improve by changing the iterCount?
iterCount = 10, 20 (respectively)
iterCount = 30, 40 (respectively)
Can I improve by changing the modes?
mode = GC_INIT_WITH_RECT, GC_INIT_WITH_MASK (respectively)
In my case GC_INIT_WITH_MASK works well, but as I said, you need to adjust the parameters until you get a satisfactory result.
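For reference, here is a minimal sketch of that kind of workflow, assuming a placeholder file name and a hand-picked rectangle around the person (both of which you would replace with your own values):

import cv2
import numpy as np

img = cv2.imread('person.jpg')  # placeholder file name
mask = np.zeros(img.shape[:2], np.uint8)
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)

# Rough rectangle around the person; tune these values for your image.
rect = (50, 50, img.shape[1] - 100, img.shape[0] - 100)
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)

# Refine: after the rectangle pass the mask holds probable labels, so a second
# pass with GC_INIT_WITH_MASK and a higher iterCount can clean up the edges.
cv2.grabCut(img, mask, None, bgdModel, fgdModel, 10, cv2.GC_INIT_WITH_MASK)

mask2 = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype('uint8')
result = img * mask2[:, :, np.newaxis]
cv2.imwrite('extracted.png', result)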
I would like to read a movie file frame by frame, run some image processing algorithms on each frame, and display the resultant frames sequentially in an iOS app. There are mechanisms for doing this with the live camera feed, such as using an AVCaptureSession, and I would like to do something similar with movies already saved to disk.
I am trying to do this with an AVAssetReader and AVAssetReaderTrackOutput; however, this document clearly states that "AVAssetReader is not intended for use with real-time sources, and its performance is not guaranteed for real-time operations".
So my question is: what is the right way to read movie frames in real-time?
Edit:
To further clarify what I mean by "real-time", consider that it is possible to capture frames from the camera feed, run some computer vision algorithms on each frame (e.g. object detection, filtering), and display the filtered frames at a reasonable frame rate (30-60 FPS). I would like to do exactly the same thing, except that the input source is a video already saved on disk.
I don't want to read the entire movie, process the whole thing, and display the result only once the entire processing pipeline is finished (I would consider that non-real-time). I want the processing to be done frame by frame, and for that the file has to be read frame by frame in real time.
To play back and process a video in real time, you can use the AVPlayer class. The simplest way to live-process video frames is through a custom video composition on the AVPlayerItem.
You might want to check out this sample project from Apple where they highlight HDR parts in a video using Core Image filters. It shows the whole setup required for real-time processing and playback.
I am using OpenCV 3.4 and getting a feed from an RTSP camera. I want to add a condition to my code so that if the camera is covered with anything, an alert is sent to the user. Checking the blackness of the frame is not enough, because if the camera is covered with a white cloth the frame will be white. Can anyone suggest some logic for this? How can we accomplish it using OpenCV?
You can check whether the camera is in focus or not. For example, here's a blurry photo of my palm and of my window:
Here's the function that calculates a sharpness "score" of each image:
import cv2 as cv

def sharpness(img):
    # The Laplacian responds strongly to edges; the standard deviation of its
    # response is a simple focus/sharpness score (low = blurry or featureless).
    img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    lap = cv.Laplacian(img, cv.CV_16S)
    mean, stddev = cv.meanStdDev(lap)
    return stddev[0,0]
Testing:
The blurry picture has a much lower score. You can set the threshold to e.g. 20 and anything below that is considered blurry and therefore the camera is covered or something else is wrong with it.
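For completeness, a minimal sketch of how that check could be wired into the RTSP feed (the URL and the threshold of 20 are placeholders to adapt):

import cv2 as cv

cap = cv.VideoCapture('rtsp://your-camera-url')  # placeholder RTSP URL
ok, frame = cap.read()
if ok and sharpness(frame) < 20:
    print('Camera appears to be covered or out of focus - alert the user')
cap.release()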
EDIT
The image below shows the pre-processing sequence applied to the original image.
1. Original image
2. Blur n times so the QR code's position stands out
3. Crop the original image at the position extracted in step 2 using blob detection
4. Sharpen and threshold
5. Check the three finder squares of the QR code
6. Apply additional transformations such as rotation
7. Final image (cropped, with resized resolution)
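For reference, a rough OpenCV sketch of steps 2-4 (the file name, kernel size, number of blur passes and thresholds are assumptions you would need to tune):

import cv2

img = cv2.imread('original.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder file name

# Step 2: blur repeatedly so the dense QR region merges into one dark blob.
blur = img.copy()
for _ in range(5):
    blur = cv2.GaussianBlur(blur, (15, 15), 0)

# Step 3: threshold the blurred image and crop around the largest dark blob.
_, th = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
contours = cv2.findContours(th, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
crop = img[y:y + h, x:x + w]

# Step 4: sharpen (unsharp mask) and threshold the crop.
sharp = cv2.addWeighted(crop, 1.5, cv2.GaussianBlur(crop, (0, 0), 3), -0.5, 0)
_, qr_bin = cv2.threshold(sharp, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite('cropped_qr.png', qr_bin)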
Old Question
I am trying to reconstruct the QR code from the original image. As you can see, the photo contains a damaged QR code, so I use the AForge library to detect the three finder squares in the image using blob detection. What I don't understand is the logic for generating the QR code from this information. Is it technically possible to reconstruct the QR code from the given information?
This is an interesting problem. To answer the question of whether this is technically possible: yes, it certainly is. The QR code in your question encodes "5176941.12".
Here's the preprocessed image, so that it's easier to manually set the pixels.
After this step, I use Excel to set each pixel one by one. After that, simply point your phone at the computer screen. This is what it looks like. If you want the Excel sheet, you can get it here.
Now that the question of possibility is out of the way, how do you automate it? Without additional samples it is difficult to say for sure. Based on this sample alone, however, the simplest approach is to align a 21x21 grid over your cropped QR image, fill in the module values using a threshold, and then pass the result to your QR decoder. QR codes have a certain level of redundancy, so even if some modules are wrong you will most likely still be able to recover the original data.
Edit
Here's some code in Python which may serve as a guide to how you might automate this. A few things to note:
I bypass the step of detecting the three boxes and manually crop the image very tightly. If there is any rotation in the capture, you need to correct it first.
The threshold of 0.6 needs adjusting for different images. Right now it 'luckily' works even though there are multiple errors. If the errors are too extensive, you may never get a valid QR code.
Code:
import cv2
import numpy as np

def fill3box(qr):
    # Redraw the three 7x7 finder patterns: their module values are fixed by the
    # QR spec, so they do not need to be recovered from the damaged photo.
    qr[0:7,0:7] = 1
    qr[14:21,14:21] = 1
    qr[14:21,0:7] = 1
    qr[0,0:6] = 0
    qr[0:6,0] = 0
    qr[0:6,6] = 0
    qr[6,0:7] = 0
    qr[2:5,2:5] = 0
    qr[14:21,14:21] = qr[0:7,0:7]
    qr[14:21,0:7] = qr[0:7,0:7]
    return qr

im = cv2.imread('to_process.png')
im = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
im = cv2.resize(im,(210,210))
im = 1-((im - im.min())/(im.max()-im.min())) # normalize and adjust contrast
avg = np.average(im)
qr = np.ones((21,21))
w,h = im.shape[:2]
im_orig = im.copy()
im[im<avg] = 0 # binarize
im[im>avg] = 1

# Sample each cell of the 21x21 grid and threshold its average value.
for y in range(21):
    for x in range(21):
        x1,y1 = (round(x*w/21),round(y*h/21))
        x2,y2 = (round(x1+10),round(y1+10))
        im_box = im[y1:y2,x1:x2]
        if np.average(im_box)<0.6 and qr[y,x]!=0: # 0.6 needs tweaking
            qr[y,x] = 0

qr = fill3box(qr) # clean up the 3 box areas as they need to be fixed

# debug visualization: draw the sampling grid on a copy of the input
for x in range(21):
    p1 = (round(x*w/21),0)
    p2 = (round(x*w/21),h)
    cv2.line(im_orig,p1,p2,(255),1)
for y in range(21):
    p1 = (0,round(y*h/21))
    p2 = (w,round(y*h/21))
    cv2.line(im_orig,p1,p2,(255),1)

qr = cv2.resize(qr,(210,210),interpolation=cv2.INTER_NEAREST)
im = (im*255).astype(np.uint8)
qr = (qr*255).astype(np.uint8)
im_orig = (im_orig*255).astype(np.uint8)
cv2.imwrite('im.png',im)
cv2.imwrite('qr.png',qr)
cv2.imwrite('im_orig.png',im_orig)
Cropped image (to_process.png in the code).
Grid overlaid to show how this method works.
Thresholded image.
Regenerated QR, note that it still works even though there are multiple errors.
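If you want to verify the regenerated code without pointing a phone at it, one option (an assumption on my part, not part of the original answer) is OpenCV's built-in cv2.QRCodeDetector, available in recent OpenCV releases:

import cv2

qr_img = cv2.imread('qr.png', cv2.IMREAD_GRAYSCALE)
# Pad with a white quiet zone, which most decoders expect around the code.
qr_img = cv2.copyMakeBorder(qr_img, 40, 40, 40, 40, cv2.BORDER_CONSTANT, value=255)
data, points, _ = cv2.QRCodeDetector().detectAndDecode(qr_img)
print(data or 'decode failed')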
This will be difficult.
If you can decode this QR code using a reader (I tried but failed), it is possible to re-encode it using a writer. But there is no guarantee that the writer will recreate the same image, since different encoding options are possible.
If your goal is in fact to be able to decode, you are stuck. Decoding "by hand" might be possible but is lengthy and complicated. You can also consider redrawing the code by hand on a perfect grid, and pass this to a reader.
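As a sketch of the re-encoding route, assuming the third-party qrcode package (not part of the original answer) and using the payload reported in the answer above:

import qrcode

# Re-encode the recovered payload. The result may not match the original
# module-for-module, since version, mask pattern and error-correction level can differ.
img = qrcode.make('5176941.12')
img.save('recoded.png')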
In the image above (top image), suppose the black boundary is the phone.
What I am trying to achieve is to randomly generate the red path from the top of the screen while, at the same time, the red line (the path) moves downwards.
Notice how the red path is random and does not have a uniform shape.
My question is how do i achieve this?
I know this has something to do with the random function.
Generating the random path has been my main obstacle for the last eight hours.
I have not managed to generate a shape at every timer interval with a specific x and y coordinate, and, as you can see in the next image, how would I generate the line at an angle (rotated)?
I have tried hard to search everywhere on the internet but failed.
I always keep Stack Overflow as my last resort after failing to achieve some functionality for numerous hours.
I would really appreciate it if anyone could help me out with this.
It looks like you could achieve the effect you wish by starting at the top center, and repeatedly choosing 2 random numbers: how far down to go, and how far horizontally to go (positive or negative), until you got to the bottom of the screen. You'd have to be careful not to go off either edge, or you could instead choose a random x-coordinate each step.
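A minimal Python sketch of that idea (the screen size, step ranges and edge margin are placeholder values):

import random

WIDTH, HEIGHT = 320, 480  # placeholder screen size
MARGIN = 20               # keep the path away from the edges

def random_path():
    x, y = WIDTH // 2, 0          # start at the top centre
    points = [(x, y)]
    while y < HEIGHT:
        y += random.randint(20, 60)              # how far down to go
        x += random.randint(-40, 40)             # how far horizontally (either direction)
        x = max(MARGIN, min(WIDTH - MARGIN, x))  # never leave the screen
        points.append((x, min(y, HEIGHT)))
    return points

# Drawing segments between consecutive points gives the jagged line; offsetting
# every y by the elapsed time makes the whole path scroll downwards.
print(random_path())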
I want to display audio meters on the iPad consisting of many small green, red or black rectangles. They don't need to be fancy, but there may be a lot of them. I am looking for the best technique to draw them quickly. Which of the following techniques is better: a texture atlas in CALayers, OpenGLES, or something else?
Thank you for your answers before the question was closed for being too broad. Unfortunately I couldn't make the question narrower because I didn't know which technology to use. If I had known the answer, I could have made the question very narrow.
The fastest drawing would be to use OpenGLES in a custom view.
An alternative method would be to use a texture atlas in CALayers. You could draw 9 sets of your boxes into a single image to start with (0-8 boxes on), and then create the 300 CALayers on screen all using that as their content. During each frame, you switch each layer to point at the part of the texture atlas it needs to use. I've never done this with 300 layers before, so I don't know if that may become a problem - I've only done it with a half dozen or so digits that were updating every frame, but that worked really well. See this blog post for more info:
http://supermegaultragroovy.com/2012/11/19/pragma-mark-calayer-texture-atlases/
The best way to draw something repeatedly is to avoid drawing it if it is already on the screen. Audio meters tend to update frequently, but most of their area stays the same because audio signals are relatively smooth, so you should track what has been drawn and draw only the differences.
For example, if you have drawn a signal meter with fifty green squares in a previous update, and now you need to draw forty-eight green squares, you should redraw only the two squares that differ from the previous update. This should save you a lot of Quartz calls.
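A small Python sketch of that bookkeeping (draw_square here is a hypothetical stand-in for whatever Quartz or CALayer call you actually use):

def draw_square(i, on):
    # Hypothetical stand-in for the real drawing call.
    print(('light' if on else 'darken'), 'square', i)

prev_lit = 0  # how many squares are currently lit on screen

def update_meter(new_lit):
    global prev_lit
    if new_lit > prev_lit:
        for i in range(prev_lit, new_lit):   # draw only the newly lit squares
            draw_square(i, True)
    else:
        for i in range(new_lit, prev_lit):   # clear only the squares that turned off
            draw_square(i, False)
    prev_lit = new_lit

update_meter(50)  # first update: draw 50 squares
update_meter(48)  # next update: touch only the 2 that changed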
Postpone rendering until it is absolutely necessary; i.e., assuming you are drawing with Core Graphics, use paths, and only stroke/fill a path once you have added all the rectangles to it.