I'm trying to make a network that identifies missing atoms from an image, and is then able to count them.
So far, I have created a CNN that is able to output an image like the one below that highlights only such atoms, and I have found an OpenCV technique that I think would be able to count all the individual white spots (https://www.geeksforgeeks.org/white-and-black-dot-detection-using-opencv-python/).
However, I would like to be able to count the missing atoms for a given structure: e.g. I've circled in red the clusters of dots that I would like my network to identify and count as, for example, a structure with 8 atoms. I have thought the easiest method might be to:
Apply bounding boxes based on segmentation, crop the image, and count the atoms per box, which would provide a labelled image. However, a big issue would be the inclusion of additional, non-connected atoms in the box.
If this approach were taken, how would I be able to create more specifically shaped boxes? I could try coloring each circle based on its area, and ignoring the other colors when cropping to the bounding-box size?
Personally, I think the question I'm struggling with is: how do I segment this image and process each cluster? Or is there a smoother technique that I could use or integrate into my ML model?
I am working on image registration between LWIR & RGB images. I am able to extract the edges from both images.
RGB_Edges, LWIR_Edges
Now, I want to match the edges of these images to calculate homography.
I tried to match each edge of the RGB image with the LWIR image separately using template matching (OpenCV), but it didn't work.
Therefore, can anyone please suggest some methods to match the edges/structures of both images that would help compute the homography?
I will really appreciate any suggestion/help.
Thanks.
These two images are already fairly well aligned.
Due to the large thickness and irregularity of the edges, I doubt you can do much better.
If you have the option of operator supervision, have the operator point at corresponding points in the two images (four pairs are enough for a homography).
For an automated approach, you can try to thin the strokes, then find (approximate) line segments in both images. For a certain number of segments in one image, find the segment in the other image that is (approximately) parallel, close, and facing it with a significant overlap. You can expect these segments to be in correspondence.
Next, you can obtain corresponding points by forming the intersections between some of the segments in each image (take segments that are close but as perpendicular as possible).
As this procedure will suffer from outliers, model fitting by RANSAC is probably a good option.
Our core aim is:
to use Image Processing to read/scan an architectural Floor Plan Image (exported from a CAD software)
to extract the various lines and curves and group them into Structural Entities like walls, columns, beams etc. – ‘Wall_01’, ‘Beam_03’ and so on
extract the dimensions of each of these Entities based on the scale and the length of the lines in the Floor Plan Image (since AutoCAD lines are dimensionally accurate as per the specified Scale)
and associate each of these Structural Entities (and their dimensions) with a ‘Room’.
We have flexibility in that we can define the exact shapes of the different Structural Entities in the Floor Plan Image (rectangles for doors, rectangles with hatch lines for windows etc.) and export them into a set of images for each Structural Entity (e.g. one image for walls, one for columns, one for doors etc.).
For point ‘B’ above, our current approach based on OpenCV is as follows:
Export each Structural Entity into its own image
Use Canny and HoughLine Transform to identify lines within the image
Group these lines into individual Structural Elements (like ‘Wall_01’)
We have managed to detect/identify the line segments using Canny+HoughLine Transform with a reasonable amount of accuracy.
Original Floor Plan Image
Individual ‘Walls’ Image:
Line Segments identified using Canny+HoughLine:
(I don't have enough reputation to post images yet)
So the current question is - what is the best way to group these lines together into a logical Structural Entity like ‘Wall_01’?
Moreover, are there any specific OpenCV based techniques that can help us group the line segments into logical Entities? Are we approaching the problem correctly? Is there a better way to solve the problem?
Update:
Adding another image of valid wall input image.
You mention "exported from a CAD software". If the export format is PDF, it contains vector data for all graphic elements. You might be better off trying to extract and interpret that. Seems a bit cumbersome to go from a vector format to a pixel format which you then try to bring back to a numerical model.
If you have clearly defined constraints as to what your walls, doors, etc. will look like in your image, you would use exactly those. If you are generating the CAD exports yourself, modify the settings there so as to facilitate this.
For instance, the doors are all brown and are closed figures.
The same goes for grouping the walls. In the figures, it looks like you can group based on proximity (i.e., anything within X pixels of anything else belongs to one group), although the walls to the right of the text 'C7' and below it may then get grouped into one.
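A minimal sketch of such proximity grouping, using a small union-find over hypothetical segment endpoints (the segments and the `gap` threshold are assumptions; real input would come from HoughLinesP):

```python
import numpy as np

# Toy segments as (x1, y1, x2, y2): two pieces of one wall, plus a separate wall.
segments = [(0, 0, 100, 0), (105, 0, 200, 0), (0, 300, 0, 400)]
gap = 10  # max endpoint distance (pixels) for two segments to share a group

parent = list(range(len(segments)))

def find(i):
    # Union-find root lookup with path compression.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def endpoints(s):
    return np.array([[s[0], s[1]], [s[2], s[3]]], float)

# Merge any two segments whose closest endpoints are within `gap` pixels.
for i in range(len(segments)):
    for j in range(i + 1, len(segments)):
        d = np.linalg.norm(endpoints(segments[i])[:, None]
                           - endpoints(segments[j])[None], axis=2)
        if d.min() <= gap:
            parent[find(i)] = find(j)

groups = {}
for i in range(len(segments)):
    groups.setdefault(find(i), []).append(i)
print(list(groups.values()))  # -> [[0, 1], [2]]
```

Each resulting group of segment indices would become one logical entity such as ‘Wall_01’.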
If you do not have clear definitions, you may be looking at a generic image-recognition problem, which means AI or Machine Learning. That would require a large variety of inputs to learn from, and may get very complex.
This might be a very broad question so I'm sorry in advance. I'd like to also point out I'm new in the CV field, so my insight in this field is minimum.
I am trying to find correspondences between points from a FLIR image and a VIS image. I'm currently building 40x40 pixels regions around keypoints, over which I'm applying the LoG. I'm trying to compare them to find the most similar regions.
For example, I have these data sets:
Where the columns represent, in this order:
the image for which I'm trying to find a correspondent
the candidate images
the LoG of the first column
the LoG of the second column
It is very clear, to the human eye, that the third image is the best match for the first set, while the first image is the best match for the second set.
I have tried various ways of expressing similarity/dissimilarity between these images, such as SSD, cross-correlation, or mutual information, but they all fail to be consistent (they only work in some cases).
Now, my actual question is:
What should I use to express the similarity between images in a more semantic way, such that shapes would be more important in deciding the best match, rather than actual intensities of the pixels? Do you know of any technique that would aid me in my quest of finding these matches?
Thank you!
Note: I'm using OpenCV with Python right now, but the programming language and library is not important.
I am looking for an algorithm or, even better, some library that covers background subtraction from a single static image (no background model available). What would be possible, though, is some kind of user input, like for example https://clippingmagic.com does it.
Sadly, my Google-fu is failing me here, as I can't find any papers on the topic with my limited set of keywords.
That webpage is really impressive. If I were to try to implement something similar, I would probably use k-means clustering in the CIELAB colorspace. The reason for changing the colorspace is that colors can then be compared using two values (a* and b*) rather than three as in a regular RGB image, which should speed up clustering. Additionally, the CIELAB color space was built for exactly this purpose: finding "distances" (similarities) between colors in a way that accounts for how humans perceive color, rather than just looking at the raw binary data the computer has.
A quick overview of k-means. For this example we will say k = 2 (meaning only two clusters):
Initialize each cluster with a mean.
Go through every pixel in your image and decide which mean it is closest to: cluster 1 or cluster 2?
Compute the new mean for each cluster after you've processed all the pixels.
Using the newly computed means, repeat steps 2-4 until convergence (meaning the means don't change very much).
Now, that would work well when the foreground is notably different from the background, say a red ball on a blue background, but if the colors are similar it becomes more problematic. I would still stick with k-means, but with a larger number of clusters. On that web page you can make multiple red or green selections; I would make each of these strokes a cluster and initialize that cluster to the stroke's mean color. So say I drew 3 red strokes and 2 green ones: that gives me 5 clusters, but internally each cluster carries an extra foreground/background attribute. That way each cluster has a small variance, but in the end I only display the attribute, foreground or background. I hope that made sense.
Maybe now you have some search terms to start off with. There may be many other methods but this is the first I thought of, good luck.
EDIT
After playing with the website a bit more, I see it uses spatial proximity to cluster. Say I had 2 identical red blobs on opposite sides of the image: if I only annotate the left side, the blob on the right side might not get detected. K-means wouldn't replicate this behavior, since the method I described uses only color to cluster pixels and is completely oblivious to their location in the image.
I don't know what tools you have at your disposal, but here is a nice MATLAB example/tutorial on color-based k-means.
I'm trying to detect objects and text in a hand-drawn diagram.
My goal is to be able to "parse" something like this into an object structure for further processing.
My first aim is to detect text, lines and boxes (arrows etc... are not important (for now ;))
I can do dilation, erosion, Otsu thresholding, inversion, etc., and easily get to something like this
What I need some guidance for are the next steps.
I have several ideas:
Contour Analysis
OCR using UNIPEN
Edge detection
Contour Analysis
I've been reading about "Contour Analysis for Image Recognition in C#" on CodeProject which could be a great way to recognize boxes etc. but my issue is that the boxes are connected and therefore do not form separate objects to match with a template.
Therefore I need some advice on whether this is a feasible way to go.
OCR using UNIPEN
I would like to use UNIPEN (see "Large pattern recognition system using multi neural networks" on CodeProject) to recognize handwritten letters and then "remove" them from the image leaving only the boxes and lines.
Edge detection
Another way could be to detect all lines and corners, and in that way infer the boxes and lines that the image consists of. In that case, ideas on how to straighten the lines and find the 90-degree corners would be helpful.
Generally, I think I just need some pointers on which strategy to apply, not code samples (though it would be great ;))
I will try to answer regarding the contour analysis and the lines between the boxes.
If you need to turn the interconnected boxes into separate objects, that can be achieved easily enough:
close the gaps in the box edges with morphological closing
perform connected components labeling and look for compact objects (e.g. objects whose area is close to the area of their bounding box)
You will get the insides of the boxes. These can be elliptical or rectangular or any shape you may find in common diagrams, the contour analysis can tell you which. A problem may arise for enclosed background areas (e.g. the space between the ABC links in your example diagram). You might eliminate these on the criterion that their bounding box overlaps with multiple other objects' bounding boxes.
Now find line segments with HoughLinesP. If a segment finishes or starts within a certain distance of the edge of one of the objects, you can assume it is connected to that object.
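That endpoint check can be sketched as a point-to-rectangle distance test; the box, segments, and tolerance here are hypothetical placeholders for HoughLinesP output and detected object bounds:

```python
# Hypothetical detected box (x, y, w, h) and HoughLinesP-style segments.
box = (50, 50, 60, 40)
segments = [(10, 70, 48, 70),     # ends right at the box's left edge
            (150, 10, 200, 10)]   # nowhere near the box

def near_box(pt, rect, tol=5):
    x, y, w, h = rect
    # Distance from point to rectangle (0 if the point is inside it).
    dx = max(x - pt[0], 0, pt[0] - (x + w))
    dy = max(y - pt[1], 0, pt[1] - (y + h))
    return (dx * dx + dy * dy) ** 0.5 <= tol

# A segment is attached to the box if either endpoint lies within `tol` of it.
connected = [s for s in segments
             if near_box(s[:2], box) or near_box(s[2:], box)]
print(connected)  # -> [(10, 70, 48, 70)]
```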
As an added touch you could try to detect arrow ends on either side by checking the width profile of the line segments in a neighbourhood of their endpoints.
It is an interesting problem; I will try to remember it and give it to my students to sink their teeth into.