I have a binary image where removing the green dots gives me separate line segments. I've tried using the label_components() function from Julia, but it only gives vertically joined pixels the same label.
I'm using:
using Images
img = load("current_img.jpg")
img[findall(img .== RGB(0.0,0.1,0.0))] .= 0 # make the green pixels the same as the background, i.e. black
labels = label_components(img)
I'm expecting all disjoint lines to be given a unique label
(there was a function for connected-component labeling in MATLAB, but I can't find something similar in Julia).
Since you updated the question and added more details to make it clear, I decided to post an answer. Note that this answer uses some of the functions that I wrote here; so, if you can't find documentation for any of the following functions, I refer you to that previous answer. I tried the procedure on several examples and include the results below.
Let's begin with an image similar to the one you included in the question and perform the entire operation from scratch. For this, I drew the following:
I want to perform a segmentation process on it, label each segment, and highlight the segments using the resulting labels.
Let's define the functions:
using Images
using ImageBinarization
# Count how many of the 8 neighbors of `loc` are also present in `all_locs`.
function check_adjacent(
    loc::CartesianIndex{2},
    all_locs::Vector{CartesianIndex{2}}
)
    conditions = [
        loc - CartesianIndex(0,1) ∈ all_locs,
        loc + CartesianIndex(0,1) ∈ all_locs,
        loc - CartesianIndex(1,0) ∈ all_locs,
        loc + CartesianIndex(1,0) ∈ all_locs,
        loc - CartesianIndex(1,1) ∈ all_locs,
        loc + CartesianIndex(1,1) ∈ all_locs,
        loc - CartesianIndex(1,-1) ∈ all_locs,
        loc + CartesianIndex(1,-1) ∈ all_locs
    ]
    return sum(conditions)
end;
# Collect the endpoints (exactly 1 neighbor) and branch points (exactly 3
# neighbors) of the thinned curves; removing these splits the drawing into
# clean, disjoint segments.
function find_the_contour_branches(img::BitMatrix)
    img_matrix = convert(Array{Float64}, img)
    not_black = findall(!=(0.0), img_matrix)
    contours_branches = Vector{CartesianIndex{2}}()
    for nb ∈ not_black
        t = check_adjacent(nb, not_black)
        (t == 1 || t == 3) && push!(contours_branches, nb)
    end
    return contours_branches
end;
"""
HighlightSegments(img::BitMatrix, labels::Matrix{Int64})
Highlight the segments of the image with random colors.
# Arguments
- `img::BitMatrix`: The image to be highlighted.
- `labels::Matrix{Int64}`: The labels of each segment.
# Returns
- `img_matrix::Matrix{RGB}`: A matrix of RGB values.
"""
function HighlightSegments(img::BitMatrix, labels::Matrix{Int64})
colors = [
# Create Random Colors for each label
RGB(rand(), rand(), rand()) for label in 1:maximum(labels)
]
img_matrix = convert(Matrix{RGB}, img)
for seg∈1:maximum(labels)
img_matrix[labels .== seg] .= colors[seg]
end
return img_matrix
end;
"""
find_labels(img_path::String)
Assign a label for each segment.
# Arguments
- `img_path::String`: The path of the image.
# Returns
- `thinned::BitMatrix`: BitMatrix of the thinned image.
- `labels::Matrix{Int64}`: A matrix that contains the labels of each segment.
- `highlighted::Matrix{RGB}`: A matrix of RGB values.
"""
function find_labels(img_path::String)
img::Matrix{RGB} = load(img_path)
gimg = Gray.(img)
bin::BitMatrix = binarize(gimg, UnimodalRosin()) .> 0.5
thinned = thinning(bin)
contours = find_the_contour_branches(thinned)
thinned[contours] .= 0
labels = label_components(thinned, trues(3,3))
highlighted = HighlightSegments(thinned, labels)
return thinned, labels, highlighted
end;
The main function in the above is find_labels, which returns:
- The thinned matrix.
- The labels of each segment.
- The highlighted image (a Matrix, actually).
First, I load the image and binarize its grayscale version. Then I perform the thinning operation on the binarized image. After that, I find the contours and branch points using the find_the_contour_branches function and turn them black in the thinned image; this gives me neat segments. Then I label the segments using the label_components function. Finally, I highlight the segments using the HighlightSegments function for the sake of visualization (this is the bonus :)).
Let's try it on the image I drew above:
result = find_labels("nU3LE.png")
# you can get the labels Matrix using `result[2]`
# and the highlighted image using `result[3]`
# Also, it's possible to save the highlighted image using:
save("nU3LE_highlighted.png", result[3])
The result is as follows:
Also, I performed the same thing on another image:
julia> result = find_labels("circle.png")
julia> result[2]
14×16 Matrix{Int64}:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0
0 1 1 0 0 0 3 3 0 0 0 5 5 5 0 0
0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
As you can see, the labels are pretty clear. Now let's see the results of performing the procedure on some further examples at a glance:
(table of results: Original Image | Labeled Image)
I've recently been studying image processing.
Going through the problem of hole filling, one thing confuses me (I assume anyone able to answer this is familiar with the steps involved, so I'll skip straight to the problem):
Let's say I have a binary image like this:
0 0 0 0 0 0 0
0 0 1 1 0 0 0
0 1 0 0 1 0 0
0 1 0 0 1 0 0
0 0 1 0 1 0 0
0 0 1 0 1 0 0
0 1 0 0 0 1 0
0 1 0 0 0 1 0
0 1 1 1 1 0 0
0 0 0 0 0 0 0
The book says to start from a region inside the hole and repeatedly perform the dilation operation, bounding the result so it doesn't fill the whole image.
I have no problem understanding the overall process, but when I try to code it, how can I operate only on a specific region (inside the hole, in this case)? Or would an actual implementation use a different method?
If you can assume that the object with holes does not touch the border of the image, you can create an intermediate image and flood-fill it (with a marker value, e.g. 2) starting from the top-left pixel. Any remaining '0' pixels must then be inside the contour. Take the position of the first remaining '0' pixel you encounter and flood-fill it in the original image.
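For illustration, here is a minimal Python/NumPy sketch of that idea (a slightly compressed variant: instead of locating the first remaining '0' and flood-filling it separately, it simply sets every pixel the border flood never reached to 1; the function name fill_holes is mine):
import numpy as np
from collections import deque

def fill_holes(img):
    # Fill holes in a binary image (0 = background, 1 = object).
    # Assumes the object does not touch the image border.
    filled = img.copy()
    h, w = filled.shape
    # BFS flood fill of the background from the top-left corner, marker value 2.
    queue = deque([(0, 0)])
    filled[0, 0] = 2
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and filled[nr, nc] == 0:
                filled[nr, nc] = 2
                queue.append((nr, nc))
    filled[filled == 0] = 1   # pixels the flood never reached are inside holes
    filled[filled == 2] = 0   # restore the flooded background to 0
    return filled

img = np.array([[0,0,0,0,0,0,0],
                [0,0,1,1,0,0,0],
                [0,1,0,0,1,0,0],
                [0,1,0,0,1,0,0],
                [0,0,1,0,1,0,0],
                [0,0,1,0,1,0,0],
                [0,1,0,0,0,1,0],
                [0,1,0,0,0,1,0],
                [0,1,1,1,1,0,0],
                [0,0,0,0,0,0,0]], dtype=np.uint8)
print(fill_holes(img))
Note that the background flood uses 4-connectivity, so it cannot leak through the diagonal steps of an 8-connected contour.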
I'm looking at the example: https://github.com/fchollet/keras/blob/master/examples/conv_lstm.py
This RNN actually predicts the next frame of the movie, so the output should be a movie too (matching the test data fed in). I wonder whether information is lost due to the conv layers' padding.
For example, the underlying TensorFlow pads at the bottom right, so with a big padding it looks like this (n stands for numbers):
n n n n 0 0 0
n n n n 0 0 0
n n n n 0 0 0
n n n n 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
When we apply the second conv, the bottom-right corner will always be 0, which means backpropagation will never be able to capture anything there. Since the movie here is a square moving across the whole screen, will information be lost when the validation label is in the bottom-right corner?
The answer is yes, according to a Ph.D. doing AI research whom I asked.
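A quick way to see the border effect numerically (a sketch using SciPy's convolve2d rather than the actual Keras layers): with 'same' zero padding, the outputs near a corner average real data together with padded zeros, so whatever happens there is systematically attenuated:
import numpy as np
from scipy.signal import convolve2d

# A 4x4 "frame" and a 3x3 averaging kernel (symmetric, so the kernel
# flip performed by convolve2d does not matter here).
frame = np.arange(1, 17, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0

# 'same' output with zero padding: corner windows hang off the frame,
# so part of their receptive field is constant zeros, not data.
out = convolve2d(frame, kernel, mode='same', boundary='fill', fillvalue=0)

# The bottom-right output sees only 4 of its 9 window positions;
# the other 5 are padded zeros, damping its value.
print(out[-1, -1])                    # 6.0
print(frame[2:, 2:].mean() * 4 / 9)   # same 6.0, computed by hand
Stack several such layers and the corner outputs depend on proportionally less real input, which matches the intuition in the question.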
I am trying to do image compression using the DCT (Discrete Cosine Transform). Can someone please help me understand how masking affects the bits per pixel in DCT compression? How is the bit allocation done under masking?
PS: By masking, I mean multiplying the DCT coefficients with a matrix like the one below (element-wise multiplication, not matrix multiplication).
mask = [1 1 1 1 0 0 0 0
1 1 1 0 0 0 0 0
1 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0]
Background on "Masking"
DCT-based compression computes the DCT of blocks of an image, in this case blocks of 8x8 pixels. High-frequency components of an image are less important for human perception and can thus be discarded to save space.
The mask matrix selects which DCT coefficients are kept and which are discarded in order to save space. Coefficients towards the top-left corner represent low frequencies.
For more information, see Discrete Cosine Transform.
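To make that concrete, here is a minimal sketch of masking a single 8x8 block with SciPy (the dct2/idct2 helpers are mine, not library functions):
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    # 2-D DCT-II with orthonormal scaling, applied along both axes.
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(coeffs):
    # Inverse 2-D DCT, undoing dct2.
    return idct(idct(coeffs, axis=0, norm='ortho'), axis=1, norm='ortho')

# Keep only the low-frequency triangle in the top-left corner,
# mirroring the mask in the question (10 of 64 coefficients).
mask = np.zeros((8, 8))
for i in range(4):
    mask[i, :4 - i] = 1

block = np.random.rand(8, 8)   # stand-in for one 8x8 pixel block
coeffs = dct2(block) * mask    # discard the high-frequency coefficients
approx = idct2(coeffs)         # lossy reconstruction

# At most the 10 retained coefficients must be stored per 64-pixel block,
# which is where the bits-per-pixel saving comes from.
print(np.count_nonzero(coeffs), np.abs(block - approx).max())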
This looks like a variation of a quantization matrix.
Low frequencies are in the top left, high frequencies in the bottom right. The eye is more sensitive to low frequencies, so removing the high-frequency coefficients removes the less important details of the image.
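To contrast the hard binary mask with a quantization matrix, here is a sketch where coefficients are divided by a step size and rounded instead of being zeroed outright (the Q matrix below is synthetic, chosen only so the step grows with frequency; it is not the JPEG table):
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(coeffs):
    return idct(idct(coeffs, axis=0, norm='ortho'), axis=1, norm='ortho')

# Synthetic quantization matrix: step size grows with distance from
# the top-left (low-frequency) corner.
i, j = np.indices((8, 8))
Q = 1.0 + 4.0 * (i + j)

block = np.random.rand(8, 8)
quantized = np.round(dct2(block) / Q)   # encoder: coarse steps for high freqs
approx = idct2(quantized * Q)           # decoder: dequantize, inverse DCT

# Nothing is forced to zero, but most high-frequency coefficients round
# to 0 anyway, which is where the saving comes from.
print(np.count_nonzero(quantized), np.abs(block - approx).max())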
Let's say I create a matrix M1 of 5 rows and 1 column of type 8UC3 to store the RGB components of an image. Then I create another matrix M2 of 5 rows and 3 columns of type 8UC1 to store the same RGB components.
Is there a difference in the way these two matrices are stored in and accessed from memory? From what I understand from http://www.cs.iit.edu/~agam/cs512/lect-notes/opencv-intro/opencv-intro.html#SECTION00053000000000000000 (a commonly recommended OpenCV tutorial on Stack Overflow), the data pointer of the matrix points to the first index of the underlying data array (the matrix is internally stored as an array), and in the 8UC3 case the RGB components are stored interleaved.
My logic says they should be the same: in the 1-column 8UC3 case (M1), each element stores all three RGB components, and in the 3-column 8UC1 case (M2), each column stores one component.
I hope I have been able to formulate my question well.
Thanks in advance!
Your understanding is correct. The memory layout will be exactly the same, so you can cheaply convert the representation back and forth via the reshape method.
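In NumPy terms (OpenCV's Python bindings expose a Mat as a NumPy array; in C++ the analogue is Mat::reshape), a quick sketch of that equivalence:
import numpy as np

# A 5x1 "CV_8UC3" image: 5 rows, 1 column, 3 channels.
m1 = np.zeros((5, 1, 3), dtype=np.uint8)
m1[..., 0] = 255   # first channel 255, matching the footprint below

# Reinterpret the very same bytes as a 5x3 single-channel "CV_8UC1" matrix.
m2 = m1.reshape(5, 3)

print(np.shares_memory(m1, m2))   # True: no copy, same buffer
print(m2)                         # each row reads 255 0 0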
The thing that would be different is how OpenCV algorithms handle those matrices.
Let's say the memory footprint is as follows:
255 0 0
255 0 0
255 0 0
255 0 0
255 0 0
And you want to call the resize function to double the number of columns. Then in the case of a 5x1 Mat of CV_8UC3, the result will be
255 0 0 255 0 0
255 0 0 255 0 0
255 0 0 255 0 0
255 0 0 255 0 0
255 0 0 255 0 0
And in the case of a 5x3 Mat of CV_8UC1, the result will be
255 255 0 0 0 0
255 255 0 0 0 0
255 255 0 0 0 0
255 255 0 0 0 0
255 255 0 0 0 0
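A sketch reproducing this with the Python bindings (nearest-neighbor interpolation is chosen so the duplicated-pixel pattern comes out exactly as above):
import numpy as np
import cv2

base = np.zeros((5, 3), dtype=np.uint8)
base[:, 0] = 255            # each row is 255 0 0, the footprint above

m1 = base.reshape(5, 1, 3)  # viewed as CV_8UC3: one (255, 0, 0) pixel per row
m2 = base                   # viewed as CV_8UC1: three pixels per row

# Double the width of each; note dsize is (width, height) in OpenCV.
r1 = cv2.resize(m1, (2, 5), interpolation=cv2.INTER_NEAREST)
r2 = cv2.resize(m2, (6, 5), interpolation=cv2.INTER_NEAREST)

print(r1.reshape(5, 6))  # rows: 255 0 0 255 0 0  (whole pixels duplicated)
print(r2)                # rows: 255 255 0 0 0 0  (single channels duplicated)
The byte counts match, but resize duplicated whole 3-byte pixels in one case and individual bytes in the other, which is exactly the difference in how the algorithms interpret the two layouts.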