I am trying to create mosaic images of simulation data, where each tile is a .jpg with a fixed number of pixels, and I combine hundreds of tiles into one large image for easier parameter analysis. So far I have been able to parallelize fetching the tiles into the larger image with the following code:
using Distributed
addprocs(8)
# add the @everywhere macro before every function and variable so it magically works
@everywhere big_image = zeros(RGB, 300, 1000); # each tile has 100x100 pixels for simplicity
function createBigFrame(big_image)
    @sync for row = 1:10 # @sync waits until all the images are fetched, then continues with plotting
        for col = 1:3
            i = # range for x
            j = # range for y (or vice versa)
            image_path = # get the path
            Threads.@spawn big_image[i, j] = load(image_path);
        end
    end
    plot(...)    # add axes and ticks to the image
    savefig(...) # save the figure on the disk
end
Even on small data this gave me a 20% performance increase, and the gain should be larger on bigger data sets, since there are more tile images to load in parallel. However, I have been told that this is not the proper way to parallelize things. I am very curious to know the right way to load images in parallel (not concurrently, since the load() function is not thread-safe) and how to improve the code and its performance further. I am very grateful for your help.
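For reference, here is a generic sketch of one common pattern (an assumption on my part, not necessarily what the linked answer proposes): let the workers load the tiles with pmap, then assemble them on the master. The tile paths and the 100x100 tile size are hypothetical.

using Distributed
addprocs(8)
@everywhere using Images, FileIO

# Hypothetical paths, laid out as a 10x3 grid of tiles.
paths = ["tile_$(r)_$(c).jpg" for r in 1:10, c in 1:3]
tiles = pmap(load, paths) # each worker loads a share of the files
big_image = zeros(RGB, 300, 1000)
for r in 1:10, c in 1:3
    # Copy each 100x100 tile into its slot of the big image.
    big_image[100*(c-1)+1:100*c, 100*(r-1)+1:100*r] = tiles[r, c]
end

The idea is that only the expensive part (decoding the .jpg files) runs on the workers, while the cheap pixel copying stays serial on the master.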
EDIT: The following code is supposed to be a minimal working example, but since you don't have the files to load(), the situation is a little different.
using Distributed
addprocs(4) # do not re-run
@everywhere using Colors, Images
@everywhere img = Images.zeros(RGB, 300, 1000);
function createBigFrame(image)
    @sync for row = 1:10
        for col = 1:3
            j = 100*(col-1)+1:100*col;
            i = 100*(row-1)+1:100*row;
            Threads.@spawn image[j, i] .= RGB(rand(3)...)
        end
    end
    return image
end
createBigFrame(img)
I managed to run my code following the second suggestion of the selected answer here. However, I'm not sure I fully understand how the code works under the hood.
I'll put this example picture here so we are on the same page:
There are 3 rows and 10 columns of tiles in this image.
Now, in my first attempt, I used the following to fetch the images of each row in parallel:
arr = SharedArray{Float32,3}(ones(3, 512*3, 512*10))
@sync @distributed for row = 1:3
    for col = 1:10
        row_index = 512*(row-1)+1:512*row # index range for the height of the image
        col_index = 512*(col-1)+1:512*col # index range for the width of the image
        arr[:, row_index, col_index] = channelview(testimage("lake_color"))
    end
end
If my understanding is correct, @sync is used to wait for each row iteration to complete while the workers fetch the images of that row in parallel. I thought that, if this is the case, time must be wasted between rows, waiting for the previous row to complete.
So I flattened the 2D fetching loop into a 1D loop, to give the workers more freedom to fetch the images without waiting at row boundaries:
@sync @distributed for s = 1:30
    row = div(s-1, 10) + 1 # tile row, 1:3
    col = rem(s-1, 10) + 1 # tile column, 1:10
    row_index = 512*(row-1)+1:512*row # index range for the height of the image
    col_index = 512*(col-1)+1:512*col # index range for the width of the image
    arr[:, row_index, col_index] = channelview(testimage("lake_color"))
end
But I was wrong: the second version came out about 25% slower than the first one (190 seconds instead of 150), so I wonder what is going on here?
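One way to probe this: as far as I know, @distributed does not hand out iterations one by one; it statically splits the range into one contiguous chunk per worker before the loop starts. A minimal sketch (assuming the 4 workers added earlier) that logs which worker runs which iteration:

@sync @distributed for s = 1:30
    println("iteration $s on worker $(myid())")
end

With a static split, a worker whose chunk happens to finish early has nothing left to pick up, so when the per-iteration cost varies (file I/O, for example) the flattened loop can end up slower than expected.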
Related
I am trying to read a handwritten form that has boxed input.
I have run tesseract on the image but get strange results. As I understand it, the best thing to do is to detect the bounding box and subtract it from the image. What is the best way to detect the box (the semi-box around each character)? I tried cv2.HoughLines(), but with no result.
I am new to OpenCV. It would be really helpful if someone could help me out here.
Thanks for your idea. I just realized I can probably count the dark pixels in each vertical column and drop the columns where the count exceeds a certain threshold:
import numpy as np

def get_pixel_count_in_col(img, col):
    # Count the dark (non-white) pixels in one column.
    count = 0
    for j in range(img.shape[0]):
        if img[j, col] < 255:
            count = count + 1
    return count

def cleanup_img(img):
    foundlines = []
    for i in range(img.shape[1]):
        if get_pixel_count_in_col(img, i) > img.shape[0] * 0.7:
            foundlines.append(i)
            # Also drop mostly-dark neighbouring columns, guarding the image bounds.
            if i > 0 and get_pixel_count_in_col(img, i - 1) > img.shape[0] * 0.25:
                foundlines.append(i - 1)
            if i < img.shape[1] - 1 and get_pixel_count_in_col(img, i + 1) > img.shape[0] * 0.25:
                foundlines.append(i + 1)
    return np.delete(img, foundlines, 1)
The resulting image makes more sense. But is there any other easy way to do this?
It seems that your input format is quite clean and consistent. You can simply hard-code the width of each box in pixels and crop out the characters. However, if the input format is not fixed, we can extend this answer to handle that as well (it would be a bit more expensive); as a first attempt, we will simply hard-code the width of the boxes in pixels.
import cv2

def get_image_chunks(img, size):
    chunks = []
    # Padding to strip the black box borders.
    padding = 2
    for i in range(0, img.shape[1], size):
        col_start = i + padding
        col_end = i + size - padding
        # Slicing the numpy array, one box width at a time.
        chunks.append(img[:-padding, col_start:col_end])
    return chunks

img = cv2.imread("/Users/anmoluppal/Downloads/GLUmJ.jpg", 0)
chunks = get_image_chunks(img, 42)
Outputs: (the cropped character chunks, shown as images in the original answer)
I have a problem I just cannot solve, and after a week it's really winding me up.
Background.
I'm placing items onto a circle using basic trig. The number of items can change dynamically, and they are spaced around the circle equally.
The items rotate around the circle, and the speed of rotation changes to be in sync with a BPM (Beats Per Minute) clock. This clock can change at any time.
The problem I'm having is that the items seem to be placed randomly on the circle, not equally spaced in order (see image 1). They appear out of order even though a basic for loop places them. I think they may in fact be in order, but the rotation values may be off, making them look like they are in an odd order.
The second issue is that when the number of items decreases, the speed of rotation increases (it shouldn't), and when the number increases, the speed slows.
So I suspect an issue with my trig function. I'm showing the complete code here, but I can simplify it if that helps.
What have I tried?
I've tried simplified versions without the graphical output, and the numbers all seem to make perfect sense. The angle between items is correct, and the placement looks correct. It all looks correct, but it isn't.
The code.
--the variables
orbitalCircle.xPos = x or 0
orbitalCircle.yPos = y or 0
orbitalCircle.circleDiameter = diameter or 10
orbitalCircle.numberOfNotes = number_of_notes
orbitalCircle.spaceBetweenNotes = (360 / number_of_notes)
orbitalCircle.beatsPerSecond = (beats_per_minute / 60)
orbitalCircle.currentRotation = 0
orbitalCircle.framesPerSecond = frames_per_second or 15
orbitalCircle.framesPerFullRotation = (orbitalCircle.numberOfNotes/orbitalCircle.beatsPerSecond)+orbitalCircle.framesPerSecond
orbitalCircle.degreesPerFrame = 360 / orbitalCircle.framesPerFullRotation
orbitalCircle.newRotationValue = orbitalCircle.currentRotation + orbitalCircle.degreesPerFrame
orbitalCircle.sequenceData = sequence_data
--the function that updates the sequence data and therefore the number of items on the circle
function orbitalCircle.updateNotes(sq)
    orbitalCircle.sequenceData = sq
    orbitalCircle.numberOfNotes = (#sq)
    orbitalCircle.spaceBetweenNotes = (360 / orbitalCircle.numberOfNotes)
end
--the function that calculates the new rotation value of the item to be placed on the circle
function orbitalCircle.tick()
    orbitalCircle.spaceBetweenNotes = (360 / number_of_notes)
    orbitalCircle.framesPerFullRotation = (orbitalCircle.numberOfNotes/orbitalCircle.beatsPerSecond)*orbitalCircle.framesPerSecond
    orbitalCircle.degreesPerFrame = (360 / orbitalCircle.framesPerFullRotation)
    orbitalCircle.newRotationValue = (orbitalCircle.currentRotation + orbitalCircle.degreesPerFrame)
    if orbitalCircle.newRotationValue > 360 then
        orbitalCircle.currentRotation = 0
    else
        orbitalCircle.currentRotation = orbitalCircle.newRotationValue
    end
end
--finally the function that places the items onto the circle
function orbitalCircle.redraw()
    screen.circle(orbitalCircle.xPos, orbitalCircle.yPos, orbitalCircle.circleDiameter)
    screen.stroke()
    for i = 1, (#orbitalCircle.sequenceData) do
        if orbitalCircle.sequenceData[i] > 0 then
            screen.circle(
                math.cos(math.rad(orbitalCircle.newRotationValue)+(orbitalCircle.spaceBetweenNotes*i))*orbitalCircle.circleDiameter + orbitalCircle.xPos,
                math.sin(math.rad(orbitalCircle.newRotationValue)+(orbitalCircle.spaceBetweenNotes*i))*orbitalCircle.circleDiameter + orbitalCircle.yPos,
                map(orbitalCircle.sequenceData[i], 5, 128, 0.5, 4)
            )
        end
    end
end
I'd expect that the items would be:
equally spaced no matter the amount (that works)
in order (they appear not to be)
the speed of rotation should remain fixed unless the BPM changes (this doesn't happen)
I'm lost!
Let us take a closer look at the drawing.
screen.circle(
    math.cos(math.rad(orbitalCircle.newRotationValue)+(orbitalCircle.spaceBetweenNotes*i))*orbitalCircle.circleDiameter + orbitalCircle.xPos,
    math.sin(math.rad(orbitalCircle.newRotationValue)+(orbitalCircle.spaceBetweenNotes*i))*orbitalCircle.circleDiameter + orbitalCircle.yPos,
    map(orbitalCircle.sequenceData[i], 5, 128, 0.5, 4)
)
What is the angle that is being drawn here? It is the argument to math.cos and math.sin (I will ignore the scaling and the translation that is applied afterwards):
math.rad(orbitalCircle.newRotationValue)+(orbitalCircle.spaceBetweenNotes*i)
So... it is newRotationValue converted to radians, plus the space between notes. The latter is defined as 360 / number_of_notes, so it is in degrees. Adding radians and degrees will most likely not produce the expected result.
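A minimal sketch of the fix (untested against the rest of the script): add the two angles while both are still in degrees, and convert the sum to radians once, inside the loop:

-- sum the angles in degrees, convert to radians in one place
local angle = math.rad(orbitalCircle.newRotationValue + orbitalCircle.spaceBetweenNotes * i)
screen.circle(
    math.cos(angle) * orbitalCircle.circleDiameter + orbitalCircle.xPos,
    math.sin(angle) * orbitalCircle.circleDiameter + orbitalCircle.yPos,
    map(orbitalCircle.sequenceData[i], 5, 128, 0.5, 4)
)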
So, what exactly do you mean with the following?
I've tried simplified versions without the graphical output, and the numbers all seem to make perfect sense.
I have an image which is the result of k-means segmentation. The code to obtain it is here:
% Read the image and convert to L*a*b* color space
I = imread('Crop.jpg');
% h = ginput(2);
% Diameter = sqrt((h(2)-h(1))^2+(h(4)-h(3))^2);
% MeanArea = 3.14*(Diameter^2)/4;
Ilab = rgb2lab(I);
% Extract a* and b* channels and reshape
ab = double(Ilab(:,:,2:3));
nrows = size(ab,1);
ncols = size(ab,2);
ab = reshape(ab,nrows*ncols,2);
% Segmentation using k-means
nColors = 4;
[cluster_idx, cluster_center] = kmeans(ab, nColors, ...
    'distance', 'sqEuclidean', ...
    'Replicates', 3);
% Show the result
pixel_labels = reshape(cluster_idx,nrows,ncols);
figure(1);
imshow(pixel_labels,[]), title('image labeled by cluster index');
Resulting in this picture:
Now as you can see, most of the elements are connected, so I want to count all of the blobs (besides the background one) and then filter them using MeanArea, the area of an element's incircle: if a blob's area is smaller than MeanArea I don't count it, and if it is larger I divide the blob's area by MeanArea to obtain the number of elements it contains. All of this is to get a measure such that #blobs = #elements. I know it has something to do with bwlabel and regionprops, but I don't know how to code this since I'm a beginner; any coding help is appreciated. Thanks.
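For reference, a minimal sketch of the bwlabel/regionprops flow described above (assuming pixel_labels from the code before, a precomputed MeanArea, and that the background ended up in cluster 1, which varies between runs):

% Sketch; assumes MeanArea exists and the background is cluster 1.
mask = pixel_labels ~= 1;                 % everything except the background cluster
mask = imerode(mask, strel('disk', 2));   % optional: separate touching elements
labels = bwlabel(mask);                   % label the connected blobs
stats = regionprops(labels, 'Area');      % pixel area of each blob
areas = [stats.Area];
areas = areas(areas >= MeanArea);         % discard blobs smaller than one element
nElements = sum(round(areas / MeanArea)); % large blobs count as several elements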
EDIT: using the 'trees' approach linked in the comments I got very bad results, so I don't think it's the right method. I don't have objects with the same colour as in the trees example; mine only share the same shape.
I'm now following this other approach: Color segmentation by k-means.
I obtained the labeled image above, but how can I save it into a variable so that I can erode it and count the number of blobs? That is my question.
EDIT2: The original picture is this one. I'm trying to detect the number of red, green, and blue objects.
I'm writing a scene in SceneKit for iOS.
I'm trying to apply a texture to an object using a sprite sheet. I iterate through the images in that sheet with this code:
happyMaterial = [SCNMaterial new];
happyMaterial.diffuse.contents = happyImage;
happyMaterial.diffuse.wrapS = SCNWrapModeRepeat;
happyMaterial.diffuse.wrapT = SCNWrapModeRepeat;
happyMaterial.shaderModifiers = @{ SCNShaderModifierEntryPointGeometry : @"_geometry.texcoords[0] = vec2((_geometry.texcoords[0].x+floor(u_time*30.0))/10.0, (_geometry.texcoords[0].y+floor(u_time*30.0/10.0))/7.0);" };
All is good, except that over time the texture starts to jitter randomly, especially along the x-axis.
Someone mentioned it could be because of "floating-point precision issues," but I'm not sure how to diagnose or fix this.
Also: I'm not sure how to log data from the shader code. It would be awesome to be able to look into variables like u_time and see exactly what's going on.
It's definitely a floating-point precision issue. You should probably apply a modulo to (u_time * 30.0) so that it loops within a reasonable range.
If you want to iterate over images, your texture coordinate must stay the same for a short period of time (1 second, for instance).
u_time is similar to CACurrentMediaTime(); it's a time in seconds.
Now let's say you have N textures. Then mod(u_time, N) will increase every second from 0 to N-1 and then wrap back to 0. If you divide this by N, you get your texture coordinate, and you don't need SCNWrapModeRepeat.
If you want your image to change every 0.04 second (25 times per second), then use mod(25 * u_time, N) / N.
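To make that concrete, here is a sketch of the geometry modifier with the suggested mod() (untested; the 10x7 sheet layout, i.e. N = 70 frames, and the 25 fps rate are taken from the question and this answer):

// frame stays in [0, 70), so the value fed into the texture coordinates
// never grows large enough to lose float precision.
happyMaterial.shaderModifiers = @{ SCNShaderModifierEntryPointGeometry :
    @"float frame = mod(floor(25.0 * u_time), 70.0);\n"
     "float col = mod(frame, 10.0);\n"
     "float row = floor(frame / 10.0);\n"
     "_geometry.texcoords[0] = vec2((_geometry.texcoords[0].x + col) / 10.0, (_geometry.texcoords[0].y + row) / 7.0);" };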
I want to create a script to automatically click on a moving target on a game.
To do this I want to check colours on the screen to determine where my target is and then click him, repeating this if he moves.
I'm not skilled at programming and there might be an easier solution to what I have proposed below:
1/ Split the screen into equal tiles; the tile size should match the in-game object.
2/ Loop through each pixel of each tile and build a histogram of the pixel colours.
3/ If the most common recorded colour matches what we need, we MIGHT have the correct tile. Save the coords and click the object to complete the task.
4/ Every 0.5 seconds, check the colour to determine whether the object has moved; if it hasn't, keep clicking, and if it has, repeat steps 1, 2 and 3.
The step I am unsure how to do technically is step 1. What data structure would I need for a tile? Would a 2D array suffice? I would store the value of each colour in this array and then determine whether it is the object. Also, in pseudocode, how would I split the screen up into tiles to be searched? The tile issue is my main problem.
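As a rough sketch of step 1 (untested; it assumes Pillow is available for the screen capture, and the tile size is a placeholder): a 2D array per tile is indeed enough (3D with the colour channel), and plain numpy slicing splits the screen without copying pixel by pixel.

import numpy as np
from PIL import ImageGrab  # assumption: Pillow is installed

def get_tiles(tile_size=64):
    # Grab the whole screen as a (height, width, 3) RGB array.
    screen = np.array(ImageGrab.grab())
    tiles = []
    for top in range(0, screen.shape[0] - tile_size + 1, tile_size):
        for left in range(0, screen.shape[1] - tile_size + 1, tile_size):
            # Each tile is a slice (a view, not a copy), stored with its origin.
            tiles.append(((top, left), screen[top:top + tile_size, left:left + tile_size]))
    return tiles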
EDIT for rayryeng 2:
I will be using Python for this task. This is not my game; I just want to create a macro to automatically perform a task for me in the game. I have no code yet; I am looking more for the ideas behind making this work than for actual code.
3rd edit and final code:
#!/usr/bin/python
import win32gui, win32api, win32con  # win32con is needed for the mouse_event flags

# function to take in coords and return the colour at that pixel
def colour_return(x, y):
    colours = win32gui.GetPixel(win32gui.GetDC(win32gui.GetActiveWindow()), x, y)
    return colours

def click(x, y):
    win32api.SetCursorPos((x, y))
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN, x, y, 0, 0)
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP, x, y, 0, 0)

# variable declaration
x = 1
y = 1
pixel_value = []
colour_found = 0
while x < 1600:
    pixel_value = colour_return(x, y)
    if pixel_value == 1844766:
        click(x, y)
    x = x + 1
    # print x
    print y
    if x == 1600:
        y = y + 1
        x = 1
        # print tile
        pixel_value = 0
This is the final code I have produced. It works, but it is incredibly slow: it takes 30 seconds to search the 1600 pixels of the single row y = 1. I guess my method is not the right one. Instead of using histograms and tiles, I am now just searching for a colour and clicking the coordinates when it matches. What is the fastest method for searching an entire screen for a certain colour? I've seen colour-detection bots that manage to keep up with a moving character every second.
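For comparison, a sketch of a much faster pattern (assuming Pillow and numpy; the target RGB triple is a placeholder): grab the whole screen in one call and let numpy scan the array, instead of issuing one GetPixel call per pixel.

import numpy as np
from PIL import ImageGrab  # assumption: Pillow is installed

TARGET = (30, 40, 30)  # placeholder RGB value of the colour to find

def find_colour():
    screen = np.array(ImageGrab.grab())  # one capture of the entire screen
    matches = np.argwhere(np.all(screen[:, :, :3] == TARGET, axis=2))
    if len(matches) > 0:
        y, x = matches[0]  # numpy indexes as (row, column)
        return int(x), int(y)
    return None

One full-screen grab plus a vectorised comparison typically takes a fraction of a second, which is how the colour-detection bots you mention keep up.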