colour detection bot - image-processing

colour detection bot - image-processing

I want to create a script to automatically click on a moving target on a game.
To do this I want to check colours on the screen to determine where my target is and then click him, repeating this if he moves.
I'm not skilled at programming and there might be an easier solution to what I have proposed below:
1/Split the screen into equal tiles - size of tile should represent the in game object.
2/Loop through each pixel of each tile and create a histogram of the pixel colours.
3/If the most common recorded colour matches what we need, we MIGHT have the correct tile. Save the coords and click the object to complete task
4/Every 0.5 seconds check colour to determine if the object has moved, if it hasnt, keep clicking, if it has repeat steps 1, 2 and 3.
The step I am unsure of how to do technically is step 1. What data structure would I need for a tile? Would a 2D array suffice? Store the value of each colour in this array and then determine if it is the object. Also in pseudo how would I split the screen up into tiles to be searched? The tile issue is my main problem.
EDIT for rayryeng 2:
I will be using Python for this task. This is not my game, I just want to create a macro to automatically perform a task for me in the game. I have no code yet, I am looking more for the ideas behind making this work than actual code.
3rd edit and final code:
#!/usr/bin/python
import win32gui, win32api
#function to take in coords and return colour
def colour_return(x,y):
colours = win32gui.GetPixel(win32gui.GetDC(win32gui.GetActiveWindow()), x,y)
return colours
def click(x,y):
win32api.SetCursorPos((x,y))
win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN,x,y,0,0)
win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP,x,y,0,0)
#variable declaration
x = 1
y = 1
pixel_value = []
colour_found = 0
while x < 1600:
pixel_value = colour_return(x,y)
if pixel_value == 1844766:
click(x,y)
x=x+1
#print x
print y
if x == 1600:
y=y+1
x=1
#print tile
pixel_value = 0
This is the final code that I have produced. It works but it is incredibly slow. It takes 30 seconds seconds to search all 1600 pixels of y=1. I guess it is my method that is not working. Instead of using histograms and tiles I am now just searching for a colour and clicking the coordinates when it matches. What is the fastest method to use when searching an entire screen for a certain colour? I've seen colour detection bots that manage to keep up every second with a moving character.

Related

Using shader modifiers to animate texture in SceneKit leads to jittery textures over time

I'm writing a scene in SceneKit for iOS.
I'm trying to apply a texture to an object using a sprite sheet. I iterate through the images in that sheet with this code:
happyMaterial = [SCNMaterial new];
happyMaterial.diffuse.contents = happyImage;
happyMaterial.diffuse.wrapS = SCNWrapModeRepeat;
happyMaterial.diffuse.wrapT = SCNWrapModeRepeat;
happyMaterial.shaderModifiers = #{ SCNShaderModifierEntryPointGeometry : #"_geometry.texcoords[0] = vec2((_geometry.texcoords[0].x+floor(u_time*30.0))/10.0, (_geometry.texcoords[0].y+floor(u_time*30.0/10.0))/7.0);" };
All is good. Except over time, the texture starts to get random jitteriness in it, especially along the x-axis.
Someone mentioned it could be because of "floating-point precision issues," but I'm not sure how to diagnose or fix this.
Also: I'm not sure how to log data from the shader code. Would be awesome to be able to look into variables like "u_time" and see exactly what's going on.

It's definitely a floating point precision issue. you should probably try to do a modulo on (u_time*30.0) so that it loops within a reasonable range.

if you want to iterate over images your texture coordinate must stay the same for a short period of time (1 second for instance).
u_time is similar to CACurrentMediaTime(), it's a time in seconds.
Now let's say you have N textures. Then mod(u_time, N) will increase every second from 0 to N-1 and then go back to 0. If you divide this by N you've got your texture coordinate, and you don't need SCNWrapModeRepeat.
If you want your image to change every 0.04 second (25 times per second), then use mod(25 * u_time, N) / N.

How to move image with low values?

The problem is simple: I want to move (and later, be able to rotate) an image. For example, every time i press the right arrow on my keyboard, i want the image to move 0.12 pixels to the right, and every time i press the left arrow key, i want the image to move 0.12 pixels to the left.
Now, I have multiple solutions for this:
1) simply add the incremental value, i.e.:
image.x += 0.12;
this is of course assuming that we're going to the right.
2) i multiplicate the value of a single increment by the times i already went into this particular direction + 1, like this:
var result:Number = 0.12 * (numberOfTimesWentRight+1);
image.x = result;
Both of these approaches work but produce similiar, yet subtly different, results. If we add some kind of button component that simply resets the x and y coordinates of the image, you will see that with the first approach the numbers don't add up correctly.
it goes from .12, .24, .359999, .475 etc.
But with the second approach it works well. (It's pretty obvious as to why though, it seems like += operations with Numbers are not really precise).
Why not use the second approach then? Well, i want to rotate the image as well. This will work for the first attempt, but after that the image will jump around. Why? In the second approach we never took the original position of the image in account. So if the origin-point shifts a bit down or up because you rotated your image, and THEN you try to move the image again: it will move to the same position as if you hadn't rotated before.
Alright, to make this short:
How can i reliably move, scale and rotate images for 1/10 of a pixel?

Short answer: I don't know! You're fighting with floating point math!
Luckily, I have a workaround, if you don't mind.
You store the location (x and y) of the image in a separate variable... at a larger scale. Such as 100x. So 123.45 becomes 12345, and you then divide by 100 to set the attribute that flash uses to display.
Yes, there are limits to number sizes too, but if you're willing to accept some error rate, and the fact that you'll be limited to, I dunno, a million pixels in each direction, you can fit it in a regular int. The only rounding error you will encounter will be a single rounding error when you divide by 100 (or the factor you used). So instead of the compound rounding error which you described (0.12 * 4 = 0.475), you should see things like 0.47999999. Which doesn't matter because it's, well, so small.

To expand on #Pimgd answer a bit, you're probably hitting a floating point error (multiple +='s will exaggerate the error more than one *='s) - Numbers in Flash are 53-bit precision.
There's also another thing to keep in mind, which is probably playing a bigger role with such small movement values; Flash positions all objects using twips, which is roughly about 1/20th of a pixel, or 0.05, so all values are rounded to this. When you say image.x += 0.12, it's actually the equivalent of image.x += 0.10, hence which the different becomes apparent; you're losing 0.02 of a pixel with every move.
You should be able to get around it by moving to another scale, as #Pimgd says, or just storing your position separately - i.e. work from a property _x rather than image.x so you're not losing that precision everytime:
this._x += 0.12;
image.x = this._x;

How can I use bwlabel or regionprops to extract set of pixels of each label?

I'm following this tutorial
The goal is to be able to spit out either:
a. the center of each labeled object
b. all pixels associated with each labeled object
in a way that I have an array of either 'a.' for each object, or 'b.' for each object
I'm really not sure how to go about this. Are there matlabl tools to help extract these set of pixels or centers - per - label?
Update
I did manage to circle 80% of what I wanted using reigionprops, however it doesn't capture label precisely, just sets a circle around them while capturing the background as well, is that really unavoidable? I'm just not sure how to access the set of pixel per each circled item.
r=regionprops(L, 'All'); imshow(imagergb); areas={r.Area}; Bboxes={r.BoundingBox};
for k=2:numel(r)
if areas{k}>50 && areas{k} < 1100
rectangle('Position',Bboxes{k}, 'LineWidth',1, 'EdgeColor','b', 'Curvature', [1 1]);
end
end
So what I'm trying to do is for example.
I thought it might just be
r = regionprops(L, 'PixelIdxList')
then
element1 = r(1).PixelIdxList
but couldn't figure out how to get the position of each pixel
I also tried
Z= bwlabel(L);
but imshow(Z==1) spits out all labels and imshow(Z==2) spits out background, all labels and background. couldn't test bwlabeln since I'm not exactly sure what to enter for r and c arguments.

Using regionprops(L, 'PixelIdxList') is correct. It gives you lists of pixel indices for each label. You can then convert them to [x,y] coordinates using (for the first label, for example)
[y,x] = ind2sub(size(L), r(1).PixelIdxList)
You can get label centers by using regionprops(L, 'Centroid'). This already gives you [x,y] coordinates for each label. Note that these are subpixel coordinates, so you may need to round them if you want to use them as indices.

OpenCV: Generating points from image after thinning

I've ran in to an issue concerning generating floating point coordinates from an image.
The original problem is as follows:
the input image is handwritten text. From this I want to generate a set of points (just x,y coordinates) that make up the individual characters.
At first I used findContours in order to generate the points. Since this finds the edges of the characters it first needs to be ran through a thinning algorithm, since I'm not interested in the shape of the characters, only the lines or as in this case, points.
Input:
thinning:
So, I run my input through the thinning algorithm and all is fine, output looks good. Running findContours on this however does not work out so good, it skips a lot of stuff and I end up with something unusable.
The second idea was to generate bounding boxes (with findContours), use these bounding boxes to grab the characters from the thinning process and grab all none-white pixel indices as "points" and offset them by the bounding box position. This generates even worse output, and seems like a bad method.
Horrible code for this:
Mat temp = new Mat(edges, bb);
byte roi_buff[] = new byte[(int) (temp.total() * temp.channels())];
temp.get(0, 0, roi_buff);
int COLS = temp.cols();
List<Point> preArrayList = new ArrayList<Point>();
for(int i = 0; i < roi_buff.length; i++)
{
if(roi_buff[i] != 0)
{
Point tempP = bb.tl();
tempP.x += i%COLS;
tempP.y += i/COLS;
preArrayList.add(tempP);
}
}
Is there any alternatives or am I overlooking something?
UPDATE:
I overlooked the fact that I need the points (pixels) to be ordered. In the method above I simply do scanline approach to grabbing all the pixels. If you look at the 'o' for example, it would grab first the point on the left hand side, then the one on the right hand side. I would need them to be ordered by their neighbouring pixels since I want to draw paths with the points later on (outside of opencv).
Is this possible?

You should look into implementing your own connected components labelling. The concept is very simple: you scan the first line and assign unique labels to each horizontally connected strip of pixels. You basically check for every pixel if it is connected to its left neighbour and assign it either that neighbour's label or a new label. In the second row you do the same, but you also check against the pixels above it. Sometimes you need a label merge: two strips that were not connected in the previous row are joined in the current row. The way to deal with this is either to keep a list of label equivalences or use pointers to labels (so you can easily do a complete label change for an object).
This is basically what findContours does, but if you implement it yourself you have the freedom to go for 8-connectedness and even bridge a single-pixel or two-pixel gap. That way you get "almost-connected components labelling". It looks like you need this for the "w" in your example picture.
Once you have the image labelled this way, you can push all the pixels of a single label to a vector, and order them something like this. Find the top left pixel, push it to a new vector and erase it from the original vector. Now find the pixel in the original vector closest to it, push it to the new vector and erase from the original. Continue until all pixels have been transferred.
It will not be very fast this way, but it should be a start.

Repeating 2d world

How to make a 2d world with fixed size, which would repeat itself when reached any side of the map?
When you reach a side of a map you see the opposite side of the map which merged togeather with this one. The idea is that if you didn't have a minimap you would not even notice the transition of map repeating itself.
I have a few ideas how to make it:
1) Keeping total of 3x3 world like these all the time which are exactly the same and updated the same way, just the players exists in only one of them.
2) Another way would be to seperate the map into smaller peaces and add them to required place when asked.
Either way it can be complicated to complete it. I remember that more thatn 10 years ago i played some game like that with soldiers following each other in a repeating wold shooting other AI soldiers.
Mostly waned to hear your thoughts about the idea and how it could be achieved. I'm coding in XNA(C#).

Another alternative is to generate noise using libnoise libraries. The beauty of this is that you can generate noise over a theoretical infinite amount of space.
Take a look at the following:
http://libnoise.sourceforge.net/tutorials/tutorial3.html#tile
There is also an XNA port of the above at: http://bigblackblock.com/tools/libnoisexna
If you end up using the XNA port, you can do something like this:
Perlin perlin = new Perlin();
perlin.Frequency = 0.5f; //height
perlin.Lacunarity = 2f; //frequency increase between octaves
perlin.OctaveCount = 5; //Number of passes
perlin.Persistence = 0.45f; //
perlin.Quality = QualityMode.High;
perlin.Seed = 8;
//Create our 2d map
Noise2D _map = new Noise2D(CHUNKSIZE_WIDTH, CHUNKSIZE_HEIGHT, perlin);
//Get a section
_map.GeneratePlanar(left, right, top, down);
GeneratePlanar is the function to call to get the sections in each direction that will connect seamlessly with the rest of your world.

If the game is tile based I think what you should do is:
Keep only one array for the game area.
Determine the visible area using modulo arithmetics over the size of the game area mod w and h where these are the width and height of the table.
E.g. if the table is 80x100 (0,0) top left coordinates with a width of 80 and height of 100 and the rect of the viewport is at (70,90) with a width of 40 and height of 20 you index with [70-79][0-29] for the x coordinate and [90-99][0-9] for the y. This can be achieved by calculating the index with the following formula:
idx = (n+i)%80 (or%100) where n is the top coordinate(x or y) for the rect and i is in the range for the width/height of the viewport.
This assumes that one step of movement moves the camera with non fractional coordinates.
So this is your second alternative in a little bit more detailed way. If you only want to repeat the terrain, you should separate the contents of the tile. In this case the contents will most likely be generated on the fly since you don't store them.
Hope this helped.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart