How to rotate an image in GNU Octave?

I = imread("C:/Users/Hp/Desktop/download.jpg");
dir = [1 0 0]
J = rotate(I,dir,90);
This is the code I have written, but it doesn't seem to work.
Is there a direct command to rotate an image by, say, 45 or 90 degrees?
It gives the following error:
rotate: H must be an array of one or more graphics handles
I am new to image processing. Help would be appreciated.

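rotate(h, dir, alpha) is for rotating graphics objects, i.e. objects that are defined by sets of coordinates, e.g. grids, surfaces, etc.
To rotate images you need to use imrotate(), for example J = imrotate(I, 90); (in Octave it is part of the image package, so run pkg load image first).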

Related

Placing a shape inside another shape using opencv

I have two images and I need to place the second image inside the first image. The second image can be resized, rotated or skewed such that it covers as large an area of the other image as possible. As an example, in the figure shown below, the green circle needs to be placed inside the blue shape:
Here the green circle is transformed such that it covers a larger area. Another example is shown below:
Note that there may be multiple results. However, any similar result is acceptable, as shown in the above example.
How do I solve this problem?
Thanks in advance!
I tested the idea I mentioned earlier in the comments and the output is almost good. It could be better, but that takes time. The final code is too long and depends on one of my old personal projects, so I will not share it. But I will explain step by step how I wrote such an algorithm. Note that I have tested the algorithm many times; it is not yet 100% accurate.
Repeat N times:
1. Copy the shape.
2. Transform it randomly.
3. Put the shape on the background.
4.1. If the shape exceeds the background, it is not acceptable; go back to step 1.
4.2. Otherwise, continue to step 5.
5. Calculate the width, height and number of pixels of the shape.
6. Keep a list of the best candidates and compare these three parameters (W, H, Pixels) with the members of the list. If we find a better item, save it.
I set the value of N to 5,000. The larger the number, the slower the algorithm runs, but the better the result.
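The loop above can be sketched in plain Python. This is only a minimal illustration, not the code I actually used: the background is a set of pixel coordinates, the "transform" is just a random scale and translation of a rectangle, and the score is the pixel count alone. The names random_search, background, shape_w and shape_h are made up for this sketch.

```python
import random

def random_search(background, shape_w, shape_h, n_iter=5000, seed=0):
    """Randomly place a scaled rectangle and keep the largest one that fits.

    `background` is a set of (x, y) pixels that belong to the background shape.
    """
    rng = random.Random(seed)
    xs = [x for x, _ in background]
    ys = [y for _, y in background]
    best = None  # (pixel_count, x0, y0, w, h)
    for _ in range(n_iter):
        # Steps 1-2: copy the shape and transform it randomly (scale + translate here).
        scale = rng.uniform(0.2, 3.0)
        w = max(1, round(shape_w * scale))
        h = max(1, round(shape_h * scale))
        x0 = rng.randint(min(xs), max(xs))
        y0 = rng.randint(min(ys), max(ys))
        # Step 3: put the shape on the background.
        pixels = {(x, y) for x in range(x0, x0 + w) for y in range(y0, y0 + h)}
        # Step 4: reject the candidate if any pixel falls outside the background.
        if not pixels <= background:
            continue
        # Steps 5-6: score the candidate and keep the best one seen so far.
        candidate = (len(pixels), x0, y0, w, h)
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best

# Hypothetical 20x12 rectangular background and a 4x4 shape.
background = {(x, y) for x in range(20) for y in range(12)}
best = random_search(background, 4, 4, n_iter=2000)
```

A real version would replace the rectangle with the warped shape mask and compare W, H and Pixels together, as described in step 6.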
You can use anything for Transform. Mirror, Rotate, Shear, Scale, Resize, etc. But I used warpPerspective for this one.
import random
import sys
import cv2
import numpy as np
im1 = cv2.imread(sys.path[0]+'/Back.png')   # background image
im2 = cv2.imread(sys.path[0]+'/Shape.png')  # shape image to be placed
bH, bW = im1.shape[:2]
sH, sW = im2.shape[:2]
# TopLeft, TopRight, BottomRight, BottomLeft of the shape
_inp = np.float32([[0, 0], [sW, 0], [sW, sH], [0, sH]])
# Random point inside the shape; the four output corners are picked around it
cx = random.randint(5, sW-5)
ch = random.randint(5, sH-5)
# Margin by which the output corners may fall outside the original rectangle
o = 0
# Random transformed output
_out = np.float32([
    [random.randint(-o, cx-1), random.randint(1-o, ch-1)],
    [random.randint(cx+1, sW+o), random.randint(1-o, ch-1)],
    [random.randint(cx+1, sW+o), random.randint(ch+1, sH+o)],
    [random.randint(-o, cx-1), random.randint(ch+1, sH+o)]
])
# Transformed output; im2 is the shape loaded above, and dsize is (width, height)
M = cv2.getPerspectiveTransform(_inp, _out)
t = cv2.warpPerspective(im2, M, (bW, bH))
You can use countNonZero to find the number of pixels and findContours and boundingRect to find the shape size.
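def getSize(msk):
    # OpenCV 4.x returns (contours, hierarchy); contours may be a tuple, so use sorted()
    cnts, _ = cv2.findContours(msk, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    # Largest contour first, ranked by the longer side of its bounding box
    cnts = sorted(cnts, key=lambda p: max(cv2.boundingRect(p)[2], cv2.boundingRect(p)[3]), reverse=True)
    w, h = 0, 0
    if len(cnts) > 0:
        _, _, w, h = cv2.boundingRect(cnts[0])
    pix = cv2.countNonZero(msk)
    return pix, w, h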
To find the overlap of the background and the shape you can do something like this:
Make a mask from each of them and use bitwise operations. Change this section according to the software you wrote; this is just an example :)
mskMix = cv2.bitwise_and(mskBack, mskShape)
mskMix = cv2.bitwise_xor(mskMix, mskShape)
isCandidate = not np.any(mskMix == 255)
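To see why the AND followed by the XOR leaves exactly the shape pixels that stick out of the background, here is the same logic written per pixel in plain Python. It is a small sketch on made-up 2x4 masks (values 0 or 255), not the real masks; the function name exceeds_background is hypothetical.

```python
def exceeds_background(back, shape):
    """Return True if any shape pixel lies outside the background.

    Both masks are equally sized lists of rows holding 0 or 255.
    """
    for back_row, shape_row in zip(back, shape):
        for b, s in zip(back_row, shape_row):
            mix = b & s   # pixels where shape and background overlap
            mix ^= s      # shape pixels that are NOT covered by the background
            if mix == 255:
                return True
    return False

back = [[0, 255, 255, 0],
        [0, 255, 255, 0]]
inside = [[0, 255, 0, 0],
          [0, 255, 0, 0]]
outside = [[255, 255, 0, 0],
           [0, 0, 0, 0]]

is_candidate_inside = not exceeds_background(back, inside)    # True
is_candidate_outside = not exceeds_background(back, outside)  # False
```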
For example, this is not a candidate answer: if you look closely at the image on the right, you will notice that the shape has exceeded the background.
I tested the circle with 4 different backgrounds. The results:
After 4879 Iterations:
After 1587 Iterations:
After 4621 Iterations:
After 4574 Iterations:
A few additional points. If you use a method like medianBlur to cover the noise in the Background mask and Shape mask, you may find a better solution.
I suggest you read about Evolutionary Computation, Metaheuristic and Soft Computing algorithms for better understanding of this algorithm :)

Trouble implementing shadows in WebGL

I am trying to implement shadows in my WebGL 2.0 project using this tutorial:
https://webgl2fundamentals.org/webgl/lessons/webgl-shadows.html
Currently I am getting really bad results like this:
Basically, a lot of the terrain that shouldn't be in shadow is being drawn in shadow. The light projection is from your camera towards the direction you are looking, so hypothetically you shouldn't be able to see any shadows because the light projection is the same as your camera (I am just doing this for testing until I can get this working properly).
I believe I have everything the same as the tutorial, except that I am using glMatrix instead of their matrix math library (which shouldn't matter, I would assume). Here's the thing, though: I don't use a model view matrix for anything I am rendering, so none of my points are in a -1 to 1 range. They can go out as far as -3200, etc. It's all just one big terrain mesh, chunked out.
I think the issue lies with how I am creating the texture matrix:
textureMatrix = glMatrix.mat4.create();
glMatrix.mat4.translate(textureMatrix,textureMatrix,[0.5,0.5,0.5]);
glMatrix.mat4.scale(textureMatrix,textureMatrix,[0.5,0.5,0.5]);
glMatrix.mat4.multiply(textureMatrix,textureMatrix, projectionMatrix);
glMatrix.mat4.invert(lightMatrix,lightMatrix);
glMatrix.mat4.multiply(textureMatrix,textureMatrix, lightMatrix);
I am using the same matrix for the light projection as for the normal projection; is that an issue? If anyone could help, it would be greatly appreciated.
That's probably because the Y position of your light (in your example, it is more like the distance between the eye and the scene) is too big for the Z size of your shadow volume (the size of your shadow volume in the view direction). Here posY is inside the wireframe box:
But if you increase posY too much (i.e. your shapes get out of the shadow volume), they disappear:
So you should increase the size of your shadow volume (or shrink your scene, either way). You cannot simulate that with the sliders because they only give you control over the X and Y dimensions: projWidth and projHeight.
For example, in the last code listing on your tutorial page, change the last parameter ("far") from 10 to 100:
const lightProjectionMatrix = settings.perspective
    ? m4.perspective(
        degToRad(settings.fieldOfView),
        settings.projWidth / settings.projHeight,
        0.5,  // near
        10)   // far
    : m4.orthographic(
        -settings.projWidth / 2,   // left
         settings.projWidth / 2,   // right
        -settings.projHeight / 2,  // bottom
         settings.projHeight / 2,  // top
        0.5,                       // near
        100);                      // far
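A quick way to check whether a point is still inside the shadow volume along the light direction is to compute its depth in normalized device coordinates. The sketch below is in Python rather than JavaScript and assumes the usual OpenGL-style orthographic mapping (near maps to -1, far maps to +1); the distance of 50 units is a made-up value standing in for a terrain point far from the light.

```python
def ortho_depth_ndc(distance, near, far):
    """Depth in NDC for a point `distance` units in front of the light."""
    return (2.0 * distance - (far + near)) / (far - near)

def inside_shadow_volume(distance, near, far):
    """True if the point lies between the near and far planes."""
    return -1.0 <= ortho_depth_ndc(distance, near, far) <= 1.0

# Hypothetical terrain point 50 units away from the light.
with_far_10 = inside_shadow_volume(50.0, 0.5, 10.0)    # False: beyond the far plane
with_far_100 = inside_shadow_volume(50.0, 0.5, 100.0)  # True: inside the volume
```

With coordinates going out to thousands of units, as in the question, the far plane would need to be large enough to cover the whole visible terrain.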
Then you can increase posY much further:
Without your full code, it is hard to reproduce the problem and help you. Could you try injecting your scene into the tutorial code? You can bind the viewpoint to the position and orientation of the light by using the same inputs (just adding 0.5 to X to see a bit of shadow and make sure it is properly computed):
/*const cameraPosition = [settings.cameraX, settings.cameraY, 15];*/
const cameraPosition = [settings.posX+0.5, settings.posY, settings.posZ];
/*const target = [0, 0, 0]; */
const target = [settings.targetX, settings.targetY, settings.targetZ];

Pixel-perfect collisions in Monogame, with float positions

I want to detect pixel-perfect collisions between 2 sprites.
I use the following function, which I found online and which makes total sense to me.
static bool PerPixelCollision(Sprite a, Sprite b)
{
    // Get Color data of each Texture
    Color[] bitsA = new Color[a.Width * a.Height];
    a.Texture.GetData(0, a.CurrentFrameRectangle, bitsA, 0, a.Width * a.Height);
    Color[] bitsB = new Color[b.Width * b.Height];
    b.Texture.GetData(0, b.CurrentFrameRectangle, bitsB, 0, b.Width * b.Height);
    // Calculate the intersecting rectangle
    int x1 = (int)Math.Floor(Math.Max(a.Bounds.X, b.Bounds.X));
    int x2 = (int)Math.Floor(Math.Min(a.Bounds.X + a.Bounds.Width, b.Bounds.X + b.Bounds.Width));
    int y1 = (int)Math.Floor(Math.Max(a.Bounds.Y, b.Bounds.Y));
    int y2 = (int)Math.Floor(Math.Min(a.Bounds.Y + a.Bounds.Height, b.Bounds.Y + b.Bounds.Height));
    // For each single pixel in the intersecting rectangle
    for (int y = y1; y < y2; ++y)
    {
        for (int x = x1; x < x2; ++x)
        {
            // Get the color from each texture
            Color colorA = bitsA[(x - (int)Math.Floor(a.Bounds.X)) + (y - (int)Math.Floor(a.Bounds.Y)) * a.Texture.Width];
            Color colorB = bitsB[(x - (int)Math.Floor(b.Bounds.X)) + (y - (int)Math.Floor(b.Bounds.Y)) * b.Texture.Width];
            if (colorA.A != 0 && colorB.A != 0) // If both colors are not transparent (the alpha channel is not 0), then there is a collision
            {
                return true;
            }
        }
    }
    // If no collision occurred by now, we're clear.
    return false;
}
(All the Math.Floor calls are useless here; I copied this function from my current code, where I'm trying to make it work with floats.)
It reads the color of the sprites in the rectangle portion that is common to both sprites.
This actually works fine, when I display the sprites at x/y coordinates where x and y are int's (.Bounds.X and .Bounds.Y):
View an example
The problem with displaying sprites at int's coordinates is that it results in a very jaggy movement in diagonals:
View an example
So ultimately I would like to not cast the sprite position to int's when drawing them, which results in a smooth(er) movement:
View an example
The issue is that the PerPixelCollision works with ints, not floats, so that's why I added all those Math.Floor. As is, it works in most cases, but it's missing one line and one row of checking on the bottom and right (I think) of the common Rectangle because of the rounding induced by Math.Floor:
View an example
When I think about it, I think it makes sense. If x1 is 80 and x2 would actually be 81.5 but is 81 because of the cast, then the loop will only work for x = 80, and therefore miss the last column (in the example gif, the fixed sprite has a transparent column on the left of the visible pixels).
The issue is that no matter how hard I think about this, or no matter what I try (I have tried a lot of things) - I cannot make this work properly. I am almost convinced that x2 and y2 should have Math.Ceiling instead of Math.Floor, so as to "include" the last pixel that otherwise is left out, but then it always gets me an index out of the bitsA or bitsB arrays.
Would anyone be able to adjust this function so that it works when Bounds.X and Bounds.Y are floats?
PS - could the issue possibly come from BoxingViewportAdapter? I am using this (from MonoGame.Extended) to "upscale" my game, which is actually 144p.
Remember, there is no such thing as a fractional pixel. For movement purposes, it completely makes sense to use floats for the values and cast them to integer pixels when drawn. The problem is not in the fractional values, but in the way that they are drawn.
The main reason the collisions are not appearing to work correctly is the scaling. The colors for the new pixels in between the diagonals get their colors by averaging* the surrounding pixels. The effect makes the image appear larger than the original, especially on the diagonals.
*there are several methods that may be used for the scaling, bi-cubic and linear are the most common.
The only direct (pixel-perfect) solution is to compare the actual output after scaling. This requires rendering the entire screen twice, and the number of computations grows with the scale factor (not recommended).
Since you are comparing the non-scaled images your collisions appear to be off.
The other issue is movement speed. If you are moving faster than one pixel per Update(), detecting per pixel collisions is not enough, if the movement is to be restricted by the obstacle. You must resolve the collision.
For enemies or environmental hazards your original code is sufficient and collision resolution is not required. It will give the player a minor advantage.
A simple resolution algorithm (see below for a mathematical solution) is to unwind the movement by half and check for collision. If it is still colliding, unwind the movement by a further quarter; otherwise advance it by a quarter and check for collision again. Repeat until the step is less than 1 pixel. This takes about log2(speed) iterations.
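The halving scheme described above is a bisection between the last free position and the first colliding one. Here is a minimal one-dimensional sketch in Python (not C#), assuming the start position is free and the end position collides; resolve_by_bisection, wall_x and the 16-pixel sprite width are made-up names and values for the illustration.

```python
def resolve_by_bisection(start, end, collides):
    """Find the furthest free position between `start` (free) and `end` (colliding)."""
    free, blocked = start, end
    # Halve the interval until it is no wider than one pixel.
    while abs(blocked - free) > 1:
        mid = (free + blocked) / 2
        if collides(mid):
            blocked = mid   # still colliding: unwind further
        else:
            free = mid      # free: advance again
    return free

wall_x = 100  # hypothetical wall position

def collides(x):
    # A 16-pixel-wide sprite collides once its right edge passes the wall.
    return x + 16 > wall_x

# The sprite moved 32 pixels in one update, from x=70 to x=102.
pos = resolve_by_bisection(70.0, 102.0, collides)  # 84.0
```

For a 32-pixel move this takes 5 collision checks, which matches the logarithmic cost mentioned above.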
As for the top wall not colliding perfectly: If the starting Y value is not a multiple of the vertical movement speed, you will not land perfectly on zero. I prefer to resolve this by setting the Y = 0, when Y is negative. It is the same for X, and also when X and Y > screen bounds - origin, for the bottom and right of the screen.
I prefer to use mathematical solutions for collision resolution. In your example images, you show a box colliding with a diamond; the diamond shape can be represented mathematically using the Manhattan distance (Math.Abs(x1-x2) + Math.Abs(y1-y2)). From this fact, it is easy to calculate the resolution to the collision directly.
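As a small illustration of the Manhattan-distance idea, a point is inside a diamond when the sum of its absolute offsets from the centre does not exceed the diamond's "radius" (half its diagonal). This Python sketch uses made-up names (inside_diamond, penetration, radius) and is not tied to any MonoGame type.

```python
def manhattan(px, py, cx, cy):
    """Manhattan distance between point (px, py) and centre (cx, cy)."""
    return abs(px - cx) + abs(py - cy)

def inside_diamond(px, py, cx, cy, radius):
    """True if the point lies inside or on a diamond centred at (cx, cy)."""
    return manhattan(px, py, cx, cy) <= radius

def penetration(px, py, cx, cy, radius):
    """How far (in Manhattan distance) the point has sunk into the diamond."""
    return max(0, radius - manhattan(px, py, cx, cy))

hit = inside_diamond(3, 2, 0, 0, 5)    # True: 3 + 2 <= 5
miss = inside_diamond(3, 3, 0, 0, 5)   # False: 3 + 3 > 5
depth = penetration(3, 1, 0, 0, 5)     # 1
```

The penetration value gives the amount by which the movement has to be unwound, without any iteration.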
On optimizations:
Be sure to check that the bounding Rectangles are overlapping before calling this method.
As you have stated, remove all the Math.Floor calls, since the cast is sufficient. Move all calculations that do not depend on the loop variable out of the loops.
The (int)a.Bounds.Y * a.Texture.Width and (int)b.Bounds.Y * b.Texture.Width terms do not depend on the x or y variables and should be calculated and stored before the loops. The subtractions 'y - [above variable]' should be computed once per iteration of the "y" loop.
I would recommend using a bitboard (1 bit per 8 by 8 square) for collisions. It reduces the broad (8x8) collision checks to O(1). For a resolution of 144x144, the entire search space becomes 18x18.
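A rough sketch of the bitboard idea, in Python for brevity: every 8x8 cell that contains at least one solid pixel sets one bit, and two objects can only collide if their bitboards share a bit. The helper build_bitboard and the sample pixel sets are made up for the illustration; per-pixel checks would still be needed inside the cells that overlap.

```python
CELL = 8  # one bit per 8x8 square

def build_bitboard(solid_pixels, width):
    """Pack the occupied 8x8 cells of a `width`-wide screen into one integer."""
    cols = (width + CELL - 1) // CELL  # 18 columns for a 144-pixel-wide screen
    board = 0
    for x, y in solid_pixels:
        board |= 1 << ((y // CELL) * cols + (x // CELL))
    return board

a = build_bitboard({(10, 10), (11, 10)}, 144)
b = build_bitboard({(15, 12)}, 144)    # same 8x8 cell as a
c = build_bitboard({(100, 100)}, 144)  # a different cell

may_collide_ab = (a & b) != 0  # True: do the per-pixel check
may_collide_ac = (a & c) != 0  # False: skip the per-pixel check
```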
You can wrap your sprite with a rectangle and use its function called Intersect, which detects collisions.
Intersect - XNA

Wrong result using function fillPoly in opencv for very large images

I am having a hard time solving an issue with mask creation. My image is large, 40959 px x 24575 px, and I'm trying to create a mask for it.
I noticed that I don't have a problem for images up to a certain size (I tested about 33000 px x 22000 px), but for larger dimensions I get an error inside my mask: it turns black in the middle of the polygon and the white region extends to the left edge. The result should have no black area inside the polygon and no white area extending to the left edge of the image.
So my code looks like this:
pixel_points_list = latLonToPixel(dataSet, lat_lon_pairs)
print pixel_points_list
# This is the list im getting
#[[213, 6259], [22301, 23608], [25363, 22223], [27477, 23608], [35058, 18433], [12168, 282], [213, 6259]]
image = cv2.imread(in_tmpImgFilePath,-1)
print image.shape
#Value of image.shape: (24575, 40959, 4)
mask = np.zeros(image.shape, dtype=np.uint8)
roi_corners = np.array([pixel_points_list], dtype=np.int32)
print roi_corners
#contents of roi_corners_array:
"""
[[[ 213 6259]
[22301 23608]
[25363 22223]
[27477 23608]
[35058 18433]
[12168 282]
[ 213 6259]]]
"""
channel_count = image.shape[2]
ignore_mask_color = (255,)*channel_count
cv2.fillPoly(mask, roi_corners, ignore_mask_color)
cv2.imwrite("mask.tif",mask)
And this is the mask I'm getting with those coordinates (minified mask):
You can see that in the middle of the mask the mask is mirrored. I took those points from pixel_points_list and drew them on a coordinate system and I get a valid polygon, but when using fillPoly I get wrong results.
Here is an even simpler example where I have only 4 (5) points:
roi_corners = array([[ 213 6259]
[22301 23608]
[35058 18433]
[12168 282]
[ 213 6259]])
And I get:
Does anyone have a clue why this happens?
Thanks!
The issue is in the function CollectPolyEdges, called by fillPoly (and drawContours, fillConvexPoly, etc...).
Internally, it's assumed that the point coordinates (of integer type int32) have meaningful values only in the 16 lowest bits. In practice, you can draw correctly only if your points have coordinates up to 32768 (which is exactly the maximum x coordinate you can draw in your image.)
This can't be considered as a bug, since your images are extremely large.
As a workaround, you can try to scale your mask and your points down by a given factor, fill the polygon on the smaller mask, and then rescale the mask back to the original size.
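The coordinate part of that workaround can be sketched in plain Python. This is only an illustration under the assumption that every coordinate must stay below 32768; downscale_factor and scale_points are made-up helper names, and the points are the ones from the simpler example in the question.

```python
import math

LIMIT = 32768  # coordinates must stay below this value, per the explanation above

def downscale_factor(width, height, limit=LIMIT):
    """Smallest integer factor that brings both dimensions under `limit`."""
    return max(1, math.ceil(max(width, height) / (limit - 1)))

def scale_points(points, factor):
    """Integer-divide every coordinate by `factor`."""
    return [[x // factor, y // factor] for x, y in points]

points = [[213, 6259], [22301, 23608], [35058, 18433], [12168, 282], [213, 6259]]
factor = downscale_factor(40959, 24575)                                # 2
small_size = (math.ceil(40959 / factor), math.ceil(24575 / factor))    # (20480, 12288)
small_points = scale_points(points, factor)
```

The small mask would then be filled with cv2.fillPoly and enlarged again with cv2.resize, for example with cv2.INTER_NEAREST so the mask stays binary. The polygon edges lose up to `factor` pixels of precision this way.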
As @DanMašek pointed out in the comments, this is in fact a bug that has not been fixed yet.
In the bug discussion, another workaround is mentioned. It consists of drawing using multiple ROIs with size less than 32768, correcting the coordinates for each ROI using the offset parameter in fillPoly.

colour detection bot

I want to create a script to automatically click on a moving target on a game.
To do this I want to check colours on the screen to determine where my target is and then click him, repeating this if he moves.
I'm not skilled at programming and there might be an easier solution to what I have proposed below:
1/Split the screen into equal tiles - size of tile should represent the in game object.
2/Loop through each pixel of each tile and create a histogram of the pixel colours.
3/If the most common recorded colour matches what we need, we MIGHT have the correct tile. Save the coords and click the object to complete the task.
4/Every 0.5 seconds check the colour to determine if the object has moved. If it hasn't, keep clicking; if it has, repeat steps 1, 2 and 3.
The step I am unsure how to do technically is step 1. What data structure would I need for a tile? Would a 2D array suffice? Store the value of each colour in this array and then determine if it is the object. Also, in pseudocode, how would I split the screen up into tiles to be searched? The tile issue is my main problem.
EDIT for rayryeng 2:
I will be using Python for this task. This is not my game, I just want to create a macro to automatically perform a task for me in the game. I have no code yet, I am looking more for the ideas behind making this work than actual code.
3rd edit and final code:
#!/usr/bin/python
import win32api
import win32con  # needed for the MOUSEEVENTF_* constants used in click()
import win32gui
# function to take in coords and return colour
def colour_return(x, y):
    colours = win32gui.GetPixel(win32gui.GetDC(win32gui.GetActiveWindow()), x, y)
    return colours
def click(x, y):
    win32api.SetCursorPos((x, y))
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN, x, y, 0, 0)
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP, x, y, 0, 0)
# variable declaration
x = 1
y = 1
pixel_value = []
colour_found = 0
while x < 1600:
    pixel_value = colour_return(x, y)
    if pixel_value == 1844766:
        click(x, y)
    x = x + 1
    #print x
    print(y)
    if x == 1600:
        # end of the row: move to the next row and start again from the left
        y = y + 1
        x = 1
        #print tile
        pixel_value = 0
This is the final code that I have produced. It works, but it is incredibly slow: it takes 30 seconds to search all 1600 pixels of y=1. I guess it is my method that is not working. Instead of using histograms and tiles, I am now just searching for a colour and clicking the coordinates when it matches. What is the fastest method to use when searching an entire screen for a certain colour? I've seen colour detection bots that manage to keep up every second with a moving character.
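One direction that might be worth trying is to grab the pixels once per pass and search that buffer in memory, rather than asking Windows for one pixel at a time. The sketch below only shows the search side and is not tested against the game: how the frame is captured is left open, the frame here is a made-up list of rows of (r, g, b) tuples, and colorref_to_rgb and find_colour are hypothetical helper names. It relies on GetPixel returning a COLORREF, which packs the colour as 0x00BBGGRR.

```python
def colorref_to_rgb(colorref):
    """Split a Win32 COLORREF (0x00BBGGRR) into an (r, g, b) tuple."""
    return (colorref & 0xFF, (colorref >> 8) & 0xFF, (colorref >> 16) & 0xFF)

def find_colour(rows, target_rgb):
    """Return (x, y) of the first matching pixel in an in-memory frame, or None."""
    for y, row in enumerate(rows):
        try:
            return row.index(target_rgb), y
        except ValueError:
            continue  # colour not in this row, try the next one
    return None

target = colorref_to_rgb(1844766)  # (30, 38, 28)

# Made-up 1600x3 frame standing in for a captured screen.
frame = [[(0, 0, 0)] * 1600 for _ in range(3)]
frame[2][700] = target
hit = find_colour(frame, target)  # (700, 2)
```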
