Custom video file: how would I extract the images? - directx

A lot of people would recommend going with Bink, DirectShow, or even ffmpeg to play a video. However, what are movies anyway? Just images put together with sound.
I've already created a program that takes a bunch of images and places them into a custom video file. The cool thing about this is that I can easily place it on a quad. The issue I'm having is that I can only extract one image from the custom video file. When the file holds more than one, I run into problems, which I fully understand.
The file holds an index lookup table of all the image sizes, followed by the raw images. The calculation I was following was:
offset = (NumberOfImages + 1) * sizeof(long).
With one image, working out the offset of the first image is quite easy. The for loop starts at 0 and runs up to the number of images, which is 1. So it translates like this:
offset = (1 + 1) * 4 = 8.
So now I know the offset for a single image, which is great. However, a video is a bunch of images together. So I've been thinking: what if I read up to a certain point and then stuff the data I read into a vector?
currentPosition = 0; //-- Set the current position to zero before looping through images in file.
for (UINT i = 0; i < elements; i++) {
    long tblSz = (elements + 1) * sizeof(long); // (elements + 1) * 4
    long off = tblSz + currentPosition; //-- Now let's calculate the offset position inside the file knowing the table size.
    // in.seekg(off, std::ios_base::end); //-- Not used.
    long videoSz = sicVideoIndexTable[i]; //-- Let's retrieve the image size from the index table that's stored inside the file before we process each image.
    // in.seekg(0, std::ios_base::beg); //-- Not used.
    dataBuf.resize(videoSz); //-- Let's resize the data buffer vector to fit the image size.
    in.seekg(off, std::ios_base::beg); //-- Let's go to the calculated offset position to retrieve the image data.
    std::streamsize currpos = in.gcount(); //-- Prototype not used.
    in.read(&dataBuf[0], videoSz); //-- Let's read in the data according to the image size.
    sVideoDesc.dataPtr = (void*)&dataBuf[0]; //-- Pass what we've read into the temporary structure before pushing it inside a vector to store the collection of images.
    sVideoDesc.fileSize = videoSz;
    sicVideoArray.push_back(sVideoDesc);
    dataBuf.empty(); //-- Now can empty the data vector so it can be reused.
    currentPosition = videoSz; //-- Set the current position to the video size so it can recalculate the offset for the next image.
}
I believe the problem lies within the seekg and in.read calls, but that's just my gut telling me that. As you can see, the current position always changes.
The bottom-line question is: if I can load one image, why can't I load multiple images from the custom video file? I'm not sure whether I should be using seekg or whether I should just read every character up to a certain point and then dump the content into a data buffer vector. I thought reading the block of data would be the answer, but I'm becoming very unsure.

I think I finally understand what your code does. You really should use more descriptive variable names. Or at least add an explanation of what each variable means. Anyway...
I believe your problem is in this line:
currentPosition = videoSz;
When it should be
currentPosition += videoSz;
You basically don't advance through your file.
Also, if you just read the images in sequentially, you might want to change your file format so that instead of a table of image sizes at the beginning, you store each image size directly followed by the image data. That way you don't need to do any of the offset calculations or seeking.
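To illustrate the fix, here is a minimal sketch of the same read loop with a running offset, written in Python for brevity. The file layout is an assumption inferred from the question (a 4-byte image count, then one 4-byte size per image, then the raw image bytes, all little-endian), and `read_frames` is a hypothetical name.

```python
import io
import struct

def read_frames(stream):
    """Read every image from a stream laid out as
    [count][size_0 .. size_n-1][image_0 .. image_n-1].
    Assumed layout: little-endian 4-byte unsigned integers."""
    (count,) = struct.unpack("<I", stream.read(4))
    sizes = struct.unpack("<%dI" % count, stream.read(4 * count))

    table_bytes = (count + 1) * 4   # mirrors (elements + 1) * sizeof(long) with a 4-byte long
    position = 0                    # bytes of image data consumed so far
    frames = []
    for size in sizes:
        stream.seek(table_bytes + position)
        frames.append(stream.read(size))   # each frame owns its own bytes
        position += size            # advance by the size; plain "=" would not accumulate
    return frames

# Usage with an in-memory file holding two tiny "images".
sample = struct.pack("<3I", 2, 3, 2) + b"abc" + b"de"
frames = read_frames(io.BytesIO(sample))
```

Each frame is stored in its own object here. In the question's C++ loop every `dataPtr` points into the same reused `dataBuf`, and `dataBuf.empty()` only tests for emptiness rather than clearing, so that may be a second source of trouble worth checking.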

Related

Buffer data in Simulink in continuous time

I need to buffer some signals for a fixed duration so they can be used within the simulation. The Buffer block in Simulink requires the frame rate to be known. However, I am using a continuous-time solver (with a defined maximum step size), so I don't know what buffer size to choose. There does not seem to be any option for a time-based trigger. Can someone suggest how this can be done?
A simple buffer, made using a MATLAB Function block, that always has the most recent element at the top would be:
function y = buffer(x)
% Keep the buffer between calls; a plain local variable would be reset on every call
persistent buf
if isempty(buf)
    % initialize the buffer
    buf = zeros(100,1);
end
% Shuffle the elements down
buf(2:end) = buf(1:end-1);
% add the new element
buf(1) = x;
y = buf;

colour detection bot

I want to create a script to automatically click on a moving target on a game.
To do this I want to check colours on the screen to determine where my target is and then click him, repeating this if he moves.
I'm not skilled at programming and there might be an easier solution to what I have proposed below:
1/Split the screen into equal tiles - size of tile should represent the in game object.
2/Loop through each pixel of each tile and create a histogram of the pixel colours.
3/If the most common recorded colour matches what we need, we MIGHT have the correct tile. Save the coords and click the object to complete task
4/Every 0.5 seconds check the colour to determine if the object has moved. If it hasn't, keep clicking; if it has, repeat steps 1, 2 and 3.
The step I am unsure how to do technically is step 1. What data structure would I need for a tile? Would a 2D array suffice, storing the value of each colour in the array and then determining whether it is the object? Also, in pseudocode, how would I split the screen up into tiles to be searched? The tile issue is my main problem.
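One possible way to look at the tile question: a tile needs no special data structure, only its corner coordinates, while the pixels stay in the screenshot. A minimal sketch of steps 1 to 3, assuming the screenshot is already available as a list of rows of colour values (how it is captured is left out, and `tiles`, `dominant_colour` and `find_target_tile` are hypothetical names):

```python
from collections import Counter

def tiles(width, height, tile_w, tile_h):
    """Yield (left, top, right, bottom) for each tile; edge tiles are clipped."""
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            yield left, top, min(left + tile_w, width), min(top + tile_h, height)

def dominant_colour(pixels, box):
    """Return the most common colour inside box; pixels[y][x] is a colour value."""
    left, top, right, bottom = box
    histogram = Counter(
        pixels[y][x] for y in range(top, bottom) for x in range(left, right)
    )
    return histogram.most_common(1)[0][0]

def find_target_tile(pixels, tile_w, tile_h, target):
    """Return the centre (x, y) of the first tile dominated by target, or None."""
    height, width = len(pixels), len(pixels[0])
    for box in tiles(width, height, tile_w, tile_h):
        if dominant_colour(pixels, box) == target:
            left, top, right, bottom = box
            return (left + right) // 2, (top + bottom) // 2
    return None
```

The returned centre is where a click would go. An object that straddles a tile boundary can be missed with this scheme, so overlapping tiles may be needed in practice.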
EDIT for rayryeng 2:
I will be using Python for this task. This is not my game, I just want to create a macro to automatically perform a task for me in the game. I have no code yet, I am looking more for the ideas behind making this work than actual code.
3rd edit and final code:
#!/usr/bin/python
import win32gui, win32api, win32con
#function to take in coords and return colour
def colour_return(x,y):
    colours = win32gui.GetPixel(win32gui.GetDC(win32gui.GetActiveWindow()), x,y)
    return colours
def click(x,y):
    win32api.SetCursorPos((x,y))
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN,x,y,0,0)
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP,x,y,0,0)
#variable declaration
x = 1
y = 1
pixel_value = []
colour_found = 0
while x < 1600:
    pixel_value = colour_return(x,y)
    if pixel_value == 1844766:
        click(x,y)
    x=x+1
    #print x
    print(y)
    if x == 1600:
        y=y+1
        x=1
        #print tile
    pixel_value = 0
This is the final code that I have produced. It works, but it is incredibly slow: it takes 30 seconds to search all 1600 pixels of y=1. I guess my method is the problem. Instead of using histograms and tiles, I am now just searching for a colour and clicking the coordinates when it matches. What is the fastest method for searching an entire screen for a certain colour? I've seen colour detection bots that manage to keep up every second with a moving character.

OpenCV: Generating points from image after thinning

I've run into an issue with generating floating-point coordinates from an image.
The original problem is as follows:
The input image is handwritten text. From this I want to generate a set of points (just x,y coordinates) that make up the individual characters.
At first I used findContours to generate the points. Since this finds the edges of the characters, the image first needs to be run through a thinning algorithm, because I'm not interested in the shape of the characters, only the lines or, as in this case, points.
Input:
thinning:
So, I run my input through the thinning algorithm and all is fine; the output looks good. Running findContours on this, however, does not work out so well: it skips a lot of stuff and I end up with something unusable.
The second idea was to generate bounding boxes (with findContours), use these bounding boxes to grab the characters from the thinned image, and take all non-white pixel indices as "points", offset by the bounding box position. This generates even worse output and seems like a bad method.
Horrible code for this:
Mat temp = new Mat(edges, bb);
byte roi_buff[] = new byte[(int) (temp.total() * temp.channels())];
temp.get(0, 0, roi_buff);
int COLS = temp.cols();
List<Point> preArrayList = new ArrayList<Point>();
for(int i = 0; i < roi_buff.length; i++)
{
    if(roi_buff[i] != 0)
    {
        Point tempP = bb.tl();
        tempP.x += i%COLS;
        tempP.y += i/COLS;
        preArrayList.add(tempP);
    }
}
Are there any alternatives, or am I overlooking something?
UPDATE:
I overlooked the fact that I need the points (pixels) to be ordered. In the method above I simply use a scanline approach to grab all the pixels. If you look at the 'o', for example, it would first grab the point on the left-hand side, then the one on the right-hand side. I need them to be ordered by their neighbouring pixels, since I want to draw paths with the points later on (outside of OpenCV).
Is this possible?
You should look into implementing your own connected components labelling. The concept is very simple: you scan the first line and assign unique labels to each horizontally connected strip of pixels. You basically check for every pixel if it is connected to its left neighbour and assign it either that neighbour's label or a new label. In the second row you do the same, but you also check against the pixels above it. Sometimes you need a label merge: two strips that were not connected in the previous row are joined in the current row. The way to deal with this is either to keep a list of label equivalences or use pointers to labels (so you can easily do a complete label change for an object).
This is basically what findContours does, but if you implement it yourself you have the freedom to go for 8-connectedness and even bridge a single-pixel or two-pixel gap. That way you get "almost-connected components labelling". It looks like you need this for the "w" in your example picture.
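The labelling pass described above could look roughly like this. It is a minimal sketch, assuming the thinned image is a list of rows where any truthy value is a foreground pixel; `label_components` is a hypothetical name, label equivalences are kept in a small union-find list, and the gap-bridging variant is not included.

```python
def label_components(image):
    """Two-pass labelling with 8-connectivity.
    image[y][x] is truthy for foreground; returns a grid of labels (0 = background)."""
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    parent = [0]                      # parent[i] == i marks a root label

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            # Already-visited neighbours: left, and the three pixels above.
            neighbours = []
            for dx, dy in ((-1, 0), (-1, -1), (0, -1), (1, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and ny >= 0 and labels[ny][nx]:
                    neighbours.append(find(labels[ny][nx]))
            if neighbours:
                root = min(neighbours)
                for other in neighbours:
                    parent[other] = root      # record the label merge
                labels[y][x] = root
            else:
                parent.append(len(parent))    # brand-new label
                labels[y][x] = len(parent) - 1

    # Second pass: replace every provisional label by its root.
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                labels[y][x] = find(labels[y][x])
    return labels
```

Bridging one- or two-pixel gaps would mean widening the neighbour offsets checked in the first pass, at the cost of occasionally joining strokes that should stay separate.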
Once you have the image labelled this way, you can push all the pixels of a single label to a vector, and order them something like this. Find the top left pixel, push it to a new vector and erase it from the original vector. Now find the pixel in the original vector closest to it, push it to the new vector and erase from the original. Continue until all pixels have been transferred.
It will not be very fast this way, but it should be a start.
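The ordering step could be sketched as below, assuming the pixels of one label are given as (x, y) tuples. `order_pixels` is a hypothetical name, and "top left" is interpreted here as smallest y first, then smallest x.

```python
def order_pixels(points):
    """Greedy nearest-neighbour ordering of (x, y) pixels; quadratic in the point count."""
    remaining = list(points)
    if not remaining:
        return []
    # Start from the top-left pixel: smallest y, then smallest x.
    current = min(remaining, key=lambda p: (p[1], p[0]))
    remaining.remove(current)
    ordered = [current]
    while remaining:
        # Pick the unvisited pixel closest to the one just added.
        current = min(
            remaining,
            key=lambda p: (p[0] - current[0]) ** 2 + (p[1] - current[1]) ** 2,
        )
        remaining.remove(current)
        ordered.append(current)
    return ordered
```

Because the walk is greedy, it can jump across a stroke at junctions or when it starts in the middle of a line, so the result may need a sanity check before drawing paths from it.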

locating a change between two images

I have two images that are similar, but one has an additional change on it. I need to locate the change between the two images. Both images have white backgrounds, and the change is a line being drawn. I don't need anything as complex as OpenCV; I'm looking for a "simple" solution in C or C++.
If you just want to show the differences, you can use the code below.
FastBitmap original = new FastBitmap(bitmap);
FastBitmap overlay = new FastBitmap(processedBitmap);
//Subtract the original with overlay and just see the differences.
Subtract sub = new Subtract(overlay);
sub.applyInPlace(original);
// Show the results
JOptionPane.showMessageDialog(null, original.toIcon());
To compare two images, you can use the ObjectiveFidelity class in the Catalano Framework.
The Catalano Framework is written in Java, so you can port this class to another LGPL project.
https://code.google.com/p/catalano-framework/
FastBitmap original = new FastBitmap(bitmap);
FastBitmap reconstructed = new FastBitmap(processedBitmap);
ObjectiveFidelity of = new ObjectiveFidelity(original, reconstructed);
int error = of.getTotalError();
double errorRMS = of.getErrorRMS();
double snr = of.getSignalToNoiseRatioRMS();
//Show the results
Disclaimer: I am the author of this framework, but I thought this would help.
Your description leaves me with a few unanswered questions. It would be good to see some example before/after images.
However, on the face of it, assuming you just want to find the parameters of the added line, it may be enough to convert the frames to grey-scale, subtract one from the other, threshold the result to black and white, and then perform line segment detection.
If the resulting image only contains one straight line segment, then it might be enough to find the bounding box around the remaining pixels, with a simple check to determine which of the two possible line segments you have.
However it would probably be simpler to use one of the Hough Transform methods provided by OpenCV.
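For the library-free route (subtract, threshold, bounding box), a minimal sketch might look like this. It assumes both images are grey-scale, the same size, and stored as lists of rows of integers; `changed_bounding_box` and `line_endpoints` are hypothetical names, and the corner test assumes a thin line that actually touches the corners of its bounding box.

```python
def changed_bounding_box(before, after, threshold=0):
    """Return (left, top, right, bottom), inclusive, around differing pixels, or None."""
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(before, after)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if abs(a - b) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)

def line_endpoints(before, after, threshold=0):
    """Guess the two endpoints of a single straight line added to the image."""
    box = changed_bounding_box(before, after, threshold)
    if box is None:
        return None
    left, top, right, bottom = box
    # A line through the top-left corner runs to the bottom-right corner;
    # otherwise it must be the other diagonal of the box.
    if abs(before[top][left] - after[top][left]) > threshold:
        return (left, top), (right, bottom)
    return (left, bottom), (right, top)
```

The same loops translate directly to C over two pixel arrays. A non-zero threshold helps when the images carry compression noise.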
You can use memcmp() (an ANSI C function that compares two memory blocks, much like strcmp()). Just call it on the arrays of pixels and it tells you whether they are identical or not.
With a small tweak you can get a pointer to the memory where the first change occurs, that is, to the first differing pixel. You can then walk along its neighbours to find all the non-white pixels (representing your line).
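#include <cstring>
// Returns true when the two pixel arrays differ anywhere.
bool AreImagesDifferent(const char* Im1, const char* Im2, const int size){
    return std::memcmp(Im1, Im2, size) != 0;
}
// Returns a pointer into Im1 at the first differing byte, or nullptr if the arrays are identical.
const char* getFirstDifferentPixel(const char* Im1, const char* Im2, const int size){
    const char* Im1end = Im1 + size;
    for (; Im1 < Im1end; Im1++, Im2++){
        if ((*Im1) != (*Im2))
            return Im1;
    }
    return nullptr;
}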

Repeating 2d world

How can I make a 2D world with a fixed size that repeats itself when you reach any side of the map?
When you reach a side of the map, you see the opposite side of the map merged together with this one. The idea is that if you didn't have a minimap, you would not even notice the transition where the map repeats.
I have a few ideas for how to do it:
1) Keep a 3x3 grid of identical worlds at all times, all updated the same way, with the player existing in only one of them.
2) Separate the map into smaller pieces and add them at the required place when needed.
Either way it could be complicated to complete. I remember that more than 10 years ago I played a game like that, with soldiers following each other in a repeating world and shooting other AI soldiers.
Mostly I wanted to hear your thoughts about the idea and how it could be achieved. I'm coding in XNA (C#).
Another alternative is to generate noise using the libnoise library. The beauty of this is that you can generate noise over a theoretically infinite amount of space.
Take a look at the following:
http://libnoise.sourceforge.net/tutorials/tutorial3.html#tile
There is also an XNA port of the above at: http://bigblackblock.com/tools/libnoisexna
If you end up using the XNA port, you can do something like this:
Perlin perlin = new Perlin();
perlin.Frequency = 0.5f; //height
perlin.Lacunarity = 2f; //frequency increase between octaves
perlin.OctaveCount = 5; //Number of passes
perlin.Persistence = 0.45f; //
perlin.Quality = QualityMode.High;
perlin.Seed = 8;
//Create our 2d map
Noise2D _map = new Noise2D(CHUNKSIZE_WIDTH, CHUNKSIZE_HEIGHT, perlin);
//Get a section
_map.GeneratePlanar(left, right, top, down);
GeneratePlanar is the function to call to get the sections in each direction that will connect seamlessly with the rest of your world.
If the game is tile based, I think what you should do is:
Keep only one array for the game area.
Determine the visible area using modulo arithmetic over the size of the game area, that is, mod w and mod h, where w and h are the width and height of the table.
E.g. if the table is 80x100 (top-left coordinates (0,0), width 80, height 100) and the viewport rect is at (70,90) with a width of 40 and a height of 20, you index with [70-79] and [0-29] for the x coordinate and [90-99] and [0-9] for the y coordinate. This can be achieved by calculating the index with the following formula:
idx = (n + i) % 80 (or % 100), where n is the top-left coordinate (x or y) of the rect and i runs over the width/height of the viewport.
This assumes that one step of movement moves the camera with non fractional coordinates.
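The index formula can be checked with a small sketch, here in Python; `wrapped_indices` and `visible_tiles` are hypothetical names, and the world is assumed to be a list of rows indexed as world[y][x].

```python
def wrapped_indices(start, length, size):
    """Indices covered along one axis by a viewport starting at start, wrapping at size."""
    return [(start + i) % size for i in range(length)]

def visible_tiles(world, view_x, view_y, view_w, view_h):
    """Return the viewport as rows of tiles taken from a single wrapped world[y][x]."""
    world_h, world_w = len(world), len(world[0])
    cols = wrapped_indices(view_x, view_w, world_w)
    rows = wrapped_indices(view_y, view_h, world_h)
    return [[world[r][c] for c in cols] for r in rows]

# The example from the text: an 80x100 table and a 40x20 viewport at (70, 90).
x_indices = wrapped_indices(70, 40, 80)    # 70..79 followed by 0..29
y_indices = wrapped_indices(90, 20, 100)   # 90..99 followed by 0..9
```

In Python the `%` operator already returns a non-negative result for a negative start. In C# the remainder keeps the sign of the left operand, so a camera that moves left of 0 needs something like ((n % size) + size) % size.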
So this is your second alternative in a little more detail. If you only want to repeat the terrain, you should keep the contents of each tile separate from the terrain itself. In this case the contents will most likely be generated on the fly since you don't store them.
Hope this helped.

Resources