Image Recognition - Finding glowing object - image-processing

Most game anti-cheat systems use heuristic approaches, such as detecting known binary signatures or preventing third-party library injection. Valve, however, uses deep learning to combat cheating: it feeds its model with view angles, fire rates, and so on, and it works quite well.
My question is: how do I build something like that, but with images instead of numeric data?
Consider this example.
Not cheating:
Cheating:
Is it possible to make a model like that?

Well, images are also just data.
You can separate each image into its pixels and represent them as raw numbers, e.g. RGB values.
This way you could build a network whose inputs are the converted pixel values of your image.
In this example, the model would probably just learn to recognize large spikes of those vibrant colors, since their values differ a lot from the usual environment.
If the goal is some kind of "visual cheat detection" rather than deep learning for its own sake, you could also simply check the image's pixels manually: if you know the color of your "cheat" overlay, look for it directly, or detect differences from a clean frame, and flag images that way.
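For the manual-check route, here is a minimal sketch of what that pixel scan could look like, assuming OpenCV and NumPy; the HSV bounds and the file name are placeholder values you would tune for the actual overlay color:

```python
import cv2
import numpy as np

def looks_like_overlay(frame_bgr, min_fraction=0.002):
    """Flag a frame if an unusually vibrant color occupies enough pixels.

    The HSV bounds below are placeholders for a bright magenta/red
    ESP-style overlay; tune them for the overlay you expect.
    """
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    lower = np.array([140, 150, 150])   # hypothetical lower HSV bound
    upper = np.array([179, 255, 255])   # hypothetical upper HSV bound
    mask = cv2.inRange(hsv, lower, upper)
    fraction = cv2.countNonZero(mask) / mask.size
    return fraction > min_fraction

frame = cv2.imread("screenshot.png")
print("suspicious" if looks_like_overlay(frame) else "clean")
```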

Related

Recognising a drawn line using neural networks in a web app

Basically, I was weighing up some options for a software idea I had. The web app thing is a bit of a constraint on the project, so I'm assuming I would be writing this in js.
I need to create a drawable area for the user, which is okay, allow them to draw and then compare the input to a correct example. This is just an arrow, but the arrow can be double headed (normal point arrow) or single headed (half an arrowhead), so the minute details are fairly important, as is the location.
Now, I've read around for a few hours or so, and it seems to be that a good approach is to downsample the input so I am just comparing a couple of pixels. I am wondering though if there is a simpler way to achieve what I want here, and if there are good resources for learning what I feel is a very basic implementation of image recognition. Also having never implemented something like this, I'm a little worried about the little details of something like this, like speed; obviously feedback has to be fairly quick.
Thanks.
Use OpenCV. It already covers the kinds of use cases you want (location, style, etc. of the image). There are many other open-source libraries, but not many are as robust as this one.
After that, decide on all the possible images you want to treat as standard images, then gather training examples for each of them (each standard image becomes one class).
Now use the pixels as features (OpenCV will do it for you with minimal help) and train your classifier. Note that you have to provide these training images yourself, with at least a good number of them per class. Then use the trained classifier to classify the images drawn by your users. You can put a GUI on top of it to adapt it to the needs you posted above.
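As a rough illustration of the downsample-then-classify idea from the question, here is a minimal sketch using OpenCV's k-nearest-neighbour classifier on flattened, downsampled drawings; the image size, file names, and two-class labelling are assumptions for the example:

```python
import cv2
import numpy as np

SIZE = 16  # downsample each drawing to 16x16 before classifying (assumed size)

def to_feature(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    small = cv2.resize(img, (SIZE, SIZE), interpolation=cv2.INTER_AREA)
    return small.astype(np.float32).reshape(1, -1)

# Hypothetical training set: 0 = double-headed arrow, 1 = single-headed arrow
train_files = [("double_01.png", 0), ("double_02.png", 0),
               ("single_01.png", 1), ("single_02.png", 1)]
samples = np.vstack([to_feature(f) for f, _ in train_files])
labels = np.array([lbl for _, lbl in train_files], dtype=np.float32)

knn = cv2.ml.KNearest_create()
knn.train(samples, cv2.ml.ROW_SAMPLE, labels)

_, result, _, _ = knn.findNearest(to_feature("user_drawing.png"), k=3)
print("predicted class:", int(result[0][0]))
```

A classifier this simple runs in well under a second, so giving the user quick feedback (e.g. by calling it server-side from the web app) should not be a problem.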

How to rearrange images by pixel groups

I would like to create an image transition program. It should shift pixel areas from one image and transition them to another based on certain criteria, like colour and shape.
To do this, I need to be able to analyse the image, split it into groups, and shift these groups.
The first problem already starts with determining the pixel groups. They should not be chosen at random or perfect polygons/shapes. Does anyone know of an algorithm that can differentiate different textures/surroundings/borders?
Next, I need to do the slight adjustments to the areas in order to make them fit to the new image. Then the areas will be moved. That'll not be as hard as the first problem.
Performance doesn't matter that much; first I have to get the program working. It can take an hour to load the transition beforehand or whatever ;)
Could anyone give me some advice where to start or what technologies/APIs I could use? I'm fine with most programming languages, preferably C#, VB, JavaScript, PHP, Java, etc. The platform doesn't matter either.
I know, this is complex, but I gave my best to try to explain it. Any ideas?
Your first task, grouping pixels according to color/texture/etc., is called segmentation. There are many approaches and algorithms for it, and none is absolutely better than all the others; as with many things in image processing, the best algorithm depends on your images and your specific functional/artistic goal.
The general idea is to define multiple distances between pixels, like one distance would be based only on the position of pixels, another on the difference in their color, a more advanced metric could take the neighborhood into account to do something related to shape, contour orientations or texture. Then you would combine these distances (for example in a weighted sum) to get a "clever" measure of how similar two pixels are. After that you compute more or less exhaustively all distances and group similar pixels according to some thresholds (like how big the final groups are).
If you don't want to research and implement all that, you'd be better off using an existing image processing library. I suggest looking at OpenCV and the "segmentation" keyword. You'll get implementations of k-means, watershed and meanshift algorithms which are probably of interest for achieving your effect.
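For example, here is a minimal k-means color segmentation sketch with OpenCV; the cluster count and input file are assumptions you would adjust:

```python
import cv2
import numpy as np

img = cv2.imread("input.jpg")
pixels = img.reshape(-1, 3).astype(np.float32)

# Cluster pixels by color; K is an assumed value you would tune.
K = 6
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
_, labels, centers = cv2.kmeans(pixels, K, None, criteria, 5,
                                cv2.KMEANS_RANDOM_CENTERS)

# Paint each pixel with its cluster center to visualize the groups.
segmented = centers[labels.flatten()].astype(np.uint8).reshape(img.shape)
cv2.imwrite("segmented.jpg", segmented)

# The per-pixel label map (reshaped to the image size) gives you the
# pixel groups to shift in the transition step.
label_map = labels.reshape(img.shape[:2])
```

Note that this groups by color only; to take position or texture into account, you would append extra components (e.g. x/y coordinates or local statistics) to each pixel's feature vector before clustering.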
OpenCV is C++, but it also has bindings for Java and Python, I think, and probably other languages.
For your second task, you need a mix of moving and blending pixels, but that's simpler and you can do it "by hand", or look at morphing algorithms.
A quick search revealed this blog post with source code using OpenCV to morph two images. You also have some ready-made libraries in a few languages; have a look at related questions.
You could even directly call a command-line utility: xmorph, which doesn't seem very portable, or ImageMagick (see this script), which is more modern but doesn't implement a real morphing algorithm AFAIK.
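If you want to start "by hand" before reaching for a morphing library, a plain cross-dissolve between two aligned images is just a weighted blend; this sketch assumes two input files of the same size:

```python
import cv2

a = cv2.imread("from.jpg")
b = cv2.imread("to.jpg")  # assumed to have the same dimensions as from.jpg

# Write a few in-between frames: alpha goes from 0 (all `a`) to 1 (all `b`).
for i in range(11):
    alpha = i / 10.0
    frame = cv2.addWeighted(a, 1.0 - alpha, b, alpha, 0)
    cv2.imwrite(f"transition_{i:02d}.jpg", frame)
```

A real morph would additionally warp corresponding regions toward each other before blending, which is where the segmented pixel groups from the first step come in.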

Sparse Image matching in iOS

I am building an iOS app that, as a key feature, incorporates image matching. The problem is that the images I need to recognize are small 10x10 orienteering plaques with simple large text on them. They can be quite reflective and will be outside (so the light conditions will be variable). Sample image
There will be up to 15 of these types of image in the pool and really all I need to detect is the text, in order to log where the user has been.
The problem I am facing is that with the image matching software I have tried, aurasma and slightly more successfully arlabs, they can't distinguish between them as they are primarily built to work with detailed images.
I need to accurately detect which plaque is being scanned and have considered using gps to refine the selection but the only reliable way I have found is to get the user to manually enter the text. One of the key attractions we have based the product around is being able to detect these images that are already in place and not have to set up any additional material.
Can anyone suggest a piece of software that would work (and is iOS-friendly), or a method of detection that would be effective and interactive/pleasing for the user?
Sample environment:
http://www.orienteeringcoach.com/wp-content/uploads/2012/08/startfinishscp.jpeg
The environment can change substantially; basically the plaques can be anywhere: fences, walls, and posts, in either wooded or open areas, but overwhelmingly outdoors.
I'm not an iOS programmer, but I will try to answer from an algorithmic point of view. Essentially, you have a detection problem ("Where is the plaque?") and a classification problem ("Which one is it?"). Asking the user to keep the plaque in a pre-defined region is certainly a good idea. This solves the detection problem, which is often harder to solve with limited resources than the classification problem.
For classification, I see two alternatives:
The classic "Computer Vision" route would be feature extraction and classification. Local Binary Patterns and HOG are feature extractors known to be fast enough for mobile (the former more than the latter), and they are not too complicated to implement. Classifiers, however, are non-trivial, and you would probably have to search for an appropriate iOs library.
Alternatively, you could try to binarize the image, i.e. classify pixels as "plate" / white or "text" / black. Then you can use an error-tolerant similarity measure for comparing your binarized image with a binarized reference image of the plaque. The chamfer distance measure is a good candidate. It essentially boils down to comparing the distance transforms of your two binarized images. This is more tolerant to misalignment than comparing binary images directly. The distance transforms of the reference images can be pre-computed and stored on the device.
Personally, I would try the second approach. A (non-mobile) prototype of the second approach is relatively easy to code and evaluate with a good image processing library (OpenCV, Matlab + Image Processing Toolbox, Python, etc).
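As a rough desktop prototype of the second approach, here is a sketch of a chamfer-style comparison built on OpenCV's distance transform; the file names and the use of Otsu thresholding for binarization are assumptions:

```python
import cv2
import numpy as np

def binarize(path):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Otsu thresholding to split "plate" (white, 255) from "text" (black, 0).
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary

def chamfer_score(query_bin, reference_bin):
    """Average distance from query text pixels to the nearest reference text pixel.

    Lower is better. The reference distance transform could be precomputed
    and stored on the device.
    """
    # distanceTransform measures the distance to the nearest zero pixel,
    # so the black text pixels of the reference are the "targets".
    ref_dist = cv2.distanceTransform(reference_bin, cv2.DIST_L2, 3)
    q = cv2.resize(query_bin, (reference_bin.shape[1], reference_bin.shape[0]),
                   interpolation=cv2.INTER_NEAREST)
    text_pixels = q == 0
    return float(ref_dist[text_pixels].mean()) if text_pixels.any() else float("inf")

query = binarize("photo_of_plaque.png")
references = {name: binarize(name) for name in ["plaque_01.png", "plaque_02.png"]}
best = min(references, key=lambda name: chamfer_score(query, references[name]))
print("best match:", best)
```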
I managed to find a solution that is working quite well. It's not fully optimized yet, but I think that's just a matter of tweaking filters, as I'll explain later on.
Initially I tried to set up OpenCV, but it was very time-consuming with a steep learning curve; it did give me an idea, though. The key to my problem is really detecting the characters within the image and ignoring the background, which is basically just noise. OCR was designed exactly for this purpose.
I found the free library Tesseract (https://github.com/ldiqual/tesseract-ios-lib) easy to use and with plenty of customizability. At first the results were very random, but applying a sharpening filter, a monochromatic filter, and a color invert worked well to clean up the text. Next, I marked out a target area on the UI and used it to cut out the rectangle of the image to process. Processing is slow on large images, and this cut the time dramatically. The OCR filter allowed me to restrict the allowable characters, and since the plaques follow a standard configuration this narrowed down the results and improved accuracy.
So far it has been successful with the grey-background plaques, but I haven't found the correct filters for the red and white editions. My goal is to add color detection and remove the need to feed in the data type.
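The same preprocess-then-OCR pipeline is easy to prototype on the desktop; this sketch uses OpenCV plus the pytesseract wrapper as a stand-in for the iOS Tesseract library above, and the character whitelist is an assumed example:

```python
import cv2
import pytesseract  # desktop stand-in for the iOS Tesseract wrapper

img = cv2.imread("plaque_crop.png", cv2.IMREAD_GRAYSCALE)

# Sharpen (unsharp mask), binarize, and invert if needed so the text ends up
# dark on a light background, mirroring the filters described above.
blur = cv2.GaussianBlur(img, (0, 0), 3)
sharp = cv2.addWeighted(img, 1.5, blur, -0.5, 0)
_, mono = cv2.threshold(sharp, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cleaned = cv2.bitwise_not(mono)

# Restrict Tesseract to the characters the plaques actually use (assumed set).
config = "--psm 7 -c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
text = pytesseract.image_to_string(cleaned, config=config)
print(text.strip())
```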

How can I compare images of the same origin that were cropped?

Suppose I have an image file/URL, and I want my software to search it within a set of up to 100 images (or at least in that order of magnitude). The target image that the software should find should be the "same" image as the given image, but it should still be able to "forgive" slight processing on either of them (the two images may have been cropped differently, or they were compressed differently).
The question is: is this a feasible task, given that I won't have any of the images before the search takes place (i.e., there won't be any indexing prior to the search)? Is it likely to work in sub-second time (remember that the compare set is quite small)? And if it is feasible, which tools can I use for this task? These could be software components or even an online service (I can live with that for a proof of concept). Can OpenSURF help me here?
To focus my question further - I'm not asking which algorithms to use, at this point I would rather use an existing tool/API/service.
The target image that the software should find should be the "same" image as the given image, but it should still be able to "forgive" slight processing on either of them.
If "slight processing" doesn't involve rotation, but only "cropping", then simple cross-correlation should work, if there could be perspective correction, rotation, lens distortion correction, then things are more complicated.
I think this method is quite forgiving to slight color corrections. Anyway, you can always convert both images to grayscale and compare grayscale versions if you want.
To focus my question further - I'm not asking which algorithms to use, at this point I would rather use an existing tool/API/service.
You can start from cvMatchTemplate from OpenCV library (the link points to the C version of the API, but it's available also for C++ and Python). Use the cropped image as a template, and look for it in all your images.
If the images you compare have dark features on light backgrounds, you may benefit from using CV_TM_CCOEFF or CV_TM_CCOEFF_NORMED methods. They both subtract the average over the template area from both images. Normalized methods (CV_TM_*_NORMED) generally work better but are slower than their non-normalized counterparts.
You may consider doing some preprocessing of the images before the cross-correlation. If you normalize them first, the cross-correlation will be less sensitive to slight brightness/contrast modifications. If you detect edges first, as suggested by #misha, you'll lose color/lightness information, but the results for contour overlapping will be much better.
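A minimal sketch of that cvMatchTemplate approach using OpenCV's Python API (file names are placeholders for your query crop and your ~100 candidate images):

```python
import cv2

template = cv2.imread("cropped_query.png", cv2.IMREAD_GRAYSCALE)
best_name, best_score = None, -1.0

for name in ["candidate_01.png", "candidate_02.png"]:  # your candidate set
    image = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    # Normalized correlation coefficient: subtracts the mean and rescales,
    # which makes it less sensitive to brightness/contrast differences.
    result = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val > best_score:
        best_name, best_score = name, max_val

print(f"best match: {best_name} (score {best_score:.3f})")
```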
jetxee set you off on the right track. However, if you simply use template matching, you can run into problems where the background interferes with your template matching result. For example, if your template is a building and your background is primarily light (e.g. desert sand), then the template matching will fail because the lighter background will always return a higher cross-correlation than the darker template. Here is an example of this problem.
The way you solve it is the same as what is in the link:
1. Perform edge detection on both your template and the target image.
2. Throw the original template and image away.
3. Perform template matching using the edge-detected template and the edge-detected target image.
As far as forgiving slight processing, the edge detection step will take care of that. As long as the edges in the two images are not modified significantly (blurred, optically distorted), the approach will work.
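A sketch of those three steps, assuming OpenCV's Canny detector for the edge maps (the thresholds are placeholder values):

```python
import cv2

# 1. Edge-detect both the template and the target image.
template_edges = cv2.Canny(cv2.imread("cropped_query.png", cv2.IMREAD_GRAYSCALE), 50, 150)
image_edges = cv2.Canny(cv2.imread("candidate.png", cv2.IMREAD_GRAYSCALE), 50, 150)

# 2. The originals are no longer needed; only the edge maps are compared.
# 3. Template matching on the edge-detected images.
result = cv2.matchTemplate(image_edges, template_edges, cv2.TM_CCOEFF_NORMED)
_, score, _, location = cv2.minMaxLoc(result)
print(f"match score {score:.3f} at {location}")
```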
I know you are not looking specifically for algorithms, but nonetheless, let me suggest the following which can accomplish exactly what you are trying to do, very efficiently...
For cropped versions of the same image, including rotation, the Fourier-Mellin transform or a log-polar transform (watch out for the artsy semi-nude drawing - good source however) will give you the translation, rotation, and scale coefficients between the two images, allowing you to determine what operations were needed to go from one to the other.
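As a very rough sketch of the log-polar idea with OpenCV: the magnitude of the Fourier spectrum is translation-invariant, and in log-polar coordinates rotation and scale become plain shifts that phase correlation can recover. This omits the windowing and refinement a full Fourier-Mellin implementation would use, and assumes two equally sized grayscale inputs:

```python
import cv2
import numpy as np

def log_polar_spectrum(gray):
    # Translation-invariant magnitude spectrum, then log-polar remap so that
    # rotation maps to a vertical shift and scale to a horizontal shift.
    f = np.fft.fftshift(np.fft.fft2(gray.astype(np.float32)))
    magnitude = np.log1p(np.abs(f)).astype(np.float32)
    center = (magnitude.shape[1] / 2, magnitude.shape[0] / 2)
    max_radius = min(center)
    spectrum = cv2.warpPolar(magnitude, (magnitude.shape[1], magnitude.shape[0]),
                             center, max_radius, cv2.WARP_POLAR_LOG)
    return spectrum, max_radius

a = cv2.imread("image_a.png", cv2.IMREAD_GRAYSCALE)
b = cv2.imread("image_b.png", cv2.IMREAD_GRAYSCALE)  # assumed same size as a

spec_a, max_r = log_polar_spectrum(a)
spec_b, _ = log_polar_spectrum(b)
(shift_x, shift_y), _ = cv2.phaseCorrelate(np.float32(spec_a), np.float32(spec_b))

rotation_deg = 360.0 * shift_y / spec_a.shape[0]           # vertical shift -> rotation
scale = np.exp(shift_x * np.log(max_r) / spec_a.shape[1])  # horizontal shift -> scale
print(f"estimated rotation ~{rotation_deg:.1f} deg, scale ~{scale:.2f}")
```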

Creating a 3D effect from a 2D image

I have a random 2D image. I would like to be able to present the image in 3D. This doesn't have to be very detailed, even if the image were arbitrarily broken into layers like a pop-up cutout from a children's book.
The goal would be that a given image would look normal when directly viewed but that if a viewer were to move/tilt left, right, up, down there would be a 3d effect.
This is similar but not exactly the same as this question here:
How to create 3D streoscopic images using MATLAB with image tool?
This is complete over-kill:
http://make3d.cs.cornell.edu/
And this is probably on the right track:
http://www.imagemagick.org/Usage/distorts/#perspective
My ideal implementation would be an automated PHP script with ImageMagick that is fed an image and spits out as a result either (in order of preference):
1. Images representing each layer, from nearest to deepest (closer to the child's pop-up book layer analogy)
2. 5 images representing the said views (direct, left, right, top, bottom)
Has this been done (either of the above ideal implementations), or does anyone know how to do all, or part, of this?
As far as the first part of your question is concerned, it sounds like your ideal implementation is http://make3d.cs.cornell.edu/, except that:
you want it simpler (return images from a fixed set of angles as opposed to a walkthrough)
you want it with imagemagick and PHP
I think that last restriction is unrealistic, because there's a fair amount of maths and computer vision behind this kind of problem. ImageMagick will help you with lower-level image processing tasks like affine transforms, but it doesn't really provide the required higher-level computer vision functionality like 3D image reconstruction.
So my advice would be to try and work around that restriction somehow. If you implement the approach using more suitable tools (like C++ and OpenCV, for example, or Matlab, as the Make3D guys did), then you can wrap that in a CGI application so your PHP scripts can access it. Cornell (the authors of Make3D) had a similar thing going a while back, but it looks like they're not doing it any more.
For the second part of your question, the theory behind what you want to do has been fairly well-researched. See here for a list of depth estimation papers. Here is what things look like in source.

Resources