i wander how code a shader that output a texture T1 in which each texel store the coordinate of all pixels that are (for example) not black of a given texture T0.
It's a bit difficult to know if A,B,C in your question represent pixels or sub images. If you are trying to locate pixels with values and group them together.
You can implement that as a variation of a pixel sort algorithm. There are many ways to do this but you can see e.g.
https://bl.ocks.org/zz85/cafa1b8b3098b5a40e918487422d47f6
or
https://timseverien.com/posts/2017-08-17-sorting-pixels-with-webgl/
which uses per frame odd-even pair-wise comparison.
Currently I am learning dense optical flow by myself. To understand it, I conduct one experiment. I produce one image using Matlab. One box with a given grays value is placed under one uniform background and the box is translated two pixels in x and y directions in another image. The two images are input into the implementation of the algorithm called TV-L1. The generated motion vector outer of the box is not zero. Is the reason that the gradient outer of the box is zero? Is the values filled in from the values with large gradient value?
In Horn and Schunck's paper, it reads
In parts of the image where the brightness gradient is zero, the velocity
estimates will simply be averages of the neighboring velocity estimates. There
is no local information to constrain the apparent velocity of motion of the
brightness pattern in these areas.
The progress of this filling-in phenomena is similar to the propagation effects
in the solution of the heat equation for a uniform flat plate, where the time rate of change of temperature is proportional to the Laplacian.
Is it not possible to obtain correct motion vectors for pixels with small gradients? Or the experiment is not practical. In practical applications, this doesn't happen.
Yes, in so called homogenous image regions with very small gradients no information where a motion can dervided from exists. That's why the motion from your rectangle is propagated outer the border. If you give your background a texture this effect will be less dominant. I know such problem when it comes to estimate the ego-motion of a car. Then the streat makes a lot of problems cause of here homogenoutiy.
Two pioneers in this field Lukas&Kanade (LK) and Horn&Schunch (HS) are developed methods for computing Optical Flow (OF). Both rely on brightness constancy assumption which feature location pixel values between two sequence frames not change. This constraint may be expressed as two equations: I(x+dx,y+dy,t+dt)=I(x,y,t) and ∂I/∂x dx+∂I/∂y dy+∂I/∂t dt=0 by using a Taylor series expansion I(x+dx,y+dy,t+dt) , we get (x+dx,y+dy,t+dt)=I(x,y,t)+∂I/∂x dx+∂I/∂y dy+∂I/∂t dt… letting ∂x/∂t=u and ∂y/∂t=v and combining these equations we get the OF constraint equation: ∂I/∂t=∂I/∂t u+∂I/∂t v . The OF equation has more than one solution, so the different techniques diverge here. LK equations are derived assuming that pixels in a neighborhood of each tracked feature move with the same velocity as the feature. In OpenCV, to catch large motions with a small window size (to keep the “same local velocity” assumption).
I'm developing an image warping iOS app with OpenGL ES 2.0.
I have a good grasp on the setup, the pipeline, etc., and am now moving along to the math.
Since my experience with image warping is nil, I'm reaching out for some algorithm suggestions.
Currently, I'm setting the initial vertices at points in a grid type fashion, which equally divide the image into squares. Then, I place an additional vertex in the middle of each of those squares. When I draw the indices, each square contains four triangles in the shape of an X. See the image below:
After playing with photoshop a little, I noticed adobe uses a slightly more complicated algorithm for their puppet warp, but a much more simplified algorithm for their standard warp. What do you think is best for me to apply here / personal preference?
Secondly, when I move a vertex, I'd like to apply a weighted transformation to all the other vertices to smooth out the edges (instead of what I have below, where only the selected vertex is transformed). What sort of algorithm should I apply here?
As each vertex is processed independently by the vertex shader, it is not easy to have vertexes influence each other's positions. However, because there are not that many vertexes it should be fine to do the work on the CPU and dynamically update your vertex attributes per frame.
Since what you are looking for is for your surface to act like a rubber sheet as parts of it are pulled, how about going ahead and implementing a dynamic simulation of a rubber sheet? There are plenty of good articles on cloth simulation in full 3D such as Jeff Lander's. Your application could be a simplification of these techniques. I have previously implemented a simulation like this in 3D. I required a force attracting my generated vertexes to their original grid locations. You could have a similar force attracting vertexes to the pixels at which they are generated before the simulation is begun. This would make them spring back to their default state when left alone and would progressively reduce the influence of your dragging at more distant vertexes.
I have a 3D shape in a 3D binary image. Therefore, I have a list of all of the x,y,z points.
If I am to analyze a shape for various identification, such as "sphericity", "spiky"-ness, volume, surface area, etc., what are some of the choices do I have here?
Could you post a sample shape? Do you have a complete set of points on the surface and interior of the shape? Are the points evenly spaced? Is this synthetic data, or perhaps a point cloud from a 3D scan?
A few ideas:
Calculate the 3D convex hull of the points. This will give you the outer "envelope" of the points and is useful for comparison with other measurements. For example, you can compare the surface area of the convex hull to the surface area of the outer surface points.
Find the difference between "on" voxels in the convex hull and "on" voxels in the raw point set. You can then determine how many points are different, whether there is one big clump, etc. If the original shape is a doughnut, the convex hull will be a disk, and the difference will be the shape of the hole.
To calculate spikiness, you can think of comparing the Euclidean distance between two points (the "straight line" distance) and the shortest distance on the outer surface between those two points.
Compare the surface area of the raw data to the surface area after a 3D morphological "close" operation or some other smoothing operation.
To suggest a type of volume calculation, we'd need to know more about the point set.
Consider the Art Gallery Problem to 3D. Are there points on the surface not visible to certain points in the interior? Is the shape convex or star convex?
http://en.wikipedia.org/wiki/Art_gallery_problem
http://en.wikipedia.org/wiki/Star-convex_set
A good reference for geometric algorithms is Geometric Tools for Computer Graphics by Schneider and Eberly. It's pricey new, but you can probably find a cheap used copy in good condition at addall.com. I suspect you'll find all the answers you want and more in that book.
http://www.amazon.com/Geometric-Computer-Graphics-Morgan-Kaufmann/dp/1558605940
One of the authors maintains a site on the same subject:
http://www.geometrictools.com/
Another good textbook is Computational Geometry in C by Joseph O'Rourke.
http://www.amazon.com/Computational-Geometry-Cambridge-Theoretical-Computer/dp/0521649765/ref=sr_1_1?s=books&ie=UTF8&qid=1328939654&sr=1-1
I've read across several Image Processing books and websites, but I'm still not sure the true definition of the term "energy" in Image Processing. I've found several definition, but sometimes they just don't match.
When we say "energy" in Image processing, what are we implying?
The energy is a measure the localized change of the image.
The energy gets a bunch of different names and a lot of different contexts but tends to refer to the same thing. It's the rate of change in the color/brightness/magnitude of the pixels over local areas. This is especially true for edges of the things inside the image and because of the nature of compression, these areas are the hardest to compress and therefore it's a solid guess that these are more important, they are often edges or quick gradients. These are the different contexts but they refer to the same thing.
The seam carving algorithm uses determinations of energy (uses gradient magnitude) to find the least noticed if removed. JPEG represents the local cluster of pixels relative to the energy of the first one. The Snake algorithm uses it to find the local contoured edge of a thing in the image. So there's a lot of different definitions but they all refer to the sort of oomph of the image. Whether that's the sum of the local pixels in terms of the square of absolute brightness or the hard bits to compress in a jpeg, or the edges in Canny Edge detection or the gradient magnitude:
The important bit is that energy is where the stuff is.
The energy of an image more broadly is the distances of some quality between the pixels of some locality.
We can take the sum of the LABdE2000 color distances within a properly weighted 2d gaussian kernel. Here the distances are summed together, the locality is defined by a gaussian kernel and the quality is color and the distance is LAB Delta formula from the year 2000 (Errata: previously this claimed E stood for Euclidean but the distance for standard delta E is Euclidean but the 94 and 00 formulas are not strictly Euclidean and the 'E' stands for Empfindung; German for "sensation"). We could also add up the local 3x3 kernel of the local difference in brightness, or square of brightness etc. We need to measure the localized change of the image.
In this example, local is defined as a 2d gaussian kernel and the color distance as LabDE2000 algorithm.
If you took an image and moved all the pixels and sorted them by color for some reason. You would reduce the energy of the image. You could take a collection of 50% black pixels and 50% white pixels and arrange them as random noise for maximal energy or put them as two sides of the image for minimum energy. Likewise, if you had 100% white pixels the energy would be 0 no matter how you arranged them.
It depends on the context, but in general, in Signal Processing, "energy" corresponds to the mean squared value of the signal (typically measured with respect to the global mean value). This concept is usually associated with the Parseval theorem, which allows us to think of the total energy as distributed along "frequencies" (and so one can say, for example, that a image has most of its energy concentrated in low frequencies).
Another -related- use is in image transforms: for example, the DCT transform (basis of the JPEG compression method) transforms a blocks of pixels (8x8 image) into a matrix of transformed coefficients; for typical images, it results that, while the original 8x8 image has its energy evenly distributed among the 64 pixels, the transformed image has its energy concentrated in the left-upper "pixels" (which, again, correspond to "low frequencies", in some analagous sense).
Energy is a fairly loose term used to describe any user defined function (in the image domain).
The motivation for using the term 'Energy' is that typical object detection/segmentation tasks are posed as a Energy minimization problem. We define an energy that would capture the solution we desire and perform gradient-descent to compute its lowest value, resulting in a solution for the image segmentation.
There is more than one definition of "energy" in image processing, so it depends on the context of where it was used.
Energy is used to describe a measure of "information" when formulating an operation under a probability framework such as MAP (maximum a priori) estimation in conjunction with Markov Random Fields. Sometimes the energy can be a negative measure to be minimised and sometimes it is a positive measure to be maximized.
If you consider that (for natural images captured by cameras) the light is an energy, you may call energy the value of the pixel on some channel.
However, I think that by energy the books are referring to the spectral density. From wikipedia:
The energy spectral density describes how the energy (or variance) of a signal or a time series is distributed with frequency
http://en.wikipedia.org/wiki/Spectral_density
Going back to my chemistry - Energy and Entropy are closely related terms. And Entropy and Randomness are also closely related. So in Image Processing, Energy might be similar to Randomness. For example, a picture of a plain wall has low energy, while the picture of a city taken from a helicopter might have high energy.
Image "energy" should be inversely proportional to Shannon entropy of image. But as already said image energy is loosely coupled term, it is better use "compressibility" term instead. That is - high image "energy" should correspond to high image compressibility.
http://lcni.uoregon.edu/~mark/Stat_mech/thermodynamic_entropy_and_information.html
Energy is like the "information present on the image". Compression of images cause energy-loss. I guess its something like that.
Energy is defined based on a normalized histogram of the image. Energy shows how the gray levels are distributed. When the number of gray levels is low then energy is high.
The Snake algorithm an image processing technique used to determine the contour of an object, the snake is nothing but a vector of (X,Y) points with some constraints, its final goal is to surround the object and describe its shape (contour) and then to track or represent the object by its shape.
The algorithm has two kinds of energies, internal and external.
Internal energy (the snake energy) (IE) is a user defined energy which acts on the snake (internally) to impose constraints on smoothness of the snake, without such a force, the snake shape will end up with the exact shape of the object, this is not desirable, because the exact shape of an object is very difficult to be obtained, due to light conditions, quality of image, noise, etc.
The external energy (EE) arises from the data (the image intensities), and it is nothing but the absolute difference of the intensities in the x and y directions (the intensity gradient) multiplied by -1, to be summed with internal energy, because the total energy must be minimized. so the total energy for all of the snake point should be minimized, Ideally, this come true when there are edges, because the gradient on the edge or (EE) is maximized, and since it is multiplied by -1, the total energy of the snake around the nearest object is minimized, and thus the algorithm converges to a solution, which is hopefully the true contour of the studied object.
because this algorithm relies on EE which is not only high on edges but also high on noisy points, sometimes the snake algorithm does not converge to an optimal solution, that why it is an approximate greedy algorithm.
I found this at Image Processing book;
Energy: S_N = sum (from b=0 to b=L-1) of abs(P(b))^2
P(b) = N(b) / M
where M represents the total number of pixels in a neighborhood window centered
about (j,k), and N(b) is the number of pixels of amplitude in the same window.
It may give us a better understand if we see this equation with entropy;
Entropy: S_E = - sum (from b=0 to b=L-1) of P(b)log2{P(b)}
source: Pp. 538~539 Digital Image Processing written by William K. Pratt (4th edition)
For my current imaging project, which is rendering a diffuse light source, I'd like to consider energy as light energy, or radiation energy. Question I had initially: does an RGB "pixel value" represent light energy ? It could be asserted using a light intensity meter and generating subsequent screens with gray pixel values (n,n,n) for 0..255. According to matlabs forum, the radiated energy of 1 greyscale pixel is always proportional to its pixel value, but pixel to pixel it will vary slightly.
There is another assumption regarding energy: while performing the forward ray tracing, I yield a ray count on each sampled position hit. This ray count is, or preferably should be, proportional to radiation energy that would hit the target at this position. In order to be able to compare it to actual photographs taken, I'd have to normalize the ray count to some pixel value range..(?) I enclose an example below, the energy source is a diffuse light emitter inside a dark cylinder.
Energy in signal processing is the integral of the signal square within signal boundaries. An analogy could be made that involves two dimensional signals and you can square pixel values and sum for all the pixels.
Image Energy is calculated through MATLAB using:
image_energy = graycoprops(i1, {'energy'})