the histograms in Joint Plot - hue

I want to draw a scatter plot with a joint plot while using hue. The KDE plots on the marginal axes are not what I want. Instead, I need histograms showing the aggregate count of points in each x and y range. How can I do that?
In the image above, instead of the KDE plots, I need histograms of the number of points that fall in each x and y range.

Related

How can I apply a machine learning model to predict scatter plot data of a non-linear 2D shape?

I created a dataset with MATLAB. It holds data for 5000 random non-linear 2D shapes, each with 100 known points, together with their scatter plots.
Sheet imag holds the imaginary coordinates of the non-linear shapes; each shape is a column of 100 points.
Sheet real holds the real coordinates of the non-linear shapes; each shape is a column of 100 points.
Sheet x holds the x coordinates of the scatter plots; each shape's scatter plot is a column of 100 points.
Sheet y holds the y coordinates of the scatter plots; each shape's scatter plot is a column of 100 points.
Keep in mind that cell [i,j] of sheet imag and cell [i,j] of sheet real together give the coordinate of the i'th point of the j'th shape, and cell [i,j] of sheet x and cell [i,j] of sheet y give the coordinate of the corresponding scattered point.
For example real's [0,1] is -0,894608922831653, imag's [0,1] is -0,176637219642649, x's [0,1] is 1,00827206904887 is 0,987842977394785, y's [0,1] 0,0634351017253583; this means first shape's second point is (-0,894608922831653 -0,176637219642649j) and it's scattered to (1,00827206904887, 0,987842977394785)
I want to train a machine learning model on this dataset. Which method should I use? Any tips on the suggested method would be great.
I tried a regression model to start with, but the results were way off. I half expected that because of the complex numbers. I'm now considering k-nearest neighbors, but I'm open to anything.

Combine two datasets with each their own X and Y axes in same plot in Tableau

We have two datasets that each consist of an X and a Y axis. The two X axes and the two Y axes have the same scale (millimeters), but the values of course differ, so the X values in dataset 1 have no corresponding values in dataset 2.
If we just put both into one plot with dual X and dual Y axes, the two datasets are combined into four different plots, one for each combination of the X and Y axes. We want the plots for X1/Y1 and X2/Y2, but we also see plots for X1/Y2 and X2/Y1, which make no sense at all.
How do we correctly combine the two datasets into a single plot where they only share the same X and Y axes but do not mix like that?
The easiest solution is to combine and reshape your data into 3 columns: X, Y, and Type, where Type distinguishes between the datasets (actual vs. predicted, for example). Then put X (as a continuous dimension) on Columns, Y on Rows, and Type on Color or Detail.
You can reshape the data like this with the UNION feature when defining your data source.
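To make the target layout concrete, here is a rough sketch of the same reshape done outside Tableau with pandas. The sample values and the Type labels are invented for illustration; in Tableau itself the union is done in the data source pane as described above.

```python
import pandas as pd

# Hypothetical sample data; the values and Type labels are assumptions.
dataset1 = pd.DataFrame({"X": [0.0, 1.5, 3.0], "Y": [2.0, 2.5, 3.1]})
dataset2 = pd.DataFrame({"X": [0.7, 2.2], "Y": [1.9, 2.8]})

# Stack the rows (a union) and tag each row with the dataset it came from.
combined = pd.concat(
    [dataset1.assign(Type="dataset 1"), dataset2.assign(Type="dataset 2")],
    ignore_index=True,
)
print(combined)
```

Each row now carries exactly one X/Y pair plus its Type, so there is no X1/Y2 or X2/Y1 combination left to plot.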

Similarity metric for 3D histograms

I want to cluster images based on colour similarity. For that I need a good similarity metric between two 3D histograms. A 3D histogram of an image is just a 3-dimensional space where each axis represents one of the base colours. The range of each axis is 0-255, since these are the possible values of the base colours for each pixel.
The histogram is represented as a 256x256x256 matrix, and each entry in the matrix is the count of pixels with that specific colour in the image. For example:
If the matrix element M[0][0][0] = 1150, there are 1150 black pixels in the image (RGB(0,0,0) represents the colour black).
I am looking for the most sensible similarity metric for this kind of problem. The metric will be used in the clustering algorithm (probably DBSCAN) to evaluate image similarity.
Use the L*a*b* (CIELAB) color space, where Euclidean distance approximates perceived color difference, since the space is designed to model the non-linearities of human vision.
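A minimal sketch of that suggestion, assuming 8-bit sRGB input and a D65 white point (neither is stated in the question). The function names srgb_to_lab and lab_histogram, the bin count and the a/b ranges are illustrative choices, not part of the answer.

```python
import numpy as np

# D65 reference white and the standard linear-sRGB -> XYZ matrix.
_WHITE = np.array([0.95047, 1.0, 1.08883])
_RGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])


def srgb_to_lab(rgb):
    """Convert an (..., 3) array of 8-bit sRGB values to CIELAB."""
    c = np.asarray(rgb, dtype=float) / 255.0
    # Undo the sRGB transfer curve to get linear light.
    linear = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    # XYZ relative to the reference white.
    xyz = linear @ _RGB_TO_XYZ.T / _WHITE
    delta = 6.0 / 29.0
    f = np.where(xyz > delta ** 3, np.cbrt(xyz),
                 xyz / (3 * delta ** 2) + 4.0 / 29.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)


def lab_histogram(image_rgb, bins=8):
    """Normalised coarse 3D histogram of an (H, W, 3) image in Lab space."""
    lab = srgb_to_lab(image_rgb).reshape(-1, 3)
    # Assumed axis ranges: L in [0, 100], a and b in [-128, 128].
    ranges = [(0, 100), (-128, 128), (-128, 128)]
    hist, _ = np.histogramdd(lab, bins=bins, range=ranges)
    return hist / hist.sum()


# Euclidean distance between two colours in Lab space.
colour_distance = np.linalg.norm(
    srgb_to_lab([255, 0, 0]) - srgb_to_lab([250, 5, 5])
)
```

Binning in Lab with far fewer than 256 bins per axis also keeps the histogram small; which distance to use between two such histograms is a separate choice that the answer does not settle.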

Bhattacharyya distance between R, G, B, Y, Cb, Cr components of two images

I have 2 images taken with two different cameras, and I have to associate an object across both images. I separated the R, G, B and Y, Cb, Cr components and calculated the histogram of each component separately for both images.
Then I concatenated the histograms of all components into one vector.
I had already normalized each histogram separately so that sum(h) = 1,
but after concatenating all the histograms the sum of the vector is 6.
When I applied the Bhattacharyya distance to both vectors, the result was in the range 4 to 5.
I cannot interpret these similarity results because, as far as I know, the Bhattacharyya distance lies in the range 0-1.
Please help.
The measure whose best value is 2 is the Jeffreys-Matusita distance, which is derived from the Bhattacharyya distance.
For two classes, a Jeffreys-Matusita distance near 2 is good for classification (the classes are well separated), and a value near 0 means the classes are the same.
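One possible reading of the numbers in the question, offered as a sketch under assumptions: if the quantity computed was the Bhattacharyya coefficient sum(sqrt(p*q)) on a concatenated vector that sums to 6, the coefficient can reach 6 instead of 1. Renormalising the concatenated vector brings it back into 0-1. The random histograms, the bin count and the helper names below are hypothetical.

```python
import numpy as np


def bhattacharyya_coefficient(p, q):
    """Sum of sqrt(p * q); lies in [0, 1] when p and q each sum to 1."""
    return float(np.sum(np.sqrt(p * q)))


def random_channel_histograms(rng, channels=6, bins=32):
    """Hypothetical per-channel histograms, each normalised to sum to 1."""
    h = rng.random((channels, bins))
    return h / h.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
a = random_channel_histograms(rng).ravel()  # concatenated vector, sums to 6
b = random_channel_histograms(rng).ravel()

# On the raw concatenation the coefficient is bounded by 6, not 1.
bc_raw = bhattacharyya_coefficient(a, b)

# Renormalise so each concatenated vector sums to 1 again.
bc = bhattacharyya_coefficient(a / a.sum(), b / b.sum())

bhattacharyya_distance = -np.log(bc)  # 0 for identical inputs, unbounded above
jeffries_matusita = 2.0 * (1.0 - bc)  # equals 2 * (1 - exp(-distance)), in [0, 2]
```

Note that the 0-1 range applies to the coefficient; the Bhattacharyya distance -ln(coefficient) itself has no upper bound, while the Jeffreys-Matusita form stays between 0 and 2.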

U-matrix and self organizing maps

I am trying to understand SOMs. I am confused by the images people post that represent the data obtained by using a SOM to map data to the map space. It is said that the U-matrix is used. But we have a finite grid of neurons, so how do you get a "continuous" image?
For example, a 40x40 grid has 1600 neurons. Once the U-matrix is computed, how do you plot these numbers to get a visualization?
Links:
SOM tutorial with visualization
SOM from Wikipedia
The U-matrix stands for unified distance matrix and contains in each cell the Euclidean distance (in the input space) between neighboring cells. Small values in this matrix mean that SOM nodes are close together in the input space, whereas larger values mean that SOM nodes are far apart, even if they are close in the output space. As such, the U-matrix can be seen as a summary of the probability density function of the input matrix in a 2D space. Usually, those distance values are discretized, color-coded based on intensity and displayed as a kind of heatmap.
Quoting the Matlab SOM toolbox,
Compute and return the unified distance matrix of a SOM.
For example a case of 5x1 -sized map:
m(1) m(2) m(3) m(4) m(5)
where m(i) denotes one map unit. The u-matrix is a 9x1 vector:
u(1) u(1,2) u(2) u(2,3) u(3) u(3,4) u(4) u(4,5) u(5)
where u(i,j) is the distance between map units m(i) and m(j)
and u(k) is the mean (or minimum, maximum or median) of the
surrounding values, e.g. u(3) = (u(2,3) + u(3,4))/2.
Apart from the SOM toolbox, you may have a look at the kohonen R package (see help(plot.kohonen) and use type="dist.neighbours").
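The 5x1 example quoted above can be sketched in a few lines of NumPy. This is an illustration of the layout only, not the toolbox's implementation; the function name umatrix_1d and the random codebook are assumptions, and the mean is used for u(k).

```python
import numpy as np


def umatrix_1d(weights):
    """U-matrix of an n x 1 map: a vector of length 2n - 1."""
    weights = np.asarray(weights, dtype=float)
    n = len(weights)
    u = np.zeros(2 * n - 1)
    # Odd positions hold u(i, i+1): distance between neighbouring map units.
    u[1::2] = np.linalg.norm(weights[1:] - weights[:-1], axis=1)
    # Even positions hold u(k): mean of the surrounding distance values.
    for k in range(n):
        around = [u[j] for j in (2 * k - 1, 2 * k + 1) if 0 <= j < len(u)]
        u[2 * k] = np.mean(around) if around else 0.0
    return u


# Hypothetical codebook: 5 map units with 3-dimensional weight vectors.
rng = np.random.default_rng(0)
codebook = rng.random((5, 3))
u = umatrix_1d(codebook)  # 9 values: u(1) u(1,2) u(2) ... u(4,5) u(5)
```

By the same construction a 40x40 grid would give a 79x79 array, finer than the neuron grid itself. Drawing that array as a color-coded heatmap, for instance with matplotlib's imshow and an interpolation setting, is what produces the smooth-looking images.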

Resources