Image Vectorizer [closed] - image-processing

I'm looking for a library/tool/image-processing technique that can create vectors out of images (similar to text vectorization like TF-IDF). Can anybody share some ideas on how to proceed?

I'm not sure what programming language you're using; below is a sample in R.
This is how to use the pixmap library to read an image into a matrix.
library(pixmap)
# the next command calls ImageMagick's convert, so it may only work on Linux
system("convert foo.tiff foo.ppm")
img <- read.pnm("foo.ppm")
To get info on your new object:
str(img)
Although it is included in the previous output, the size of the image can be extracted with:
img@size
Then to extract the red channel from the image for the first ten rows:
myextract <- img@red[1:10,]
Or to extract the entire red channel to an actual matrix:
red.mat <- matrix(NA, img@size[1], img@size[2])
red.mat <- img@red
Refer to this question: how to convert a JPEG to an image matrix in R.
You can also use Python with NumPy:
>>> arr = np.arange(150).reshape(5, 10, 3)  # stand-in for arr = np.array(im)
>>> x, y, z = arr.shape
>>> indices = np.vstack(np.unravel_index(np.arange(x*y), (x, y))).T
>>> # or: indices = np.hstack((np.repeat(np.arange(x), y)[:, np.newaxis],
>>> #                          np.tile(np.arange(y), x)[:, np.newaxis]))
>>> np.hstack((arr.reshape(x*y, z), indices))
array([[  0,   1,   2,   0,   0],
       [  3,   4,   5,   0,   1],
       [  6,   7,   8,   0,   2],
       [  9,  10,  11,   0,   3],
       [ 12,  13,  14,   0,   4],
       [ 15,  16,  17,   0,   5],
       [ 18,  19,  20,   0,   6],
       [ 21,  22,  23,   0,   7],
       [ 24,  25,  26,   0,   8],
       [ 27,  28,  29,   0,   9],
       [ 30,  31,  32,   1,   0],
       [ 33,  34,  35,   1,   1],
       [ 36,  37,  38,   1,   2],
       ...
       [129, 130, 131,   4,   3],
       [132, 133, 134,   4,   4],
       [135, 136, 137,   4,   5],
       [138, 139, 140,   4,   6],
       [141, 142, 143,   4,   7],
       [144, 145, 146,   4,   8],
       [147, 148, 149,   4,   9]])
Here arr = np.array(im) would be your image, and each output row is [R, G, B, row, col].
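If the goal is a single feature vector per image (the analogue of a TF-IDF vector for a document), the simplest baseline is to flatten the pixel array. Here is a minimal Pillow/NumPy sketch; 'photo.jpg' is a placeholder path, not a file from the question:
import numpy as np
from PIL import Image

# load the image and scale pixel values to [0, 1]
im = Image.open('photo.jpg').convert('RGB')  # placeholder path
arr = np.asarray(im, dtype=np.float32) / 255.0
# flatten into one feature vector of length height * width * 3
vec = arr.reshape(-1)
Images of different sizes give vectors of different lengths, so in practice you would resize every image to a fixed shape first, or compute fixed-length features (e.g. color histograms or HOG) instead.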

Related

Implementing WeightedRandomSampler on imbalanced data set: RuntimeError: invalid multinomial distribution

I am trying to implement a weighted sampler for a very imbalanced data set. There are 182 different classes. Here is an array of the bin counts per class:
array([69487, 5770, 5753, 138, 4308, 10, 1161, 29, 5611,
350, 7, 183, 218, 4, 3, 3872, 5, 950,
33, 3, 443, 16, 20, 330, 4353, 186, 19,
122, 546, 6, 44, 6, 3561, 2186, 3, 48,
8440, 338, 9, 610, 74, 236, 160, 449, 72,
6, 37, 1729, 2255, 1392, 12, 1, 3426, 513,
44, 3, 28, 12, 9, 27, 5, 75, 15,
3, 21, 549, 7, 25, 871, 240, 128, 28,
253, 62, 55, 12, 8, 57, 16, 99, 6,
5, 150, 7, 110, 8, 2, 1296, 70, 1927,
470, 1, 1, 511, 2, 620, 946, 36, 19,
21, 39, 6, 101, 15, 7, 1, 90, 29,
40, 14, 1, 4, 330, 1099, 1248, 1146, 7414,
934, 156, 80, 755, 3, 6, 6, 9, 21,
70, 219, 3, 3, 15, 15, 12, 69, 21,
15, 3, 101, 9, 9, 11, 6, 32, 6,
32, 4422, 16282, 12408, 2959, 3352, 146, 1329, 1300,
3795, 90, 1109, 120, 48, 23, 9, 1, 6,
2, 1, 11, 5, 27, 3, 7, 1, 3,
70, 1598, 254, 90, 20, 120, 380, 230, 180,
10, 10])
Some classes have as little as one instance. I am trying to use torch's WeightedRandomSampler for this dataset. However, because the class imbalance is so large, when I calculate weights using
count_occr = np.bincount(dataset.y)
lbl_weights = 1. / count_occr
weights = np.array(lbl_weights)
weights = torch.from_numpy(weights)
sampler = WeightedRandomSampler(weights.type('torch.DoubleTensor'), len(weights*2))
I get two error messages:
RuntimeWarning: divide by zero encountered in true_divide
and
RuntimeError: invalid multinomial distribution (encountering probability entry = infinity or NaN)
Does anyone have a workaround for this? I was considering multiplying lbl_weights by some scalar, but I am not sure whether that is a viable option.
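One common workaround, as a minimal sketch assuming dataset.y holds integer class labels: guard the zero counts before dividing, and give the sampler one weight per sample rather than one per class.
import numpy as np
import torch
from torch.utils.data import WeightedRandomSampler

count_occr = np.bincount(dataset.y)
count_occr = np.maximum(count_occr, 1)     # avoid 1/0 = inf for empty classes
class_weights = 1.0 / count_occr
sample_weights = class_weights[dataset.y]  # one weight per sample
sampler = WeightedRandomSampler(
    torch.as_tensor(sample_weights, dtype=torch.double),
    num_samples=len(sample_weights),
    replacement=True,
)
The inf and NaN entries come from classes whose count is zero, which is exactly what the multinomial error complains about; multiplying lbl_weights by a scalar would not help, because inf times any scalar is still inf.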

How would I find the mode (stats) of pixel values of an image?

I'm using OpenCV and I'm able to get a pixel of an image (a 3-dimensional tuple) via the code below. However, I'm not quite sure how to calculate the mode of the pixel values in the image.
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('C:\\Users\Moondra\ABEO.png')
#px = img[100,100]  # gets pixel value
#print(px)
I tried:
from scipy import stats
stats.mode(img)[0]
But this returns an array of shape
stats.mode(img)[0].shape
(1, 800, 3)
I'm not sure exactly which axis stats.mode uses (by default it operates along axis 0, hence the (1, 800, 3) result), but I want each pixel value (a 3-element tuple) to count as one element.
EDIT:
For clarity, I'm going to lay out exactly what I'm looking for.
Let's say we have an array of shape (3, 5, 3) that looks like this:
array([[[1, 1, 2],   # [1, 1, 2] represents one pixel's RGB values
        [2, 2, 2],
        [1, 2, 2],
        [2, 1, 1],
        [1, 2, 2]],

       [[1, 2, 2],
        [2, 2, 2],
        [2, 2, 2],
        [1, 2, 2],
        [1, 2, 1]],

       [[2, 2, 1],
        [2, 2, 1],
        [1, 1, 2],
        [2, 1, 2],
        [1, 1, 2]]])
I would then convert it to an array like this for easier calculation:
array([[1, 1, 2],
       [2, 2, 2],
       [1, 2, 2],
       [2, 1, 1],
       [1, 2, 2],
       [1, 2, 2],
       [2, 2, 2],
       [2, 2, 2],
       [1, 2, 2],
       [1, 2, 1],
       [2, 2, 1],
       [2, 2, 1],
       [1, 1, 2],
       [2, 1, 2],
       [1, 1, 2]])
which is of shape (15, 3).
I would like to calculate the mode by counting each RGB triple, as follows:
[1, 1, 2] = 3
[2, 2, 2] = 3
[1, 2, 2] = 4
[2, 1, 1] = 1
[1, 2, 1] = 1
[2, 2, 1] = 2
[2, 1, 2] = 1
Thank you.
From the description, it seems you are after the pixel value that occurs most often in the input image. Here's one efficient approach using the concept of views -
def get_row_view(a):
    void_dt = np.dtype((np.void, a.dtype.itemsize * np.prod(a.shape[-1])))
    a = np.ascontiguousarray(a)
    return a.reshape(-1, a.shape[-1]).view(void_dt).ravel()

def get_mode(img):
    unq, idx, count = np.unique(get_row_view(img), return_index=True, return_counts=True)
    return img.reshape(-1, img.shape[-1])[idx[count.argmax()]]
We can also make use of np.unique with its axis argument, like so -
def get_mode(img):
    unq, count = np.unique(img.reshape(-1, img.shape[-1]), axis=0, return_counts=True)
    return unq[count.argmax()]
Sample run -
In [69]: img = np.random.randint(0,255,(4,5,3))

In [70]: img.reshape(-1,3)[np.random.choice(20,10,replace=0)] = 120

In [71]: img
Out[71]:
array([[[120, 120, 120],
        [ 79, 105, 218],
        [ 16,  55, 239],
        [120, 120, 120],
        [239,  95, 209]],

       [[241,  18, 221],
        [202, 185, 142],
        [  7,  47, 161],
        [120, 120, 120],
        [120, 120, 120]],

       [[120, 120, 120],
        [ 62,  41, 157],
        [120, 120, 120],
        [120, 120, 120],
        [120, 120, 120]],

       [[120, 120, 120],
        [  0, 107,  34],
        [  9,  83, 183],
        [120, 120, 120],
        [ 43, 121, 154]]])

In [74]: get_mode(img)
Out[74]: array([120, 120, 120])
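Applied to the (3, 5, 3) array from the question, both versions return the triple with the highest count (the In/Out lines below are illustrative, not from the original answer):
In [75]: arr = np.array([[[1,1,2],[2,2,2],[1,2,2],[2,1,1],[1,2,2]],
    ...:                 [[1,2,2],[2,2,2],[2,2,2],[1,2,2],[1,2,1]],
    ...:                 [[2,2,1],[2,2,1],[1,1,2],[2,1,2],[1,1,2]]])

In [76]: get_mode(arr)
Out[76]: array([1, 2, 2])
[1, 2, 2] occurs 4 times, more than any other triple; note that a tie would be broken by whichever row np.unique lists first.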

How can I get the values of the 8 neighbors of an image as the third dimension in NumPy?

Given 2D image data, for every pixel P1, how can I get the following 3D array out of it?
P9 P2 P3
P8 P1 P4
P7 P6 P5
img[x,y,:] = [P2, P3, P4, P5, P6, P7, P8, P9]
without using a for loop, just NumPy operations (for performance reasons).
Here's one approach that zero-pads the boundary elements and uses NumPy strides, via scikit-image's built-in view_as_windows, for efficient sliding-window extraction -
import numpy as np
from skimage.util import view_as_windows as viewW

def patches(a, patch_shape):
    side_size = patch_shape
    ext_size = (side_size[0]-1)//2, (side_size[1]-1)//2
    img = np.pad(a, ([ext_size[0]], [ext_size[1]]), 'constant', constant_values=(0))
    return viewW(img, patch_shape)
Sample run -
In [98]: a = np.random.randint(0,255,(5,6))

In [99]: a
Out[99]:
array([[139, 176, 141, 172, 192,  81],
       [163, 115,   7, 234,  72, 156],
       [ 75,  60,   9,  81, 132,  12],
       [106, 202, 158, 199, 128, 238],
       [161,  33, 211, 233, 151,  52]])

In [100]: out = patches(a, [3,3])  # window size = [3,3]

In [101]: out.shape
Out[101]: (5, 6, 3, 3)

In [102]: out[0,0]
Out[102]:
array([[  0,   0,   0],
       [  0, 139, 176],
       [  0, 163, 115]])

In [103]: out[0,1]
Out[103]:
array([[  0,   0,   0],
       [139, 176, 141],
       [163, 115,   7]])

In [104]: out[-1,-1]
Out[104]:
array([[128, 238,   0],
       [151,  52,   0],
       [  0,   0,   0]])
If you want a 3D array, you could add a reshape at the end, like so -
out.reshape(a.shape + (9,))
But be mindful that this creates a copy, rather than the efficient strided views we get from the function itself.
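If you need exactly the question's img[x,y,:] = [P2, ..., P9] layout (clockwise, starting from the pixel above P1, with the center excluded), here is a sketch on top of that reshape; the index list is an assumption read off the row-major window layout, not part of the original answer:
# flat 3x3 window, row-major: [P9, P2, P3, P8, P1, P4, P7, P6, P5]
order = [1, 2, 5, 8, 7, 6, 3, 0]  # -> [P2, P3, P4, P5, P6, P7, P8, P9]
neighbors = out.reshape(a.shape + (9,))[..., order]  # shape (5, 6, 8)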

Improve precision algorithm to detect facial expression using LBP

I'm developing a simple algorithm to detect several facial expressions (happiness, sadness, anger, ...), based on this paper. As preprocessing, before applying the uniform LBP operator, I divide the normalized image into 6×6 regions.
Applying uniform LBP extracts 59 features per region, so I end up with 2124 features per image (6×6×59). That seems like too many features for the roughly 700 images I have to train a model, and I have read that this hurts precision. My question is: how can I reduce the dimensionality of the features, or what other technique could improve the algorithm's precision?
A straightforward way to reduce feature dimensionality, and increase robustness at the same time, is to use rotation-invariant uniform patterns (riu2). For a circular neighbourhood of radius R = 1 formed by P = 8 pixels, the rotation-invariant uniform descriptor represents each region through P + 2 = 10 features. Dimensionality is thus reduced from 2124 to 6 × 6 × 10 = 360.
PCA can help reduce the size of the descriptor without losing important information. Just google "opencv pca example".
Another helpful thing is to add rotation invariance to your uniform LBP features. This improves precision and also dramatically shrinks the descriptor from 59 to 10 bins per region.
static cv::Mat rotate_table = (cv::Mat_<uchar>(1, 256) <<
0, 1, 1, 3, 1, 5, 3, 7, 1, 9, 5, 11, 3, 13, 7, 15, 1, 17, 9, 19, 5, 21, 11, 23,
3, 25, 13, 27, 7, 29, 15, 31, 1, 33, 17, 35, 9, 37, 19, 39, 5, 41, 21, 43, 11,
45, 23, 47, 3, 49, 25, 51, 13, 53, 27, 55, 7, 57, 29, 59, 15, 61, 31, 63, 1,
65, 33, 67, 17, 69, 35, 71, 9, 73, 37, 75, 19, 77, 39, 79, 5, 81, 41, 83, 21,
85, 43, 87, 11, 89, 45, 91, 23, 93, 47, 95, 3, 97, 49, 99, 25, 101, 51, 103,
13, 105, 53, 107, 27, 109, 55, 111, 7, 113, 57, 115, 29, 117, 59, 119, 15, 121,
61, 123, 31, 125, 63, 127, 1, 3, 65, 7, 33, 97, 67, 15, 17, 49, 69, 113, 35,
99, 71, 31, 9, 25, 73, 57, 37, 101, 75, 121, 19, 51, 77, 115, 39, 103, 79, 63,
5, 13, 81, 29, 41, 105, 83, 61, 21, 53, 85, 117, 43, 107, 87, 125, 11, 27, 89,
59, 45, 109, 91, 123, 23, 55, 93, 119, 47, 111, 95, 127, 3, 7, 97, 15, 49, 113,
99, 31, 25, 57, 101, 121, 51, 115, 103, 63, 13, 29, 105, 61, 53, 117, 107, 125,
27, 59, 109, 123, 55, 119, 111, 127, 7, 15, 113, 31, 57, 121, 115, 63, 29, 61,
117, 125, 59, 123, 119, 127, 15, 31, 121, 63, 61, 125, 123, 127, 31, 63, 125,
127, 63, 127, 127, 255
);
// the well known original uniform2 pattern
static cv::Mat uniform_table = (cv::Mat_<uchar>(1, 256) <<
0,1,2,3,4,58,5,6,7,58,58,58,8,58,9,10,11,58,58,58,58,58,58,58,12,58,58,58,13,58,
14,15,16,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,17,58,58,58,58,58,58,58,18,
58,58,58,19,58,20,21,22,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,
58,58,58,58,58,58,58,58,58,58,58,58,23,58,58,58,58,58,58,58,58,58,58,58,58,58,
58,58,24,58,58,58,58,58,58,58,25,58,58,58,26,58,27,28,29,30,58,31,58,58,58,32,58,
58,58,58,58,58,58,33,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,34,58,58,58,58,
58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,58,
58,35,36,37,58,38,58,58,58,39,58,58,58,58,58,58,58,40,58,58,58,58,58,58,58,58,58,
58,58,58,58,58,58,41,42,43,58,44,58,58,58,45,58,58,58,58,58,58,58,46,47,48,58,49,
58,58,58,50,51,52,58,53,54,55,56,57
);
static cv::Mat rotuni_table = (cv::Mat_<uchar>(1, 256) <<
0, 1, 1, 2, 1, 9, 2, 3, 1, 9, 9, 9, 2, 9, 3, 4, 1, 9, 9, 9, 9, 9, 9, 9, 2, 9, 9, 9,
3, 9, 4, 5, 1, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 2, 9, 9, 9, 9, 9, 9, 9,
3, 9, 9, 9, 4, 9, 5, 6, 1, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 2, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
3, 9, 9, 9, 9, 9, 9, 9, 4, 9, 9, 9, 5, 9, 6, 7, 1, 2, 9, 3, 9, 9, 9, 4, 9, 9, 9, 9,
9, 9, 9, 5, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 6, 9, 9, 9, 9, 9, 9, 9, 9,
9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 7, 2, 3, 9, 4,
9, 9, 9, 5, 9, 9, 9, 9, 9, 9, 9, 6, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 7,
3, 4, 9, 5, 9, 9, 9, 6, 9, 9, 9, 9, 9, 9, 9, 7, 4, 5, 9, 6, 9, 9, 9, 7, 5, 6, 9, 7,
6, 7, 7, 8
);
static void hist_patch_uniform(const Mat_<uchar> &fI, Mat &histo,
                               int histSize, bool norm, bool rotinv)
{
    cv::Mat ufI, h, n;
    if (rotinv) {
        cv::Mat r8;
        // rotation invariant transform
        cv::LUT(fI, rotate_table, r8);
        // uniformity for rotation invariant
        cv::LUT(r8, rotuni_table, ufI);
        // histSize is max 10 bins
    } else {
        cv::LUT(fI, uniform_table, ufI);
    }
    // the upper boundary is exclusive
    float range[] = {0, (float)histSize};
    const float *histRange = {range};
    cv::calcHist(&ufI, 1, 0, Mat(), h, 1, &histSize, &histRange, true, false);
    if (norm)
        normalize(h, n);
    else
        n = h;
    histo.push_back(n.reshape(1, 1));
}
The input is your CV_8U grey-scale patch (one of those regions). The output is the rotation-invariant, uniform, normalized histogram reshaped into one row. You then concatenate the patch histograms into the face descriptor, giving 6*6*10 = 360 features. This is good by itself, but with PCA you can bring it down to 300 or fewer without losing important information, and even improve detection quality, because the removed dimensions (say, those with less than 5% of the variance) not only take up space but also contain mostly noise (for example, Gaussian noise from the sensor).
Then you can compare this concatenated histogram against a bank of faces, or use an SVM (an RBF kernel fits better). Done correctly, prediction for one face should take no more than 1-15 ms (5 ms on my iPhone 7).
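As a rough illustration of the PCA-plus-SVM stage described above, here is a scikit-learn sketch with placeholder data (X stands in for your real (n_faces, 360) matrix of concatenated histograms; it is not code from this answer):
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

X = np.random.rand(700, 360)      # placeholder: real LBP descriptors go here
y = np.random.randint(0, 6, 700)  # placeholder: expression labels

pca = PCA(n_components=0.95)      # keep components explaining 95% of the variance
X_red = pca.fit_transform(X)
clf = SVC(kernel='rbf').fit(X_red, y)
At prediction time, transform each new face descriptor with the same fitted pca before calling clf.predict.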
Hope this helps.

Sum of arrays of different size [closed]

I have an array, whose elements are arrays of different sizes, say:
[[45, 96, 0.0, 96, 96, 96, 0.0], [04, 55, 06, 55, 04, 04, 02, 55]]
I want to find the element-wise sum of the two arrays, i.e.,
[49, 151, ...]
You can use something like this (note that in_groups_of is an ActiveSupport extension, not core Ruby):
a.flat_map{|x| x.in_groups_of(a.max_by(&:size).size, 0)}.transpose.map(&:sum)
Or this:
a.max_by(&:size).map.with_index{|_, i| a.sum{|x| x[i]||0}}
Not very pretty, but works:
>> a = [[45, 96, 0.0, 96, 96, 96, 0.0], [04, 55, 06, 55, 04, 04, 02, 55]]
=> [[45, 96, 0.0, 96, 96, 96, 0.0], [4, 55, 6, 55, 4, 4, 2, 55]]
>> sorted_a = a.sort_by(&:size).reverse
=> [[4, 55, 6, 55, 4, 4, 2, 55], [45, 96, 0.0, 96, 96, 96, 0.0]]
>> zipped_a = sorted_a.first.zip(sorted_a.last)
=> [[4, 45], [55, 96], [6, 0.0], [55, 96], [4, 96], [4, 96], [2, 0.0], [55, nil]]
>> zipped_a.map{ |arr| arr.map{ |v| v || 0 } }.map(&:sum)
=> [49, 151, 6.0, 151, 100, 100, 2.0, 55]
First you have to sort the arrays longest-first for zip to work properly. Zipping then pads the missing positions of the shorter array with nils, so the next step is to replace those nils with zeros (the nested map); finally you can sum the values.
You can also try it this way:
k = []
for i in 0..ar.max_by(&:size).length-1 do
  k << ar.map { |x| [x[i]] }
end
k.map(&:flatten).map { |a| a.compact.sum }
#=> [49, 151, 6.0, 151, 100, 100, 2.0, 55]
a = [[45, 96, 0, 96, 96, 96, 0],
     [ 4, 55, 6, 55,  4,  4, 2, 55]]
Array.new(a.max_by(&:size).size) { |i| a.reduce(0) { |t,e| t+e[i].to_i } }
#=>[49, 151, 6, 151, 100, 100, 2, 55]
Note that nil.to_i #=> 0.
Another example:
a = [[1], [2,3,4], [5,6]]
Array.new(a.max_by(&:size).size) { |i| a.reduce(0) { |t,e| t+e[i].to_i } }
#=> [8,9,4]
