Torch/Lua element wise multiplication of 2D and 1D tensors - lua

I'm trying to perform element-wise multiplication between a 2D batch tensor (128x512) and a 1D tensor (512).
Currently, I'm doing it this way:
nbatch = input:size(1)
for i = 1 , nbatch , 1 do
self.output[i]:cmul(self.noise)
end
It works and I get the expected results, but I don't think it is the most efficient way to do it.
Can it be done more efficiently?
How can I extend it to nD tensors multiplied element-wise with (n-1)D tensors?
Thanks!

self.output:cmul(self.noise:view(1, self.output:size(2)):expandAs(self.output))
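
The same view-and-expand idea can be sketched in NumPy, used here only as a stand-in for Torch7, since broadcasting plays the role of view plus expandAs. The array names and shapes are illustrative assumptions. The last lines show the nD by (n-1)D case, which only needs a leading axis of length 1.

```python
import numpy as np

rng = np.random.default_rng(0)
output = rng.standard_normal((128, 512))   # 2D batch: (batch, features)
noise = rng.standard_normal(512)           # 1D vector: (features,)

# Loop version from the question, kept as a reference result.
expected = output.copy()
for i in range(expected.shape[0]):
    expected[i] *= noise

# Broadcast version: add a leading axis of length 1, like noise:view(1, 512).
scaled = output * noise[np.newaxis, :]

# nD times (n-1)D: the same leading axis works for any number of dimensions.
batch_nd = rng.standard_normal((8, 3, 4, 5))
noise_nd = rng.standard_normal((3, 4, 5))
scaled_nd = batch_nd * noise_nd[np.newaxis, ...]
```

In Torch7 the analogous step for the general case would presumably be a view with a leading 1 followed by expandAs, as in the one-liner above.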

Related

Generate unique row index in a 2D tensor as an output 1D tensor with PyTorch

When I implement the target for in-batch multi-class classification in PyTorch (version 1.6), I have the following problem.
I have a variable D of type <class 'torch.Tensor'> (related to label descriptions) of size torch.Size([16, 128]), i.e. [data_size, token_id_size].
The original idea was to generate a target tensor of torch.Size([16]) in which each value is unique and corresponds to a row in D, from 0 to 15 as [0,1,2,...,15], for in-batch multi-class classification.
This can be done using target = torch.LongTensor(torch.arange(16))
But there may be repeated, non-unique rows in D, so I would like identical rows in D to share the same index in target. For example, if row0, row1 and row7 of D hold the same token_ids (the same vector) and all other rows differ from each other, then target should be [0,0,2,3,4,5,6,0,8,9,10,11,12,13,14,15] or [0,0,1,2,3,4,5,0,6,7,8,9,10,11,12,13], where the former still uses indexes from 0-15 (but without 1 and 7) and the latter uses all indexes in 0-13.
How can I implement this?
See the answers to the simplified questions (i) generate 1D tensor as unique index of rows of an 2D tensor and (ii) generate 1D tensor as unique index of rows of an 2D tensor (keeping the order and the original index), which address the problem in this question.
But these do not seem to help improve the contrastive multi-class classification.
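
Both target layouts described above can be sketched in plain Python on nested lists. The helper name row_targets and the toy data are assumptions made for illustration; with a real tensor one could presumably feed it D.tolist() and wrap the result in torch.tensor. torch.unique(D, dim=0, return_inverse=True) is a related built-in, but its indexes follow the sorted order of the unique rows rather than their first appearance.

```python
def row_targets(rows, compact=False):
    """Map each row to the index shared by all rows with identical contents."""
    first_seen = {}   # row contents -> index assigned at first appearance
    targets = []
    for position, row in enumerate(rows):
        key = tuple(row)
        if key not in first_seen:
            # compact=True renumbers without gaps; otherwise keep the row position.
            first_seen[key] = len(first_seen) if compact else position
        targets.append(first_seen[key])
    return targets


# Toy stand-in for D: 16 rows where rows 0, 1 and 7 are identical.
D = [[i, i + 1] for i in range(16)]
D[1] = list(D[0])
D[7] = list(D[0])

keep_positions = row_targets(D)              # indexes stay within 0-15, with gaps
renumbered = row_targets(D, compact=True)    # indexes cover 0-13 without gaps
```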

Reshaping numpy array as an input to CNN

I have seen multiple posts on reshaping numpy arrays as inputs to CNNs; however, I haven't been able to successfully reshape my array as an input to my CNN!
I have a CNN that merges with another model further downstream. The input shape of the CNN is (4,4,1). The real input is bigger, but I have purposefully made it smaller to establish the pipeline and get it running before I put in the proper size.
The format will be the same, however: it's a 1-channel n x n np.array. I am getting errors when reshaping, which I will mention after the code. The input dimensions are put into the model as follows:
cnn_branch_input = tf.keras.layers.Input(shape=(4,4,1))
cnn_branch_two = tf.keras.layers.Conv2D(etc....)(cnn_branch_input)
The np array (originally a pandas DataFrame) has the following shape and is reshaped as follows:
np.array(array).shape
(4,4)
input = np.array(array).reshape(-1,1,4,4)
input.shape
(1,1,4,4)
The input to my merged model is as follows:
model.fit([cnn_input,gnn_input, gnn_node_feat], y,
#sample_weight=train_mask,
#validation_data=validation_data,
batch_size=4,
shuffle=False)
This causes an error, which makes sense to me:
ValueError: Data cardinality is ambiguous:
x sizes: 1, 4, 4 -- Please provide data which shares the same first dimension.
So now I reshape to intentionally have a 4x4 shape plus 1 channel:
input = np.array(array).reshape(-1,4,4,1)
input.shape
(1,4,4,1)
Two things: the array appears to be reshaped into 4 groups of 1x1 arrays, so it seems the structure of the original array is lost, and I get the same error!
Notice that with both reshape calls the shape is either (1,4,4,1) or (1,1,4,4): the -1 entry simply becomes 1, making the CNN think the first dimension has size 1. I thought the -1 would let me add the sample dimension as 'any number of samples'.
If I simply pass the original (4,4) array, I get an error saying that the CNN received a 2-dimensional array while a 4-dimensional array is required.
I'm really confused about how to correctly reshape this array! I would appreciate any help!
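
A small NumPy sketch may help separate the two concerns here. It assumes a made-up 4x4 array and, as a guess based on the "x sizes: 1, 4, 4" message, that the CNN input holds one sample while the other two inputs hold four. It shows that reshape(-1, 4, 4, 1) keeps the 4x4 layout, that -1 is inferred from the element count, and how a hypothetical stack of four samples yields a first dimension of 4.

```python
import numpy as np

array = np.arange(16).reshape(4, 4)       # stand-in for the 4x4 data
cnn_input = array.reshape(-1, 4, 4, 1)    # -1 is inferred as 16 / (4 * 4 * 1) = 1
print(cnn_input.shape)                    # (1, 4, 4, 1)

# The 4x4 layout survives: dropping the sample and channel axes recovers it.
print(np.array_equal(cnn_input[0, :, :, 0], array))

# Hypothetical batch of 4 samples, so the first dimension matches the other inputs.
samples = [array + 100 * i for i in range(4)]
batch = np.stack(samples).reshape(-1, 4, 4, 1)
print(batch.shape)                        # (4, 4, 4, 1)
```

If that guess is right, the reshape itself is fine and the error is about the number of samples passed to fit, not about the channel axis.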

classifier using boolean formula

I am trying to find a classifier that is represented by an arbitrary boolean formula. Is it possible to do so? I tried using the SVC from sklearn.svm with the linear kernel, but I'm not sure if it is correct and, if it is, how to extract a formula from the learned classifier.
Here's a simple dataset with 4 variables x,y,z,w (features) and labels 0 and 1. Any data with x=1 or z=1 will have a label 1 and everything else has label 0.
x,y,z,w,label
0,0,0,0,0
0,0,0,1,0
0,0,1,0,1
0,0,1,1,1
0,1,0,0,0
0,1,0,1,0
0,1,1,0,1
0,1,1,1,1
1,0,0,0,1
1,0,0,1,1
1,0,1,0,1
1,0,1,1,1
1,1,0,0,1
1,1,0,1,1
1,1,1,0,1
1,1,1,1,1
For this example, I want to extract the classifier represented by the formula x=1 or z=1. Eventually I will have more complex data represented by complex, arbitrary formula (e.g., (x= 1 or y=0) and (z=0) ... )
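The input-to-output relationship in this kind of data is discrete and non-smooth, and an arbitrary Boolean formula is in general not linearly separable (XOR is the standard counterexample), so a linear model cannot be relied on here, even though a simple disjunction such as the one above happens to be separable. Try a DecisionTreeClassifier instead, which should be OK for this kind of data.
Alternatively, you could use a Boolean satisfiability solver, but this will only work if your data is deterministic and not fuzzy.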
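
As a rough illustration of the exact, deterministic route, here is a brute-force sketch in plain Python. It is neither a DecisionTreeClassifier nor a real SAT solver: it simply tries every OR of positive literals (the helper find_disjunction is hypothetical) and only scales to a handful of variables. The labels are copied from the table in the question.

```python
from itertools import combinations, product

NAMES = ["x", "y", "z", "w"]

# Rows in the same order as the table in the question: 0000, 0001, ..., 1111.
rows = list(product([0, 1], repeat=4))
labels = [0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]


def find_disjunction(rows, labels, names):
    """Return the smallest set of variables whose OR reproduces every label, or None."""
    for size in range(1, len(names) + 1):
        for idxs in combinations(range(len(names)), size):
            if all(int(any(row[i] for i in idxs)) == label
                   for row, label in zip(rows, labels)):
                return [names[i] for i in idxs]
    return None


formula = find_disjunction(rows, labels, NAMES)
print(" or ".join(f"{name}=1" for name in formula))
```

Formulas that mix AND, OR and negated literals would need a larger search space or a proper solver; this sketch only covers plain disjunctions.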

Torch 'Gather' Issue

I have two tensors as follows:
Normalised Tensor :
1
10
94
[torch.LongStorage of size 3]
and
Batch :
1
10
[torch.LongStorage of size 2]
I would like to use 'Batch' to select indices in the third dimension of 'Normalised Tensor'. So far I have used gather as follows:
normalised:long():gather(1, batch:long())
Unfortunately it's returning this error.
"bad argument #1 to 'gather' (Input tensor must have same dimensions as output"
Any help would be much appreciated! Thanks
This answer assumes the following: you have a three-dimensional tensor of size x,y,z and you want a three-dimensional tensor of size x,y,10, where the x,y slices are chosen based on indices listed in another tensor of size 1,10.
I, personally, have spent much time pondering what the gather method could possibly be used for. The only conclusion I've come to is: it is not the problem described above.
The described problem is solvable with the index function:
local slice = normalised:index(3, batch[1]:long())

hierarchical clustering using flann in opencv

I'm trying to use the hierarchicalClustering method from OpenCV 2.4.2.
It works without error, but the problem is that I don't understand the parameters it accepts, e.g. branching.
I think this causes my problem: I always get just one cluster.
My input is a cv::Mat of LBPH features (for face detection); the number of rows is 12 and the number of columns is 6272.
No matter what value I use for the branching factor, I always get just one cluster, and its centroid is the mean of the rows of the input matrix groupped_one_person_features.
Could you advise?
Thanks a lot!
Here's the code:
cv::Mat groupped_one_person_features;
.... // fill groupped_one_person_features with data
int Nclusters=50;
cv::Mat centroids (Nclusters,Features.data[0][0].cols,CV_32FC1);
int count = cv::flann::hierarchicalClustering<cvflann::L1<float>>groupped_one_person_features,centroids,cvflann::KMeansIndexParams(2000,11,cvflann::FLANN_CENTERS_KMEANSPP));
First of all, you missed a parenthesis in your last line:
int count = cv::flann::hierarchicalClustering<cvflann::L1<float>>(groupped_one_person_features,centroids,cvflann::KMeansIndexParams(2000,11,cvflann::FLANN_CENTERS_KMEANSPP));
In order, the parameters are (according to flann_base.hpp):
The points to be clustered
The computed cluster centers. Matrix should be preallocated and centers.rows is the number of clusters requested.
The clustering parameters
The distance to be used for clustering
Therefore, if you always get one cluster, it possibly means that your centroids matrix only has one row. Can you verify this?
The parameters of KMeansIndexParams are (according to kmeans_index.h):
branching factor: the number of children of a node in the tree
iterations: max iterations to perform in one kmeans clustering (kmeans tree)
centers_init: algorithm used for picking the initial cluster centers for kmeans tree
cb_index: cluster boundary index. Used when searching the kmeans tree
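
One further thing that may be worth checking, offered as an assumption rather than something confirmed in this thread: the OpenCV documentation for hierarchicalClustering notes that the returned count can be lower than requested, because it is the highest number of the form (branching-1)*k+1 that fits within the requested number of clusters. The helper below is hypothetical (it is not part of OpenCV) and only does that arithmetic for the values in the question.

```python
def clusters_returned(requested, branching):
    """Largest value of the form (branching - 1) * k + 1 not exceeding `requested`."""
    if requested < 1 or branching < 2:
        raise ValueError("requested must be >= 1 and branching must be >= 2")
    k = (requested - 1) // (branching - 1)
    return (branching - 1) * k + 1


# Values from the question: 50 requested centers, branching factor 2000.
print(clusters_returned(50, 2000))
# A much smaller branching factor, picked only for comparison.
print(clusters_returned(50, 8))
```

Under that rule a branching factor of 2000 can only yield a single cluster for 50 requested centers, and with just 12 input rows fewer than 50 clusters should be expected in any case.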
