How to more accurately compare features between two images? - opencv

I have developed two methods using SIFT and ORB, but it seems to me that the points do not correspond correctly. Am I using these functions incorrectly, or do I need something different?
import cv2

# ORB + brute-force Hamming matching with cross-check
orb = cv2.ORB_create()
keypoints_X, descriptor_X = orb.detectAndCompute(car1_gray, None)
keypoints_y, descriptor_y = orb.detectAndCompute(car2_gray, None)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(descriptor_X, descriptor_y)
matches = sorted(matches, key=lambda x: x.distance)  # best matches first
result = cv2.drawMatches(car1_gray, keypoints_X, car2_gray, keypoints_y, matches[:10], car2_gray, flags=2)
# SIFT + brute-force matching with Lowe's ratio test
sift = cv2.SIFT_create()
keypoints_X, descriptor_X = sift.detectAndCompute(car1_gray, None)
keypoints_y, descriptor_y = sift.detectAndCompute(car2_gray, None)
bf = cv2.BFMatcher()
matches = bf.knnMatch(descriptor_X, descriptor_y, k=2)
bom = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:  # keep matches clearly better than the runner-up
        bom.append([m])
result = cv2.drawMatchesKnn(car1_gray, keypoints_X, car2_gray, keypoints_y, bom, None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
Below are the results of SIFT and ORB:

Take a look at SuperGlue, a graph-neural-network-based feature matcher. The authors do not provide training code, but two pretrained models, for indoor and outdoor scenes, are available. Links:
https://github.com/magicleap/SuperGluePretrainedNetwork
https://psarlin.com/superglue/
https://arxiv.org/pdf/1911.11763.pdf
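If a learned matcher is not an option, a common classical way to tighten the matches from the snippets above is geometric verification: estimate a homography with RANSAC from the ratio-test matches and keep only the inliers. A minimal sketch, reusing the keypoints and the bom list from the SIFT code in the question; the 5.0 reprojection threshold is an arbitrary choice:
import numpy as np
import cv2

# gather the coordinates of the ratio-test matches (bom holds single-element lists)
good = [m[0] for m in bom]
src = np.float32([keypoints_X[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([keypoints_y[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# RANSAC homography: the mask marks which matches are geometrically consistent
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
inliers = [m for m, keep in zip(good, mask.ravel()) if keep]

result = cv2.drawMatches(car1_gray, keypoints_X, car2_gray, keypoints_y,
                         inliers, None, flags=2)
Whether a homography is the right model depends on the scene; for general 3D scenes a fundamental matrix (cv2.findFundamentalMat) with RANSAC plays the same role.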

Related

Image Deconvolution

To practice Wiener deconvolution, I'm trying to perform a simple deconvolution:
import numpy as np
from scipy import signal
from scipy.fft import fft2, ifft2

def div(img1, img2):
    # element-wise reciprocal of img2, clamping near-zero entries (img1 is unused)
    res = np.zeros(img2.shape, dtype='complex_')
    for i in range(img2.shape[0]):
        for j in range(img2.shape[1]):
            if np.abs(img2[i][j]) > 0.001:
                res[i][j] = 1 / img2[i][j]
            else:
                res[i][j] = 0.001
    return res

filtre = np.asarray([[1, 1, 1],
                     [1, 1, 1],
                     [1, 1, 1]]) * 1/9
filtre_freq = fft2(filtre)
v = signal.convolve(img, filtre)   # blurred image
F = div(1, filtre_freq)            # inverse filter in the frequency domain
f = ifft2(F)                       # back to the spatial domain
res = signal.convolve(v, f)        # "deconvolution" by convolving with the inverse filter
I am trying to compute the inverse filter in the frequency domain, bring it back to the spatial domain, and convolve the blurred image with this inverse filter. On paper it's pretty simple, even if I have to handle the divisions by zero without really knowing the right way to do it.
But my results seem really inconsistent:
If anyone can enlighten me on this... Thanks in advance and have a great evening.
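For reference, a minimal sketch of what the frequency-domain pipeline described above usually looks like: the kernel is zero-padded to the image size before taking its FFT, the division is regularised, and the product is transformed back (rather than doing a second spatial convolution). This is an assumption about the intended approach, not the original code; img is the input image from the question, and eps is an arbitrary regularisation constant:
import numpy as np
from scipy.fft import fft2, ifft2

def inverse_filter(blurred, kernel, eps=1e-3):
    # pad the kernel with zeros to the image size so both spectra have the same shape
    H = fft2(kernel, s=blurred.shape)
    G = fft2(blurred)
    # regularised (Wiener-like) division: avoid blowing up where |H| is close to zero
    H_inv = np.conj(H) / (np.abs(H) ** 2 + eps)
    return np.real(ifft2(G * H_inv))

filtre = np.ones((3, 3)) / 9.0
blurred = np.real(ifft2(fft2(img) * fft2(filtre, s=img.shape)))  # circular blur, for illustration only
restored = inverse_filter(blurred, filtre)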

ERROR: MethodError: no method matching DocumentTermMatrix(::Vector{String})

I am trying to train a basic SVM model for multiclass text classification in Julia. My dataset has around 75K rows and 2 columns (text and label). The context of the dataset is the abstracts of scientific papers gathered from PubMed. I have 10 labels in the dataset.
The dataset looks like this:
I keep receiving two different MethodErrors. The first one is:
ERROR: MethodError: no method matching DocumentTermMatrix(::Vector{String})
I have tried:
convert(Array,data[:,:text])
and also:
convert(Matrix,data[:,:text])
Array conversion gives the same error, and matrix conversion gives:
ERROR: MethodError: no method matching (Matrix)(::Vector{String})
My code is:
using DataFrames, CSV, StatsBase, Printf, LIBSVM, TextAnalysis, Random

function ReadData(data)
    df = CSV.read(data, DataFrame)
    return df
end

function splitdf(df, pct)
    @assert 0 <= pct <= 1
    ids = collect(axes(df, 1))
    shuffle!(ids)
    sel = ids .<= nrow(df) .* pct
    return view(df, sel, :), view(df, .!sel, :)
end

function Feature_Extract(data)
    Text = convert(Array, data[:, :text])
    m = DocumentTermMatrix(Text)
    X = tf_idf(m)
    return X
end

function Classify(data)
    data = ReadData(data)
    train, test = splitdf(data, 0.5)
    ytrain = train.label
    ytest = test.label
    Xtrain = Feature_Extract(train)
    Xtest = Feature_Extract(test)
    model = svmtrain(Xtrain, ytrain)
    ŷ, decision_values = svmpredict(model, Xtest)
    @printf "Accuracy: %.2f%%\n" mean(ŷ .== ytest) * 100
end

data = "data/composite_data.csv"
@time Classify(data)
I would appreciate your help in solving this problem.
EDIT:
I have managed to get the corpus, but I am now facing a DimensionMismatch error:
using DataFrames, CSV, StatsBase, Printf, LIBSVM, TextAnalysis, Random

function ReadData(data)
    df = CSV.read(data, DataFrame)
    # count = countmap(df.label)
    # println(count)
    # amt, lesslabel = findmin(count)
    # println(amt, lesslabel)
    # println(first(df, 5))
    return df
end

function splitdf(df, pct)
    @assert 0 <= pct <= 1
    ids = collect(axes(df, 1))
    shuffle!(ids)
    sel = ids .<= nrow(df) .* pct
    return view(df, sel, :), view(df, .!sel, :)
end

function Feature_Extract(data)
    crps = Corpus(StringDocument.(data.text))
    update_lexicon!(crps)
    m = DocumentTermMatrix(crps)
    X = tf_idf(m)
    return X
end

function Classify(data)
    data = ReadData(data)
    # println(labels)
    # println(first(instances))
    train, test = splitdf(data, 0.5)
    ytrain = train.label
    ytest = test.label
    Xtrain = Feature_Extract(train)
    Xtest = Feature_Extract(test)
    model = svmtrain(Xtrain, ytrain)
    ŷ, decision_values = svmpredict(model, Xtest)
    @printf "Accuracy: %.2f%%\n" mean(ŷ .== ytest) * 100
end

data = "data/composite_data.csv"
@time Classify(data)
Error:
ERROR: DimensionMismatch("Size of second dimension of training instance\n matrix (247317) does not match length of\n labels (38263)")
(Copying Bogumił Kamiński's solution from the comments, as a community wiki answer, for better visibility.)
The argument to DocumentTermMatrix should be of type Corpus, as in this example.
A Corpus can be created with:
Corpus(StringDocument.(data.text))
There's a DimensionMismatch error after that, which comes from the mismatch between what tf_idf produces and what svmtrain expects: tf_idf returns one row per document, whereas svmtrain expects one column per document, i.e. each column to be an X value. Performing a permutedims on the result before passing it to svmtrain resolves the mismatch.
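Putting the two fixes together, Feature_Extract would look roughly like the sketch below (a sketch of the fixes described above, not tested against the original dataset):
function Feature_Extract(data)
    crps = Corpus(StringDocument.(data.text))   # DocumentTermMatrix expects a Corpus
    update_lexicon!(crps)
    m = DocumentTermMatrix(crps)
    X = tf_idf(m)
    return permutedims(X)                       # one column per document, as svmtrain expects
end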

Keras LSTM built in a for loop, using the functional API with a custom number of layers

I am trying to build a network through the Keras functional API by feeding in two lists containing the numbers of units of the LSTM layers and of the FC (Dense) layers. I want to analyse 20 consecutive segments (batches), each containing fs time steps with 2 values (2 features per time step). This is my code:
Rec = [4,4,4]
FC = [8,4,2,1]

def keras_LSTM(Rec, FC, fs, n_witness, lr=0.04, optimizer='Adam'):
    model_LSTM = Input(batch_shape=(20, fs, n_witness))
    return_state_bool = True
    for i in range(shape(Rec)[0]):
        nRec = Rec[i]
        if i == shape(Rec)[0] - 1:
            return_state_bool = False
        model_LSTM = LSTM(nRec, return_sequences=True, return_state=return_state_bool,
                          stateful=True, input_shape=(None, n_witness),
                          name='LSTM' + str(i))(model_LSTM)
    for j in range(shape(FC)[0]):
        nFC = FC[j]
        model_LSTM = Dense(nFC)(model_LSTM)
        model_LSTM = LeakyReLU(alpha=0.01)(model_LSTM)
    nFC_final = 1
    model_LSTM = Dense(nFC_final)(model_LSTM)
    predictions = LeakyReLU(alpha=0.01)(model_LSTM)
    full_model_LSTM = Model(inputs=model_LSTM, outputs=predictions)
    model_LSTM.compile(optimizer=keras.optimizers.Adam(lr=lr, beta_1=0.9, beta_2=0.999,
                                                       epsilon=1e-8, decay=0.066667, amsgrad=False),
                       loss='mean_squared_error')
    return full_model_LSTM

model_new = keras_LSTM(Rec, FC, fs=fs, n_witness=n_wit)
model_new.summary()
When compiling I get the following error:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(20, 2048, 2), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: []
I don't quite understand this, but I suspect it may have something to do with the inputs?
I solved the issue by modifying the Input line of the code as follows:
x = model_LSTM = Input(batch_shape=(20,fs,n_witness))
along with the Model construction line, as follows:
full_model_LSTM = Model(inputs=x, outputs=predictions)
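For completeness, here is a minimal sketch of the same loop-built model with the fix applied: keep a separate handle to the Input tensor and pass that handle to Model. It assumes tf.keras-style imports and the shapes from the question (fs time steps, 2 features, batch size 20), and drops return_state for simplicity; it is a sketch of the pattern, not the author's exact model.
from tensorflow.keras.layers import Input, LSTM, Dense, LeakyReLU
from tensorflow.keras.models import Model

def build_lstm(Rec, FC, fs, n_witness):
    inputs = Input(batch_shape=(20, fs, n_witness))   # keep a handle to the Input tensor
    x = inputs
    for i, n_rec in enumerate(Rec):
        # every LSTM except the last returns full sequences so the next LSTM can consume them
        x = LSTM(n_rec, return_sequences=(i < len(Rec) - 1), stateful=True,
                 name='LSTM' + str(i))(x)
    for n_fc in FC:
        x = Dense(n_fc)(x)
        x = LeakyReLU(alpha=0.01)(x)
    outputs = LeakyReLU(alpha=0.01)(Dense(1)(x))
    model = Model(inputs=inputs, outputs=outputs)     # inputs must be the Input tensor itself
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

model_new = build_lstm([4, 4, 4], [8, 4, 2], fs=2048, n_witness=2)
model_new.summary()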

Deep learning: with ReLU activation the output becomes NaN during training, while it is normal with tanh

The neural network I trained is the critic network for deep reinforcement learning. The problem is that when one of the layers' activations is set to relu or elu, the output becomes NaN after some training steps, while the output stays normal if the activation is tanh. The code is as follows (based on TensorFlow):
with tf.variable_scope('critic'):
    self.batch_size = tf.shape(self.tfs)[0]
    l_out_x = denseWN(x=self.tfs, name='l3', num_units=self.cell_size, nonlinearity=tf.nn.tanh, trainable=True, shape=[det*step*2, self.cell_size])
    l_out_x1 = denseWN(x=l_out_x, name='l3_1', num_units=32, trainable=True, nonlinearity=tf.nn.tanh, shape=[self.cell_size, 32])
    l_out_x2 = denseWN(x=l_out_x1, name='l3_2', num_units=32, trainable=True, nonlinearity=tf.nn.tanh, shape=[32, 32])
    l_out_x3 = denseWN(x=l_out_x2, name='l3_3', num_units=32, trainable=True, shape=[32, 32])
    self.v = denseWN(x=l_out_x3, name='l4', num_units=1, trainable=True, shape=[32, 1])
Here is the code for basic layer construction:
def get_var_maybe_avg(var_name, ema, trainable, shape):
    if var_name == 'V':
        initializer = tf.contrib.layers.xavier_initializer()
        v = tf.get_variable(name=var_name, initializer=initializer, trainable=trainable, shape=shape)
    if var_name == 'g':
        initializer = tf.constant_initializer(1.0)
        v = tf.get_variable(name=var_name, initializer=initializer, trainable=trainable, shape=[shape[-1]])
    if var_name == 'b':
        initializer = tf.constant_initializer(0.1)
        v = tf.get_variable(name=var_name, initializer=initializer, trainable=trainable, shape=[shape[-1]])
    if ema is not None:
        v = ema.average(v)
    return v

def get_vars_maybe_avg(var_names, ema, trainable, shape):
    vars = []
    for vn in var_names:
        vars.append(get_var_maybe_avg(vn, ema, trainable=trainable, shape=shape))
    return vars

def denseWN(x, name, num_units, trainable, shape, nonlinearity=None, ema=None, **kwargs):
    with tf.variable_scope(name):
        V, g, b = get_vars_maybe_avg(['V', 'g', 'b'], ema, trainable=trainable, shape=shape)
        x = tf.matmul(x, V)
        scaler = g / tf.sqrt(tf.reduce_sum(tf.square(V), [0]))
        x = tf.reshape(scaler, [1, num_units]) * x + tf.reshape(b, [1, num_units])
        if nonlinearity is not None:
            x = nonlinearity(x)
        return x
Here is the code to train the network:
self.tfdc_r = tf.placeholder(tf.float32, [None, 1], 'discounted_r')
self.advantage = self.tfdc_r - self.v
l1_regularizer = tf.contrib.layers.l1_regularizer(scale=0.005, scope=None)
self.weights = tf.trainable_variables()
regularization_penalty_critic = tf.contrib.layers.apply_regularization(l1_regularizer, self.weights)
self.closs = tf.reduce_mean(tf.square(self.advantage))
self.optimizer = tf.train.RMSPropOptimizer(0.0001, 0.99, 0.0, 1e-6)
self.grads_and_vars = self.optimizer.compute_gradients(self.closs)
self.grads_and_vars = [[tf.clip_by_norm(grad,5), var] for grad, var in self.grads_and_vars if grad is not None]
self.ctrain_op = self.optimizer.apply_gradients(self.grads_and_vars, global_step=tf.contrib.framework.get_global_step())
Looks like you're facing the problem of exploding gradients with the ReLU activation function (that's what the NaNs mean: very large activations). There are several techniques to deal with this issue, e.g. batch normalization (which changes the network architecture) or careful variable initialization (that's what I'd try first).
You are using Xavier initialization for the V variables in the different layers, which indeed works fine for the logistic sigmoid activation (see the paper by Xavier Glorot and Yoshua Bengio), or, in other words, for tanh.
The preferred initialization strategy for the ReLU activation function (and its variants, including ELU) is He initialization. In TensorFlow it's implemented via tf.variance_scaling_initializer:
initializer = tf.variance_scaling_initializer()
v = tf.get_variable(name=var_name, initializer=initializer, ...)
You might also want to try smaller values for the b and g variables, but it's hard to suggest an exact value just by looking at your model. If nothing helps, consider adding batch-norm layers to your model to control the distribution of activations.
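Concretely, a simplified dense layer with He initialization might look like the sketch below. dense_he is a hypothetical illustration using the TF 1.x API from the question; it omits the weight-normalization scaling and EMA handling of denseWN.
import tensorflow as tf

def dense_he(x, name, num_units, in_dim, nonlinearity=tf.nn.relu, trainable=True):
    # like denseWN, but V uses He (variance-scaling) initialization, which suits ReLU/ELU
    with tf.variable_scope(name):
        V = tf.get_variable('V', shape=[in_dim, num_units],
                            initializer=tf.variance_scaling_initializer(),
                            trainable=trainable)
        b = tf.get_variable('b', shape=[num_units],
                            initializer=tf.constant_initializer(0.0),
                            trainable=trainable)
        y = tf.matmul(x, V) + b
        return nonlinearity(y) if nonlinearity is not None else y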

How can I change the max sequence length in a Tensorflow RNN Model?

I am currently trying to adapt my TensorFlow classifier, which can tag a sequence of words as positive or negative, to handle much longer sequences without retraining. My model is an RNN with a max sequence length of 210. One input is one word (300-dim); I vectorised the words with Google's word2vec, so I can feed a sequence of at most 210 words. My question is: how can I change the max sequence length to, for example, 3000, for classifying movie reviews?
My working model with a fixed max sequence length of 210 (tf_version: 1.1.0):
n_chunks = 210
chunk_size = 300

x = tf.placeholder("float", [None, n_chunks, chunk_size])
y = tf.placeholder("float", None)
seq_length = tf.placeholder("int64", None)

with tf.variable_scope("rnn1"):
    lstm_cell = tf.contrib.rnn.LSTMCell(rnn_size, state_is_tuple=True)
    lstm_cell = tf.contrib.rnn.DropoutWrapper(lstm_cell, input_keep_prob=0.8)
    outputs, _ = tf.nn.dynamic_rnn(lstm_cell, x, dtype=tf.float32,
                                   sequence_length=self.seq_length)

fc = tf.contrib.layers.fully_connected(outputs, 1000, activation_fn=tf.nn.relu)
output = tf.contrib.layers.flatten(fc)
#*1
logits = tf.contrib.layers.fully_connected(output, self.n_classes, activation_fn=None)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)
...
#train
#train_x padded to fit(batch_size*n_chunks*chunk_size)
sess.run([optimizer, cost], feed_dict={x: train_x, y: train_y, seq_length: seq_length})
#predict:
...
pred = tf.nn.softmax(logits)
pred = sess.run(pred,feed_dict={x:word_vecs, seq_length:sq_l})
Modifications I have already tried:
1. Replacing n_chunks with None and simply feeding the data in:
x = tf.placeholder(tf.float32, [None,None,300])
#model fails to build
#ValueError: The last dimension of the inputs to `Dense` should be defined.
#Found `None`.
# at *1
...
#all entries in word_vecs still have the same length, for example
#3000 (batch_size * 3000 (!= n_chunks) * 300)
pred = tf.nn.softmax(logits)
pred = sess.run(pred,feed_dict={x:word_vecs, seq_length:sq_l})
2. Changing x and then restoring the old model:
x = tf.placeholder(tf.float32, [None, n_chunks*10, chunk_size])
...
saver = tf.train.Saver(tf.all_variables(), reshape=True)
saver.restore(sess,"...")
#fails as well:
#InvalidArgumentError (see above for traceback): Input to reshape is a
#tensor with 420000 values, but the requested shape has 840000
#[[Node: save/Reshape_5 = Reshape[T=DT_FLOAT, Tshape=DT_INT32,
#_device="/job:localhost/replica:0/task:0/cpu:0"](save/RestoreV2_5,
#save/Reshape_5/shape)]]
# run prediction
If it is possible, could you please provide a working example, or explain why it isn't possible?
I am just wondering: why not simply assign n_chunks a value of 3000?
In your first attempt, you cannot use two Nones, since TF cannot tell how many dimensions to allocate for each one. The first dimension is set to None because it depends on the batch size. In your second attempt, you change only one place, and the other places where n_chunks is used may conflict with the x placeholder.
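Following that suggestion, one way to keep a single graph is to fix n_chunks at the largest length you expect (e.g. 3000), zero-pad shorter inputs, and pass the true lengths through seq_length so dynamic_rnn ignores the padding. A minimal sketch; pad_batch is a hypothetical helper, not part of the original code:
import numpy as np

def pad_batch(word_vec_seqs, n_chunks=3000, chunk_size=300):
    # word_vec_seqs: list of arrays of shape (len_i, chunk_size)
    # returns a zero-padded (batch, n_chunks, chunk_size) array plus the true lengths
    lengths = np.array([min(len(s), n_chunks) for s in word_vec_seqs], dtype=np.int64)
    batch = np.zeros((len(word_vec_seqs), n_chunks, chunk_size), dtype=np.float32)
    for i, s in enumerate(word_vec_seqs):
        batch[i, :lengths[i]] = s[:lengths[i]]
    return batch, lengths

# usage with the placeholders from the question:
# padded, lens = pad_batch(word_vecs)
# pred = sess.run(tf.nn.softmax(logits), feed_dict={x: padded, seq_length: lens})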
