Hey, I am trying to calculate the inverse of a value that I get from an API call. My tasks look like this.
Tasks
parse [type="jsonparse" path="quotes,USDEUR" data="$(fetch)"]
inverse [type="divide" input="1" divisor="$(parse)" precision="5"]
multiply [type="multiply" input="$(inverse)" times=1000000000000000000]
But I am getting a response like this.
Response
parse (jsonparse)
    output: 1.022015
    path: quotes,USDEUR
    data: $(fetch)
inverse (divide)
    output: "1"
    input: 1
    divisor: $(parse)
    precision: 5
multiply (multiply)
    output: "1000000000000000000"
    input: $(inverse)
    times: 1000000000000000000
What I need is the inverse of the value coming from the parse task. But when I do this, I just get 1.
Related
I am trying to understand the difference between these two approaches and the error I am receiving. I followed this tutorial to practice KNN with my own data (https://towardsdatascience.com/create-a-similarity-graph-from-node-properties-with-neo4j-2d26bb9d829e).
During the process we project our graph of interest, which in my case contains three node properties: bd_load, weight, and length of organisms. In the example we use the code below to create scaledProperties embeddings from the 3 variables.
Project graph
//(5) project graph of interest
CALL gds.graph.project('bd_graph',
'node_sim',
'*',
{nodeProperties:['bd_load', 'weight', 'length']})
Scale variables of interest between 0-1 for future Euclidean distance calculation
//(6) add scalar 0-1
CALL gds.alpha.scaleProperties.mutate('bd_graph',
{nodeProperties:['bd_load', 'weight', 'length'],
scaler:'MinMax',
mutateProperty:'scaledProperties'})
YIELD nodePropertiesWritten
We can then run KNN based on Euclidean distance:
//(8) project relationship to graph
CALL gds.knn.mutate("bd_graph",
{nodeProperties: {scaledProperties: "EUCLIDEAN"},
topK: 15,
mutateRelationshipType: "IS_SIMILAR",
mutateProperty: "similarity",
similarityCutoff: 0.6409912109375,
sampleRate:1,
randomSeed:42,
concurrency:1}
)
However, as I continue the learning curve with Neo4j and FastRP, I am trying to understand the difference between scaleProperties and FastRP. Today I tried to create graph embeddings for my 3 variables using FastRP with 8 dimensions on my projected graph, without running the scaled-properties step first. My thought was that increasing the dimensions would be better for finding similarities between nodes. The code below runs fine and produces an embedding vector with 8 elements.
FastRP
CALL gds.fastRP.mutate(
'bd_graph',
{
embeddingDimension: 8,
mutateProperty: 'fastrp-embedding',
featureProperties: ['bd_load', 'weight', 'length']
}
)
YIELD nodePropertiesWritten
But when I run the code below:
CALL gds.knn.stats("bd_graph",
{
nodeProperties:{fastrp-embedding:"EUCLIDEAN"},
topK:10,
sampleRate:1,
randomSeed:42,
concurrency:1
}
) YIELD similarityDistribution
RETURN similarityDistribution
I receive an error:
Invalid input '{': expected "+" or "-" (line 4, column 22 (offset: 97))
nodeProperties:{fastrp-embedding:"EUCLIDEAN"},
Does the embedding length have to match the number of properties on the node? Am I using FastRP correctly, and is my understanding of creating embeddings within nodes and then calculating Euclidean distance for a similarity score correct?
I am glad you are finding the tutorial helpful and getting into GDS!
Map keys in Cypher must be strings. https://neo4j.com/docs/cypher-manual/current/syntax/maps/
The - in your property name fastrp-embedding is not a valid character in an unquoted map key. If you enclose that property name in backticks, GDS will treat the special character as part of the map key. This should work for you:
CALL gds.knn.stats("bd_graph",
{
nodeProperties:{`fastrp-embedding`:"EUCLIDEAN"},
topK:10,
sampleRate:1,
randomSeed:42,
concurrency:1
}
) YIELD similarityDistribution
RETURN similarityDistribution
The recommended format for Neo4j property names is camelCase. If you name your property fastrpEmbedding instead of fastrp-embedding, you will not need the backticks.
I am using the function cv.matchTemplate to try to find template matches.
result = cv.matchTemplate(img, templ, match_method)
After I run the function I have a bunch of values in result. I want to filter them down to the best n matches. The data in result is just a large array of numbers, so I don't know what criteria to filter on. Using extremes = cv.minMaxLoc(result, None) filters the result in an undesired way before converting to locations.
The match_method is cv.TM_SQDIFF. I want to:
filter the results down to the best matches
Use the results to obtain the locations
How can I achieve this?
You can threshold the result of matchTemplate to find locations with a sufficient match. This tutorial should get you started. Read at the bottom of the page for finding multiple matches.
import numpy as np
import cv2

threshold = 0.2
loc = np.where(result <= threshold)  # filter the results; with cv.TM_SQDIFF lower means a better match
for pt in zip(*loc[::-1]):  # pt marks the (x, y) location of a match
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)  # w, h are the template width and height
Keep in mind that the matching method you use determines how you filter. cv.TM_SQDIFF tends toward zero as match quality increases, so setting the threshold closer to zero keeps only the better matches. The opposite is true for the cv.TM_CCORR, cv.TM_CCORR_NORMED, cv.TM_CCOEFF and cv.TM_CCOEFF_NORMED matching methods (higher values mean better matches, with the normalized methods tending toward 1).
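With those maximization methods, the comparison in the snippet above simply flips, for example:
loc = np.where(result >= threshold)  # with e.g. cv.TM_CCOEFF_NORMED you would pick a threshold closer to 1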
The above answer does not find the best N matches as the question asked. It filters results based on a threshold, leaving open the (likely) possibility that you still have more than N results, or zero results that beat the threshold.
To find the N 'best matches' we need the indexes of the N best scores in the 2D result array so we know their locations. With cv.TM_SQDIFF the best matches are the N lowest values (with the correlation-based methods they would be the N highest). We can use numpy.argpartition to pick out those N indexes in a 1D array, and numpy.ndarray.flatten with numpy.unravel_index to go back and forth between the 2D and 1D views, like so:
find_num = 5
result = cv.matchTemplate(img, templ, match_method)
# cv.TM_SQDIFF: lower is better, so take the indexes of the N smallest values
idx_1d = np.argpartition(result.flatten(), find_num)[:find_num]
# for a maximization method (e.g. cv.TM_CCOEFF_NORMED) you would instead use:
# idx_1d = np.argpartition(result.flatten(), -find_num)[-find_num:]
idx_2d = np.unravel_index(idx_1d, result.shape)
From here idx_2d holds the (row, column), i.e. (y, x), locations of the top 5 matches.
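argpartition does not return the indexes in sorted order, so if you also want the matches ordered best-first, here is a small follow-up sketch using the arrays above:
scores = result[idx_2d]                 # the N selected scores
order = np.argsort(scores)              # ascending, i.e. best first for cv.TM_SQDIFF
top_xy = list(zip(idx_2d[1][order], idx_2d[0][order]))  # (x, y) pairs, best match first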
I have a distributed dask array with shape (2400, 2400) and chunksize (100, 100). I thought I could use topk(-n) to find the smallest n values. However, it appears to return an array of shape (2400, n), so it looks like it finds the smallest n in each row. Is there a way to use topk to get the smallest n values across all rows (the entire array)?
One idea is to call topk twice, once for each axis.
>>> dist
dask.array<pow, shape=(2400, 2400), dtype=float64, chunksize=(100, 100)>
>>> dist.topk(-5,axis=0).topk(-5,axis=1).compute()
array([[ 0. , 2620.09503644, 2842.15200157, 2955.08409356,
3163.49458669],
[3660.67698657, 3670.4457495 , 3700.09837707, 3717.09052889,
4002.86497399],
[4125.89820524, 4139.44658137, 4250.50420539, 4331.01304547,
4402.14606754],
[4328.22966119, 4378.25193428, 4507.94409903, 4522.4913488 ,
4555.06860541],
[4441.58755402, 4560.95625938, 4576.39333974, 4682.06215251,
4765.11531865]])
"One idea is to call topk twice, once for each axis."
Sounds good to me!
You might consider flattening the array first, but I can't see an advantage of this over what you've already found:
x.flatten().topk(...)
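As a minimal sketch of that approach, assuming x is the (2400, 2400) dask array from the question:
smallest = x.flatten().topk(-5).compute()  # 1-D array of the 5 smallest values across the whole array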
I have a dataset that determines whether a student will be admitted given two scores. I train my model with this data and can determine if a student will be admitted or not using the following code:
model.predict([score1, score2])
This results in the answer:
[1]
How can I get the probability of that? If I try predict_proba, I get:
model.predict_proba([score1, score2])
>>[[ 0.38537034 0.61462966]]
I'd really like to see something like:
>> [0.75]
to indicate that P(admittance | score1, score2) = 0.75
You may notice that 0.38537034 + 0.61462966 = 1. This is because you are getting the probabilities for both classes (admitted and not admitted) from the output of predict_proba. If you had 7 classes, you would instead get something like [[p1, p2, p3, p4, p5, p6, p7]] where p1+p2+p3+p4+p5+p6+p7 = 1 and pi >= 0. So if you want the probability of class i, you index into your result and look at what pi is. That's just how it works.
So if you had something where the probability was 0.75 of being not admitted, you would get a result that looks like [[0.25, 0.75]].
(I may have reversed the ordering you used in your code for admitted/not admitted, but it doesn't matter - that just changes the index you look at).
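For example, a minimal sketch of pulling out a single class probability, assuming admitted is encoded as class 1 (you can check the column order with model.classes_):
proba = model.predict_proba([[score1, score2]])  # shape (1, 2): [[P(not admitted), P(admitted)]]
p_admitted = proba[0, 1]                         # e.g. 0.6146... for the output above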
If you are using sklearn's LogisticRegression model and want the predicted probabilities of the two classes, you should use this:
model.predict_proba(xtest)
You will get an array with the probabilities of the two classes (shape N x 2).
I'm trying to go through the procedure of speech synthesis via an AR model, also called LPC synthesis or an IIR all-pole filter model, whatever you call it.
The main idea is to get the autoregressive (AR) coefficients and the estimation error; then, by filtering the estimation error with the AR coefficients, we can get the reconstructed signal.
MATLAB code
data = [1 2 1 3 5 1 2 5];
% linear prediction (AR) coefficients
a = lpc(data, 4);
% estimated signal
est = filter([0 -a(2:end)],1,data);
% estimated error
e = data - est;
% reconstructed signal
rec = filter(1,a,e);
You will see that rec == data exactly.
Now comes my question.
I'm trying to convert the model into a lattice implementation. After looking up the MATLAB reference, it turned out that I should use
tf2latc
to convert the transfer function into a lattice implementation, and
latcfilt
to use the lattice to filter the data.
Simply repeating the procedure above just doesn't work.
So I'm looking for help in the following aspects:
1) An example of using the tf2latc and latcfilt functions to perform the complete procedure of building the filter.
2) An example of using a lattice implementation to perform voice reconstruction.
Thx
Well, finally I got the answer.
From a transfer function, we can get the lattice implementation coefficients with tf2latc, then filter with latcfilt:
a = [1 3 1 4 4];           % denominator (all-pole) coefficients
[k, v] = tf2latc(1, a)     % reflection (lattice) and ladder coefficients
x = [1 2 1 3 4 1 5];
filter(1, a, x)            % direct-form all-pole filtering
latcfilt(k, v, x)          % lattice-ladder filtering of the same signal
Then you can see that the two filters give the same result.