Difficulty summing results from cypher - neo4j

I have a query
start ko=node:koid('ko:"ko:K01963"')
match p=(ko)-[r:abundance*1]-(n:Taxon)
with p, extract(nn IN relationships(p) | nn.`c1.mean`) as extracted
return extracted;
I would like to sum the values in extracted using return sum(extracted); however, this throws the following error:
SyntaxException: Type mismatch: extracted already defined with conflicting type Collection<Boolean>, Collection<Number>, Collection<String> or Collection<Collection<Any>> (expected Relationship)
Also, when I return extracted, my values are enclosed in square brackets:
+---------------+
| extracted |
+---------------+
| [258.98813] |
| [0.0] |
| [0.0] |
| [0.8965624] |
| [0.85604626] |
| [0.0] |
Any idea how I can solve this, i.e. sum the whole column that is returned?

Use reduce, which is a fold operation:
return p, reduce(sum=0,nn IN relationships(p) | sum + nn.`c1.mean`) as sum
Values in square brackets are collections/arrays.
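To see what the fold does: it threads an accumulator through the collection, combining one element at a time. The same accumulation, sketched with Ruby's inject purely for illustration (the sample values are the c1.mean numbers from the question):

```ruby
# A fold starts from an initial accumulator (here 0.0) and combines each
# element into it -- the same shape as Cypher's
# reduce(sum = 0, nn IN relationships(p) | sum + nn.`c1.mean`).
means = [258.98813, 0.0, 0.8965624]            # sample c1.mean values
total = means.inject(0.0) { |sum, v| sum + v }
```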

First off, given your use of "WITH" and labels, I'm going to assume you're using Cypher 2.x.
Also, to be honest, it's not entirely clear what you're after here, so I'm making some assumptions and stating them.
Second, some parts of the query are unnecessary. As well, the *1 in your relationship pattern means that there will only be one "hop" in your path. I don't know if that's what you're after, so I'm going to assume you may want to go several levels deep (but we'll cap it so as not to kill your Neo4j instance; alternatively, you could use something like "allShortestPaths", but we won't go into that). This assumption can easily be changed by removing the cap and specifying a single hop.
As for the results being returned in brackets: extract returns a list of values, potentially containing only a single one.
So let's rewrite the query a little (note that for ko the identifier is a little confusing above so replace it with whatever you need to).
If we assume that you just want the sum per path, we can do:
MATCH p=(ko:koid {ko: 'K01963'})-[r:abundance*1..5]-(n:Taxon)
WITH reduce(val = 0, nn IN relationships(p) | val + nn.`c1.mean`) as summed
RETURN summed;
(This can also be modified to sum over all paths, I believe.)
If we want the total sum of ALL relationships returned, we need something a bit different, and it's even simpler (assuming in this case you really do have only a single hop):
MATCH p=(ko:koid {ko: 'K01963'})-[r:abundance]-(n:Taxon)
RETURN sum(r.`c1.mean`);
Hopefully even if I'm off in my assumptions and how I've read things this will at least get you thinking in the right way.
Mostly, the idea of using paths when you only have one hop to make is a bit confusing, but perhaps this will help a little.

Related

F# - Cleanest way to Extract/Unwrap an Expected Case's Typed Value from Discriminated Union?

The overall type structure in my current F# is working very well. However, I want some perspective on whether I'm doing something incorrectly or following some kind of anti-pattern. I very often find myself expecting a particular case in logic that pulls from a more general type: a Discriminated Union unifying a bunch of distinct types that all go through layers of common processing.
Essentially I need particular versions of this function:
'GeneralDiscriminatedUnionType -> 'SpecificCaseType
I find myself repeating many statements like the following:
let checkPromptUpdated (PromptUpdated prompt) = prompt
This is the simplest way I've found to do this; however, every one of these produces a valid compiler warning (incomplete pattern match), which makes sense, since there could be a problem if the function is called with a different case than expected. That's fair, but so far I have 40 to 50 of these.
So I started trying the following out, which is actually better, because it raises a meaningful exception on incorrect usage (the two forms are equivalent):
let checkPromptUpdated input = match input with | PromptUpdated prompt -> prompt | _ -> invalidOp "Expecting Prompt"
let checkPromptUpdated = function | PromptUpdated prompt -> prompt | _ -> invalidOp "Expecting Prompt"
However, this looks a lot messier, and I'd like to find out if anyone has suggestions before I repeat this messiness all over.
Is there some way to apply this wider logic to a more general function that could then allow me to write this 50 to 100x in a cleaner and more direct and readable way?
This question is just a matter of trying to write cleaner code.
This is an example of a DU that I'm trying to write functions for to be able to pull the particular typed values from the cases:
type StateEvent =
    | PromptUpdated of Prompt
    | CorrectAnswerUpdated of CorrectAnswer
    | DifficultyUpdated of Difficulty
    | TagsUpdated of Tag list
    | NotesUpdated of Notes
    | AuthorUpdated of Author
If the checkPromptUpdated function only works on events that are of the PromptUpdated case, then I think the best design is that the function should be taking just a value of type Prompt (instead of a value of type StateEvent) as an argument:
let checkPromptUpdated prompt =
    // do whatever checks you need using 'prompt'
Of course, this means that the pattern matching will get moved from this function to a function that calls it - or further - to a place where you actually receive StateEvent and need to handle all the other cases too. But that is exactly what you want - once you pattern match on the event, you can work with the more specific types like Prompt.
This works for me
let (TypeUWantToExtractFrom unwrappedValue) = wrappedValue

Difference between discriminated Union types in F#

I'm reading about F# and looking at people's source code and I sometimes see
type test =
    | typeone
    | typetwo
And sometimes i see
type test = typeone | typetwo
One of them has a leading pipe and the other doesn't. At first I thought one was an enum and the other a discriminated union, but I think they are the same. Can someone explain the difference, if there is any?
There is no difference. These notations are completely equivalent. The leading pipe character is optional.
Having this first pipe optional helps make the code look nicer in different circumstances. In particular, if my type has many cases, and each case has a lot of data, it makes sense to put them on separate lines. In this case, the leading pipe makes them look visually aligned, so that the reader perceives them as a single logical unit:
type Large =
    | Case1 of int * string
    | Case2 of bool
    | SomeOtherCase
    | FinalCase of SomeOtherType
On the other hand, if I only need two-three cases, I can put them on one line. In that case, the leading pipe only gets in the way, creating a feeling of clutter:
type QuickNSmall = One | Two | Three
There is no difference.
In the spec, the first | is optional.
The relevant bit of the spec is this:
union-type-cases := '|'opt union-type-case '|' ... '|' union-type-case
An enum would need to give explicit values to the cases, like:
type test =
    | typeone = 1
    | typetwo = 2
As already mentioned, the leading | is optional.
The examples in the other answers do not show this, so it is worth adding that you can omit it even for a multi-line discriminated union (and include it when defining a single line union):
type Large =
    Case1 of int * string
    | Case2 of bool
    | SomeOtherCase
    | FinalCase of SomeOtherType
type QuickNSmall = | One | Two | Three
I think most people just find these ugly (myself included!) and so they are usually written the way you see in the other answers.

Can I duplicate rows with kiba using a transform?

I'm currently using your gem to transform a CSV that was web-scraped from a personnel database that has no API.
From the scraping I ended up with a CSV. I can process it fine using your gem; there's only one bit I'm wondering about.
Consider the following data:
================================
| name | article_1 | article_2 |
--------------------------------
| Andy | foo       | bar       |
================================
I can turn this into this:
==================
| name | article |
------------------
| Andy | foo     |
------------------
| Andy | bar     |
==================
(I used this tutorial to do this: http://thibautbarrere.com/2015/06/25/how-to-explode-multivalued-attributes-with-kiba/)
I'm using the RowNormalizer logic on my source for this. The code looks like:
source RowNormalizer, NormalizeArticles, CsvSource, 'RP00119.csv'
transform AddColumnEntiteit, :entiteit, "ocmw"
What I am wondering, can I achieve the same using a transform? So that the code would look like this:
source CsvSource, 'RP00119.csv'
transform NormalizeArticles
transform AddColumnEntiteit, :entiteit, "ocmw"
So the question is: can I duplicate a row with a transform class?
EDIT: Kiba 2 supports exactly what you need. Check out the release notes.
In Kiba as currently released, a transform cannot yet yield more than one row - it's either one or zero.
The Kiba Pro offering I'm building includes a multithreaded runner which happens (as a side-effect rather than an actual goal) to allow transforms to yield an arbitrary number of rows, which is what you are looking for.
But that said, without Kiba Pro, here are a number of techniques which could help.
The first possibility is to split your ETL script into 2. Essentially you would cut it at the step where you want to normalize the articles, and put a destination here instead. Then in your second ETL script, you would use a source able to explode the row into many. This is I think what I'd recommend in your case.
If you do that, you can use either a simple Rake task to invoke the ETL scripts as a sequence, or you can alternatively use post_process to invoke the next one if you prefer (I prefer the first approach because it makes it easier to run either one or another).
Another approach (but too complicated for your current scenario) would be to declare the same source N times, but only yield a given subset of data, e.g.:
pre_process do
  field_count = number_of_exploded_columns # extract from CSV?
end

(0...field_count).each do |shard|
  source MySource, shard: shard, shard_count: field_count
end
then inside MySource you would only conditionally yield, like this:
yield row if row_index % field_count == shard
That's the 2 patterns I would think of!
I would definitely recommend the first one to get started though; it's easier.
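For reference, the row-level logic such an exploding transform needs can be sketched in plain Ruby. This is not the actual Kiba API - the NormalizeArticles class and the article_1/article_2 column names are assumptions taken from the question, and a runner supporting multi-row transforms would do the flat_map wiring for you:

```ruby
# Hypothetical exploding transform (plain Ruby sketch, not Kiba's API):
# one input row with article_1/article_2 becomes one output row per article.
class NormalizeArticles
  ARTICLE_KEYS = %w[article_1 article_2].freeze

  # Returns an array of rows rather than a single row.
  def process(row)
    ARTICLE_KEYS.filter_map do |key|
      { 'name' => row['name'], 'article' => row[key] } if row[key]
    end
  end
end

rows = [{ 'name' => 'Andy', 'article_1' => 'foo', 'article_2' => 'bar' }]
exploded = rows.flat_map { |row| NormalizeArticles.new.process(row) }
# exploded: [{"name"=>"Andy", "article"=>"foo"}, {"name"=>"Andy", "article"=>"bar"}]
```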

Is it possible to parameterise the NUnit test case display name when using ``Ticked method names``?

I am testing out F# and using NUnit as my test library; I have discovered the use of double backticks to allow arbitrary method names, which makes my method names even more human-readable.
I was wondering, rightly or wrongly, whether it is possible to parameterise the method name when using NUnit's TestCaseAttribute, for example:
[<TestCase("1", 1)>]
[<TestCase("2", 2)>]
let ``Should return #expected when "#input" is supplied`` input expected =
...
This might not be exactly what you need, but if you want to go beyond unit testing, then TickSpec (a BDD framework using F#) has a nice feature where it lets you write parameterized scenarios based on back-tick methods that contain regular expressions as place holders.
For example, in Phil Trelford's blog post, he uses this to define tic-tac-toe scenario:
Scenario: Winning positions
Given a board layout:
| 1 | 2 | 3 |
| O | O | X |
| O | | |
| X | | X |
When a player marks X at <row> <col>
Then X wins
Examples:
| row | col |
| middle | right |
| middle | middle |
| bottom | middle |
The method that implements the When clause of the scenario is defined in F# using something like this:
let [<When>] ``a player marks (X|O) at (top|middle|bottom) (left|middle|right)``
    (mark:string, row:Row, col:Col) =
    let y = int row
    let x = int col
    Debug.Assert(System.String.IsNullOrEmpty(layout.[y].[x]))
    layout.[y].[x] <- mark
This is a neat thing, but it might be overkill if you just want to write a simple parameterized unit test - BDD is useful if you want to produce human-readable specifications of different scenarios (and there are actually other people reading them!)
This is not possible.
The basic issue is that for every input and expected you need to create a unique function. You would then need to pick the correct function to call (or your stacktrace wouldn't make sense). As a result this is not possible.
Having said that if you hacked around with something like eval (which must exist inside fsi), it might be possible to create something like this, but it would be very slow.

fitnesse: howto load a symbol with a result in query table

In a FitNesse query table, is it possible to load a symbol with the returned results?
example of what I would like to do:
|Query: GetPlayers|
| name | age | ID |
| jones | 36 | $ID1= |
| smith | 27 | $ID2= |
Or alternatively, just have one $ID symbol which is loaded with a collection.
Is this possible?
Unfortunately, I believe this is still an unresolved issue in FitNesse. There is a PivotalTracker entry for it that no one has taken on yet: https://www.pivotaltracker.com/story/show/1893214. I've looked at it, but haven't been able to solve it myself.
We currently work around this by having a driver that can do an equivalent query; then we get the value back from that query. It is much more cumbersome, but it works for now.
I completely agree that this should be possible. But as far as I know, it has not been fixed yet.
Maybe I don't understand your problem, but this is working fine for me:
|Query: whatever|whatever_param |
|key |value |
|a_key |$symbol= |
|check |$symbol|a_value|
I use CSlim, and the method whatever.query() returns a list of maps that correspond to the keys (the key a_key has the value a_value in this example).
