Datastax Enterprise Graph GroupCount Order - datastax-enterprise

I have multiple documents
I have multiple users
Users read documents
I want to get the documents read by the user Stefan:
g.V().
has('user','name','Stefan').
out('read').
hasLabel('document')
Next, I want to find which other users have read the same documents, and which documents those users have read that Stefan has not:
g.V().
has('user','name','Stefan').
out('read').
hasLabel('document').
in('read').
has('user','name',neq('Stefan')).
out('read').
match(
__.as('d').hasLabel('document'),
__.not(__.as('d').hasLabel('document').in('read').has('user','name','Stefan'))
).
select('d').valueMap('docId','title')
Now I want to sort this by the number of users who read each document, using groupCount:
g.V().
has('user','name','Stefan').
out('read').
hasLabel('document').
in('read').
has('user','name',neq('Stefan')).
out('read').
match(
__.as('d').hasLabel('document'),
__.not(__.as('d').hasLabel('document').in('read').has('user','name','Stefan'))
).
select('d').valueMap('docId','title').
groupCount().by().
order(local).by(values,decr)
This works, but the result is not what I want:
{
"{docId=[33975], title=[Doc1 - 5 - 1]}": 3,
"{docId=[33379], title=[Doc2 - 5]}": 2,
"{docId=[32474], title=[Doc3 - 5]}": 2,
"{docId=[31150], title=[Doc4 2-2013]}": 1,
"{docId=[107944], title=[Doc5]}": 1
}
The key is folded. I want to have 3 columns in my result:
DocId
Title
GroupCount value.
How can I do this?
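The reshaping being asked for can be pictured with a small Python sketch (not Gremlin): count each document, sort by count in descending order, and emit one flat row per document instead of a map with a folded key. The `reads` list is hypothetical sample data that mirrors the counts shown above; it is not output from DSE Graph, and `group_count_rows` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical traversal output: one (docId, title) pair per reader reached.
reads = (
    [(33975, "Doc1 - 5 - 1")] * 3
    + [(33379, "Doc2 - 5")] * 2
    + [(32474, "Doc3 - 5")] * 2
    + [(31150, "Doc4 2-2013")]
    + [(107944, "Doc5")]
)


def group_count_rows(pairs):
    """Count each (docId, title) pair and return flat rows, highest count first."""
    counts = Counter(pairs)
    # most_common() sorts by count descending; ties keep first-seen order.
    return [
        {"docId": doc_id, "title": title, "count": n}
        for (doc_id, title), n in counts.most_common()
    ]


rows = group_count_rows(reads)
for row in rows:
    print(row["docId"], row["title"], row["count"], sep="\t")
```

In Gremlin terms this corresponds roughly to unfolding the grouped map into its entries and projecting each entry into three named values; whether that maps cleanly onto the steps available in a given DSE Graph version has not been verified here.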

Related

Filter based on other filters in Google App Script

I would like to see/view all my contacts based on the same company name, whenever I filter my worksheet.
For example, my table:
If I filter on Contact Name "Adi", I would like to see this:
Because Adi and Dan belong to the same company. As another example, if I filter on
the Last Modified field with "3/05/2020", the result should be:
Again, because Adi and Dan belong to the same company. A solution in Apps Script would be fine as well.
try:
=FILTER(A:C, REGEXMATCH(A:A, VLOOKUP(G1, {B:B, A:A}, 2, 0)))
=A2=VLOOKUP(G$1, {B:B, A:A}, 2, 0)

How to count cypher labels with specific condition?

I have a graph database with information about different companies and their subsidiaries. Now my task is to display the structure of the company. This I have achieved with d3 and vertical tree.
But additionally I have to write summary statistics about the company that is currently displayed. Companies can be chosen from a dropdown list which is fetching this data dynamically via AJAX call.
I have to write in the same HTML a short summary like :
Total amount of subsidiaries for CompanyA: 300
Companies in Corporate Havens : 45%
Companies in Tax havens 5%
My database has two kinds of nodes, Company and Country, and a country can carry an additional label such as CH or TH.
CREATE (:TH:Country{name:'Nauru', capital:'Yaren', lng:166.920867,lat:-0.5477})
WITH 1 as dummy MATCH (a:Company), (b:Country) WHERE a.name='CompanyA' AND b.name='Netherlands' CREATE (a)-[:IS_REGISTERED]->(b)
So how can I find the number of subsidiaries of CompanyA that are registered in corporate havens and in tax havens? And how do I pass this information on to the HTML?
I found various Cypher queries that list all the labels, as well as APOC's stats procedures, but these do not let me filter on the mother company. I would appreciate any help.
Cypher is nice because you can write a query in something close to natural language (the query below may be incorrect, as I did not check it, but the idea is clear):
MATCH (motherCompany:Company {name: 'CompanyA'})-[:HAS_SUBSIDIARY]->(childCompany:Company)
WITH motherCompany,
childCompany
MATCH (childCompany)-[:IS_REGISTERED]->(country:Country)
WITH motherCompany,
collect(labels(country)) AS countriesLabels
WITH motherCompany,
countriesLabels,
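// Count each label explicitly, since a country may carry neither TH nor CH
size([countryLabels IN countriesLabels WHERE 'TH' IN countryLabels]) AS inTaxHaven,
size([countryLabels IN countriesLabels WHERE 'CH' IN countryLabels]) AS inCorporateHaven
RETURN motherCompany,
size(countriesLabels) AS total,
inTaxHaven,
inCorporateHaven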
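To turn such counts into the percentages wanted for the HTML summary, the arithmetic could look like the following Python sketch. The `sample` list is hypothetical data standing in for the `countriesLabels` collection returned by the query, and `haven_summary` is an illustrative name; how the result reaches the page (for example as JSON in the AJAX response) is left open.

```python
def haven_summary(country_labels):
    """Summarise subsidiaries by the labels of the country each is registered in.

    country_labels: one list of labels per subsidiary, e.g. ["Country", "TH"].
    """
    total = len(country_labels)
    in_tax = sum(1 for labels in country_labels if "TH" in labels)
    in_corporate = sum(1 for labels in country_labels if "CH" in labels)

    def pct(n):
        # Guard against a company with no subsidiaries.
        return round(100.0 * n / total, 1) if total else 0.0

    return {
        "total": total,
        "tax_haven_pct": pct(in_tax),
        "corporate_haven_pct": pct(in_corporate),
    }


# Hypothetical sample: 20 subsidiaries, 9 in CH countries, 1 in a TH country.
sample = [["Country", "CH"]] * 9 + [["Country", "TH"]] + [["Country"]] * 10
summary = haven_summary(sample)
print(summary)
```

Counting each label separately matters when some countries carry neither label, because then the two percentages do not add up to 100.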

Returning a count of and all complete paths in Neo4j [closed]

Being an absolute noob in neo4j and having had very generous help with a previous question, I thought I'd try my luck once again as I'm still struggling.
The example scenario is that of students who enter a house and walk from one room to another. A journey doesn't have to start or end at a particular room, but the order in which a student enters the rooms is important.
What I want to find out is all the complete paths that students have taken along with a count of how many times the path in question was taken. Below is the sample data and what I've tried (thanks to the answer of a previous question along with a series of blog posts):
The file dorm.csv:
ID|SID|EID|ROOM|ENTERS|LEAVES
1|1|12|BLUE|1/01/2015 11:00|4/01/2015 10:19
2|2|18|GREEN|1/01/2015 12:11|1/01/2015 12:11
3|2|18|YELLOW|1/01/2015 12:11|1/01/2015 12:20
4|2|18|BLUE|1/01/2015 12:20|5/01/2015 10:48
5|3|28|GREEN|1/01/2015 18:41|1/01/2015 18:41
6|3|28|YELLOW|1/01/2015 18:41|1/01/2015 21:00
7|3|28|BLUE|1/01/2015 21:00|9/01/2015 9:30
8|4|36|BLUE|1/01/2015 19:30|3/01/2015 11:00
9|5|40|GREEN|2/01/2015 19:08|2/01/2015 19:08
10|5|40|ORANGE|2/01/2015 19:08|3/01/2015 2:43
11|5|40|PURPLE|3/01/2015 2:43|4/01/2015 16:44
12|6|48|GREEN|3/01/2015 11:52|3/01/2015 11:52
13|6|48|YELLOW|3/01/2015 11:52|3/01/2015 17:45
14|6|48|RED|3/01/2015 17:45|7/01/2015 10:00
Creating nodes for Student, Room and Visit, where a Visit is the event of a student entering a room and is uniquely identified by the ID property:
CREATE CONSTRAINT ON (student:Student) ASSERT student.studentID IS UNIQUE;
CREATE CONSTRAINT ON (room:Room) ASSERT room.roomID IS UNIQUE;
CREATE CONSTRAINT ON (visit:Visit) ASSERT visit.visitID IS UNIQUE;
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///dorm.csv" as line fieldterminator '|'
MERGE (student:Student {studentID: line.SID})
MERGE (room:Room {roomID: line.ROOM})
MERGE (visit:Visit {visitID: line.ID, roomID: line.ROOM, studentID: line.SID, ticketID: line.EID})
create (student)-[:VERB]->(visit)-[:OBJECT]->(room)
Creating a PREV relationship allows the ordering or sequencing that the student travels in. This uses data in the file dormprev.csv. If a student has only visited a single room, this ID will not appear in the dormprev file as its purpose is to link/chain visits. Data as below
ID|PREV_ID|EID
3|2|18
4|3|18
6|5|28
7|6|28
10|9|40
11|10|40
13|12|48
14|13|48
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///dormprev.csv" as line fieldterminator '|'
MATCH (new:Visit {visitID: line.ID})
MATCH (old:Visit {visitID: line.PREV_ID})
MERGE (new)-[:PREV]->(old)
I can view all student journeys by the below query
MATCH (student:Student)-[:VERB]->(visit:Visit)-[:OBJECT]-(room:Room)
RETURN student, visit, room
However, I have no idea how to return all of the rooms in a complete path.
if I run this query
MATCH p = (:Visit)<-[:PREV]-(:Visit) return p
I can see that, for student ID 2 for example, it returns Green and Yellow and then Yellow and Blue as separate pairs, whereas I want to see that as Green, Yellow, Blue.
This also means that if I run the below query:
MATCH p = (:Visit)<-[:PREV]-(:Visit)
WITH p, EXTRACT(v IN NODES(p) | v.roomID) AS rooms
UNWIND rooms AS stays
WITH p, COUNT(DISTINCT stays) AS distinct_stays
WHERE distinct_stays = LENGTH(NODES(p))
RETURN EXTRACT(v in NODES(p) | v.roomID), count(p)
ORDER BY count(p) DESC
it will return a count of those pairings rather than count of "whole paths" if that makes sense.
For example, SID 2 and SID 3 both visit rooms GREEN, YELLOW, BLUE in that order. SID 5 visits GREEN, ORANGE, PURPLE in that order.
What I'm hoping to see is:
[GREEN, YELLOW, BLUE] 2
[GREEN, ORANGE, PURPLE] 1
etc. Is that possible with the above model and if so can anyone please help point me in the right direction? The number of rooms that are visited is not guaranteed and can be anything from one to *. However, if only one room is visited, that's not really of interest and so is the reason why I thought this model might make sense (again, stolen from a blog post series).
I don't know if the above makes sense but any help would be much appreciated - this makes for an excellent use case and would be really useful.
Thank you for your kind help.
What I think you are looking for is variable path length. And you can accomplish that by merely changing this in your query (note the asterisk) :
MATCH p = (:Visit)<-[:PREV*]-(:Visit)
Do allow me a couple of further remarks. Yes, I understand the convenience of having roomID and studentID in the Visit node (keeps this specific query quite a bit simpler), but you are ignoring the whole point of having relationships in the first place (in fact, if you do it this way there's currently actually no point in having the Student and Room nodes at all) and you are going to have trouble maintaining them. Secondly ... if we are going to be splitting the proverbial 3rd normal form hairs ;-), then the relations for a Visit should actually be created as follows (note the direction of the relationships) :
CREATE (student)-[:VERB]->(visit)<-[:OBJECT]-(room)
Other than that I must say you're moving very fast :-)
Hope this helps, Tom
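One point worth keeping in mind with the variable-length pattern: it matches every chain of PREV links, including the shorter sub-chains inside a longer journey. A "complete" path is one that starts at a visit with no predecessor and ends at a visit with no successor. The Python sketch below illustrates that idea on the data from dorm.csv and dormprev.csv; it assumes each visit has at most one successor, and `complete_paths` is an illustrative name, not part of any Neo4j API.

```python
from collections import Counter

# visitID -> room, taken from dorm.csv above.
rooms = {
    1: "BLUE", 2: "GREEN", 3: "YELLOW", 4: "BLUE", 5: "GREEN",
    6: "YELLOW", 7: "BLUE", 8: "BLUE", 9: "GREEN", 10: "ORANGE",
    11: "PURPLE", 12: "GREEN", 13: "YELLOW", 14: "RED",
}
# (ID, PREV_ID) pairs, taken from dormprev.csv above.
prev_links = [(3, 2), (4, 3), (6, 5), (7, 6), (10, 9), (11, 10), (13, 12), (14, 13)]


def complete_paths(rooms, prev_links):
    """Return a Counter of full room sequences, ignoring single-room visits."""
    next_of = {prev: new for new, prev in prev_links}  # follow the chain forward
    has_prev = {new for new, _ in prev_links}
    counts = Counter()
    for start in next_of:
        if start in has_prev:
            continue  # not the first visit of a journey, so only a sub-path
        path, current = [rooms[start]], start
        while current in next_of:
            current = next_of[current]
            path.append(rooms[current])
        counts[tuple(path)] += 1
    return counts


path_counts = complete_paths(rooms, prev_links)
for path, n in path_counts.most_common():
    print(list(path), n)
```

In the graph query, the same restriction would mean additionally requiring that the first visit of the matched path has no outgoing PREV relationship and the last one has no incoming PREV relationship.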
Building a bit on Tom's suggestions, you might consider an alternate model doing away with :Visit nodes completely, and making your relationship types a bit more focused, like this:
(:Student)-[:VISITED]->(:Room)
You can set entered and left properties on the :VISITED relationship, which will allow you to order the relationships (and corresponding :Rooms) in visited order.
Here's an alternate import that will do this, using APOC Procedures (you'll have to install the correct version corresponding with your Neo4j version) to parse out timestamps from your date strings.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///dorm.csv" as line fieldterminator '|'
MERGE (student:Student {studentID: line.SID})
MERGE (room:Room {roomID: line.ROOM})
WITH student, room, apoc.date.parse(line.ENTERS, 'ms', 'MM/dd/yyyy HH:mm') as entered, apoc.date.parse(line.LEAVES, 'ms', 'MM/dd/yyyy HH:mm') as left
CREATE (student)-[r:VISITED]->(room)
SET r.entered = entered, r.left = left
And now your query to get all paths and the number of students who have taken those paths becomes very easy:
MATCH (s:Student)-[v:VISITED]->(r:Room)
WHERE size((s)-[:VISITED]->()) > 1
WITH s, r
ORDER BY v.entered ASC
WITH s, collect(r.roomID) as rooms
RETURN rooms, count(s)
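The ordering idea behind this query can be checked outside the database with a short Python sketch over the rows of dorm.csv. Two things in it are assumptions rather than facts from the post: the date strings are parsed as day/month (the file itself is ambiguous), and ties on the entry time are broken by the leave time, because in the sample data a student can enter two rooms within the same minute.

```python
from collections import Counter, defaultdict
from datetime import datetime

# (SID, ROOM, ENTERS, LEAVES) rows copied from dorm.csv above.
ROWS = [
    ("1", "BLUE", "1/01/2015 11:00", "4/01/2015 10:19"),
    ("2", "GREEN", "1/01/2015 12:11", "1/01/2015 12:11"),
    ("2", "YELLOW", "1/01/2015 12:11", "1/01/2015 12:20"),
    ("2", "BLUE", "1/01/2015 12:20", "5/01/2015 10:48"),
    ("3", "GREEN", "1/01/2015 18:41", "1/01/2015 18:41"),
    ("3", "YELLOW", "1/01/2015 18:41", "1/01/2015 21:00"),
    ("3", "BLUE", "1/01/2015 21:00", "9/01/2015 9:30"),
    ("4", "BLUE", "1/01/2015 19:30", "3/01/2015 11:00"),
    ("5", "GREEN", "2/01/2015 19:08", "2/01/2015 19:08"),
    ("5", "ORANGE", "2/01/2015 19:08", "3/01/2015 2:43"),
    ("5", "PURPLE", "3/01/2015 2:43", "4/01/2015 16:44"),
    ("6", "GREEN", "3/01/2015 11:52", "3/01/2015 11:52"),
    ("6", "YELLOW", "3/01/2015 11:52", "3/01/2015 17:45"),
    ("6", "RED", "3/01/2015 17:45", "7/01/2015 10:00"),
]

FMT = "%d/%m/%Y %H:%M"  # assumption: day/month order


def room_sequences(rows):
    """Count ordered room sequences per student, skipping single-room students."""
    visits = defaultdict(list)
    for sid, room, enters, leaves in rows:
        # Sort key: entry time first, leave time as tie-breaker.
        key = (datetime.strptime(enters, FMT), datetime.strptime(leaves, FMT))
        visits[sid].append((key, room))
    counts = Counter()
    for stays in visits.values():
        if len(stays) > 1:
            stays.sort(key=lambda stay: stay[0])
            counts[tuple(room for _, room in stays)] += 1
    return counts


seq_counts = room_sequences(ROWS)
for seq, n in seq_counts.most_common():
    print(list(seq), n)
```

If the entry time alone is used as the sort key, GREEN and YELLOW tie for students 2, 3, 5 and 6 in this data, so a secondary key of some kind may be needed in the query as well.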

Umbraco - Search all fields of all documents which are children of a list of parents

Our Umbraco 7 site has the following 4 top level nodes.
1 - Home 1 (Language 1)
2 - Home 2 (Language 2)
3 - Discuss (Bilingual, i.e. Language 1 and Language 2)
4 - Buy (Bilingual, i.e. Language 1 and Language 2)
Depending on whether the user is on a Language 1 or Language 2 page, I would like to search all of the fields of all of the documents of all children of 1, 3, 4 (if the current page is Language 1) or 2, 3, 4 (if the current page is Language 2).
Up until now I have been using a very basic search, where the user simply enters a value "query" into a text box:
IEnumerable<IPublishedContent> c = Umbraco.TypedSearch(query);
This would be ideal, except that it scans all of the documents (i.e. the root and children of 1, 2, 3, and 4) and does not exclude the documents which are children of 1 or 2, depending on the language.
I believe that I need to set up an Examine Search Provider and an Examine Index for Language 1 and Language 2, but I'm not sure how to set up multiple IndexParentId values, nor how to scan all of the fields in all documents.
Could anyone point me in the right direction?
I realise that scanning all the fields may not always be a good idea; however, we currently have many different field names for the sections containing "content" in our doc types, so right now this is the best approach for me.
(I originally posted this on the Umbraco forum a week ago, but haven't had a response, hence my post on here)
You could make your query filter by path.
Say Language 1 root node ID is 100 and Language 2 root node ID is 200. Then if you're on Language 1, query pages where path does not contain 200 and vice versa.
Something like this: http://www.attackmonkey.co.uk/blog/2011/12/limiting-an-examine-search-to-the-current-site
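The path-based filter described above can be sketched in Python as a post-filter over search hits. This is only an illustration of the logic, not Umbraco or Examine code: the `hits` list, its field names, and the node IDs 300 and 400 are hypothetical, while 100 and 200 are the example root IDs used above. It assumes each hit exposes its ancestor path as a comma-separated string of node IDs.

```python
def exclude_subtree(hits, excluded_root_id):
    """Keep only hits whose ancestor path does not include excluded_root_id."""
    excluded = str(excluded_root_id)
    # Compare whole path segments so that 200 does not also match 2001.
    return [hit for hit in hits if excluded not in hit["path"].split(",")]


# Hypothetical search results, one per top-level section.
hits = [
    {"name": "Home 1 article", "path": "-1,100,1101"},
    {"name": "Home 2 article", "path": "-1,200,1201"},
    {"name": "Discuss thread", "path": "-1,300,2001"},
    {"name": "Buy page", "path": "-1,400,1401"},
]

# On a Language 1 page, drop everything under the Language 2 root (200).
language_1_hits = exclude_subtree(hits, 200)
print([hit["name"] for hit in language_1_hits])
```

Matching on whole segments rather than on a substring avoids accidentally excluding a node whose ID merely contains the excluded root's digits.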

Search four fields in YQL geo.places

We are currently using YQL to query geo data for towns and counties in the UK. At the moment, we can use the following query to find all towns named Boston:
select * from geo.places where text="boston" and placeTypeName="Town"
The issue is that we would like to specify the county and country to get more specific results. I have tried the following query, but it returns 0 results:
select * from geo.places where (text="boston" and placeTypeName="Town") and (text="lincolnshire" and placeTypeName="County")
How can I query 3 field types to return the results I need? Essentially, we would like to query the following fields:
text and placeTypeName="Town"
text and placeTypeName="County"
text and placeTypeName="Country"
This may be an option:
https://developer.yahoo.com/blogs/ydnsevenblog/solving-location-based-services-needs-yahoo-other-technology-7952.html
As it mentions:
Turning text into a location
You can also turn a text (name) into a location using the following code:
yqlgeo.get('paris,fr',function(o){
alert(o.place.name+' ('+
o.place.centroid.latitude+','+
o.place.centroid.longitude+
')');
})
This wrapper call uses our Placemaker Service under the hood and automatically disambiguates for you. This means that Paris is Paris, France, and not Paris Hilton; London is London, England, and not Jack London.
