Graph database - how to reference Nodes from within Relationship - neo4j

I'm preparing a graph database (using neo4j) to handle the following kind of social network scenario:
Users can Post to their walls, sharing the Posts with either specific users
Users can send Messages to others
A Message can either be a text or "link" to the Post
So I came up with the following idea:
Users and Posts are the Nodes of the graph. When the user A creates a post P sharing it with both B and C, the following relationships are created: A-[:posted]->p and p-[:shared_with]->B and p-[:shared_with]->C. The Post data (id, text, date) are stored as properties of the :posted relation.
For messages it's similar: A-[:messaged]->C for example.
Now, if I want to share the post in a message, I include the post_id in :messaged properties. It allows me to pull all the messages (together with the posts linked) with a single Cypher query:
match (a:User) where a.username = "C"
match (a)-[m:messaged]-(b)
optional match (x:User)-[p:posted]->(post:Post)
where post.id = m.post_id
return distinct
m.post_id,
startnode(m).username as from,
endnode(m).username as to,
x.username as PostAuthor,
case when x is null then "Message" else "Pulled Post" end as Type,
case when x is null then m.text else post.text end as Text,
m.date
order by m.date asc
It doesn't look right to me though - since on the graph there's no visual connection between the Post and the message. But, I can't set a relation between Node and Relation, right? How should I do it in order to have it designed properly?

In a model where posts and messages are both nodes, your query would look like this:
match (a:User)<-[:FROM]-(m:Message)-[:TO]->(b:User)
where a.username = "C"
optional match (m)<-[:COMMENT]-(post:Post)<-[:POSTED]-(x:User)
return distinct
m.id, a as from, b as to,
x.username as PostAuthor,
case when x is null then "Message" else "Pulled Post" end as Type,
case when x is null then m.text else post.text end as Text,
m.date
order by m.date asc


How to limit overall union in cypher

I have these nodes:
user{user_id}: users
thread{thread_id, post_date} : posts
tag{tag_id}: the tag of the post
And these relationships:
(user) - [: FOLLOWED] -> (tag) // the user follows the tag
(thread) - [: BELONG_TO] -> (tag) // the post belongs to tag
(user) - [: READ{read_date}] -> (thread) // user reads the post
(user) - [: BEING_REPLIED{post_date}] -> (thread) // the user is given a reply by another user to his / her comment in a post
(user) - [: BEING_MENTIONED{post_date}] -> (thread) // the user is given a mention by another user comment in a post
I want to get 10 posts for each user's feed: first the posts in which the user was replied to or mentioned by another user, then the posts that belong to tags the user follows but that the user has not read. I use multiple UNIONs in the query but cannot limit the total; the LIMIT applies only to the last UNION branch.
I wrote the Cypher as follows:
MATCH (u:User {user_id:3})-[rp:BEING_REPLIED]->(th:Thread)<-[r:READ]-(u:User {user_id:3})
WHERE rp.post_date> r.read_date
return u.user_id as user_id,th.thread_id as thread_id,
duration.inDays(datetime(),datetime(rp.post_date)).days*10 + 1000000 AS point
UNION ALL
MATCH (u:User {user_id:3})-[m:BEING_MENTIONED]->(th:Thread)<-[r:READ]-(u:User {user_id:3})
WHERE m.post_date> r.read_date
return u.user_id as user_id,th.thread_id as thread_id,
duration.inDays(datetime(),datetime(m.post_date)).days*10 + 1000000 AS point
UNION ALL
MATCH (u:User {user_id:3})-[m:BEING_MENTIONED]->(th:Thread)
WHERE NOT EXISTS ((u)-[:READ]->(th))
return u.user_id as user_id,th.thread_id as thread_id,
duration.inDays(datetime(),datetime(m.post_date)).days*10 + 1000000 AS point
MATCH (u:User)-[:FOLLOWED]->(t:Tag)<-[:BELONG_TO]->(th)
WHERE u.user_id = 3 AND NOT EXISTS((u)-[]->(th))
WITH u.user_id AS user_id, th.thread_id AS thread_id,
(0.5*th.like_count + 0.3*th.comment_count + 0.005*th.view_count
+ duration.inDays(datetime(),datetime(th.published_date)).days*100) AS point
ORDER BY point desc
RETURN DISTINCT user_id, thread_id, point
UNION
MATCH (u:User)-[:FOLLOWED]->(t:Tag)<-[:BELONG_TO]->(th)
WHERE u.user_id = 3 AND NOT EXISTS((u)-[]->(th))
AND NOT th.rating_total IS NULL
WITH u.user_id AS user_id, th.thread_id AS thread_id,
(duration.inDays(datetime(),datetime(th.published_date)).days*150 + 30*th.rating_total) AS point
ORDER BY point desc, th.published_date desc
RETURN DISTINCT user_id, thread_id, point
LIMIT 10
How can I apply the limit to the query as a whole?
Thanks for your help!
You need a subquery for this, so you should be using Neo4j 4.0 or later; CALL { ... } subqueries allow you to perform post-UNION processing.
Using UNION ALL inside the subquery, with the LIMIT 10 outside of it, should give you what you want.
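The difference between limiting the last branch and limiting the combined result can be illustrated outside the database. This is a plain-Ruby sketch, not Cypher: the three arrays and their point values are invented stand-ins for the rows each UNION ALL branch would return.

```ruby
# Hypothetical rows from three UNION ALL branches (values invented for illustration).
replied   = [{ thread_id: 1, point: 1_000_020 }, { thread_id: 2, point: 1_000_010 }]
mentioned = [{ thread_id: 3, point: 1_000_030 }]
followed  = (4..20).map { |id| { thread_id: id, point: id * 10 } }

# LIMIT written at the end of the query: only the last branch is cut to 10 rows.
limit_on_last_branch = replied + mentioned + followed.first(10)

# LIMIT outside the subquery: combine all branches first, then sort and cut once.
overall_limit = (replied + mentioned + followed)
                .sort_by { |row| -row[:point] }
                .first(10)

puts limit_on_last_branch.size
puts overall_limit.size
```

In the sketch the first variant still returns more than ten rows because the earlier branches are unbounded, which matches the behaviour described in the question.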

How to UNION tables and make results accessible in a Ruby view

I'm quite new to RoR and creating a student project for a course I'm taking. I'm wanting to construct a type of query we didn't cover in the course and which I know I could do in a snap in .NET and SQL. I'm having a heck of a time though getting it implemented the Ruby way.
What I'd like to do: Display a list on a user's page of all "posts" by that user's friends.
"Posts" are found in both a questions table and in a blurbs table that users contribute to. I'd like to UNION these two into a single recordset to sort by updated_at DESC.
The table column names are not the same however, and this is my sticking point since other successful answers I've seen have hinged on column names being the same between the two.
In SQL I'd write something like (emphasis on like):
SELECT b.Blurb AS 'UserPost', b.updated_at, u.username AS 'Author'
FROM Blurbs b
INNER JOIN Users u ON b.User_ID = u.ID
WHERE u.ID IN
(SELECT f.friend_id FROM Friendships f WHERE f.User_ID = [current user])
UNION
SELECT q.Question, q.updated_at, u.username
FROM Questions q
INNER JOIN Users u ON q.User_ID = u.ID
WHERE u.ID IN
(SELECT f.friend_id FROM Friendships f WHERE f.User_ID = [current user])
ORDER BY updated_at DESC
The User model's (applicable) relationships are:
has_many :friendships
has_many :friends, through: :friendships
has_many :questions
has_many :blurbs
And the Question and Blurb models both have belongs_to :user
In the view I'd like to display the contents of the 'UserPost' column and the 'Author'. I'm sure this is possible, I'm just too new still to ActiveRecord and how statements are formed. Happy to have some input or review any relevant links that speak to this specifically!
Final Solution
Hopefully this will assist others in the future with Ruby UNION questions. Thanks to @Plamena's input the final implementation ended up as:
def friend_posts
  sql = "...the UNION statement seen above..."
  ActiveRecord::Base.connection.select_all(ActiveRecord::Base.send("sanitize_sql_array", [sql, self.id, self.id]))
end
Currently Active Record lacks union support. You can use SQL:
sql = <<-SQL
  -- your SQL query goes here
  SELECT b.created_at ...
  UNION (
    SELECT q.created_at
    ....
  )
SQL
posts = ActiveRecord::Base.connection.select_all(sql)
Then you can iterate the result:
posts.each do |post|
  # post is a hash
  p post['created_at']
end
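The differing column names are handled by aliasing each source to a common shape before combining, which is what AS 'UserPost' does in the SQL. The same idea in plain Ruby, as a sketch only: the hashes below are invented sample rows standing in for blurb and question records, and normalize_posts is a hypothetical helper, not part of Active Record.

```ruby
# Invented sample rows; in the app these would come from the blurbs and questions tables.
blurbs = [
  { blurb: "Hello world", updated_at: Time.utc(2015, 1, 2, 10), username: "ann" }
]
questions = [
  { question: "Why Ruby?", updated_at: Time.utc(2015, 1, 3, 9), username: "bob" },
  { question: "What is a join?", updated_at: Time.utc(2015, 1, 1, 8), username: "ann" }
]

# Hypothetical helper: map both sources onto the same keys, then sort newest first.
def normalize_posts(blurbs, questions)
  from_blurbs = blurbs.map do |b|
    { user_post: b[:blurb], updated_at: b[:updated_at], author: b[:username] }
  end
  from_questions = questions.map do |q|
    { user_post: q[:question], updated_at: q[:updated_at], author: q[:username] }
  end
  (from_blurbs + from_questions).sort_by { |post| post[:updated_at] }.reverse
end

feed = normalize_posts(blurbs, questions)
feed.each { |post| puts "#{post[:author]}: #{post[:user_post]}" }
```

Merging in Ruby loads both result sets into memory, so for large tables the SQL UNION is likely the better fit; the sketch is mainly meant to show the column-aliasing step.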
The best way to do this is to use the power of Rails.
If you want all of something belonging to a user's friend:
current_user.friends.find(id_of_friend).questions
This would get all of the questions from a certain friend.
Now, it seems that you have writings in multiple places (this is hard to visualise without your providing a model of how writings are connected to everything else). Can you provide this?
@blurbs = Blurb.includes(:user)
@blurbs.each do |blurb|
  p blurb.blurb, blurb.user.username
end

postgresql get list of unique list with order by another table column

Tables:
#leads
id | user_id |created_at | updated_at
#users
id | first_name
#todos
id | deadline_at | target_id
I want to get a unique list of leads whose todos fall between two dates (deadline_at), ordered by todos.deadline_at desc.
I do:
SELECT distinct(leads.*), todos.deadline_at
FROM leads
INNER JOIN users ON users.id = leads.user_id
LEFT JOIN todos ON todos.target_id = leads.user_id
WHERE (todos.deadline_at between '2015-11-26T00:00:00+00:00' and '2015-11-26T23:59:59+00:00')
ORDER BY todos.deadline_at DESC;
This query returns a correctly ordered list, but with duplicates. If I use distinct or distinct on with leads.id, then PostgreSQL requires me to use it in the order by, and in that case I get a wrongly ordered list.
How can I achieve the expected result?
Since you don't really need the users table, maybe try this?
Lead.joins("INNER JOIN todos ON leads.user_id = todos.target_id")
  .where("todos.deadline_at" => (date_a..date_b))
  .select("leads.*, todos.deadline_at")
  .order("todos.deadline_at desc")
It seems that you're confusing the raw result of an SQL query with joins and the same result after ActiveRecord has processed it for an association.
I presume Lead has_many :todos, through: :user, so you can do this:
Lead.eager_load(:todos).
  where("todos.deadline_at" => (date_a..date_b)).
  order("todos.deadline_at")
No need to apply distinct or anything else: ActiveRecord will separate the leads from the todos, and you'll have them in the right order with no duplicates. The raw SQL result, however, will have plenty of duplicates.
If you want to achieve something similar in sql alone, you can use distinct or group by on leads.id, but then you'll lose all the todos it "contains". However you can use aggregate function to calculate/extract things on the "lost todo data".
For example :
Lead.joins(:todos).
  group("leads.id").
  select("leads.*, min(todos.deadline_at) as first_todo_deadline").
  order("first_todo_deadline")
Notice that todos data is only available in the aggregate functions (min, count, avg, etc) since the todos are "compressed" if you wish in each lead!
Hope it makes sense.
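What grouping by leads.id with min(...) does to the joined rows can be sketched in plain Ruby. The rows below are invented sample data for a leads-todos join; nothing here touches the database.

```ruby
require "date"

# Invented rows as a leads JOIN todos would return them: one row per (lead, todo) pair.
joined_rows = [
  { lead_id: 1, deadline_at: Date.new(2015, 11, 26) },
  { lead_id: 2, deadline_at: Date.new(2015, 11, 24) },
  { lead_id: 1, deadline_at: Date.new(2015, 11, 25) },
  { lead_id: 2, deadline_at: Date.new(2015, 11, 27) }
]

# Equivalent of group("leads.id") with min(todos.deadline_at): one row per lead.
first_deadlines = joined_rows
                  .group_by { |row| row[:lead_id] }
                  .map do |lead_id, rows|
                    { lead_id: lead_id, first_todo_deadline: rows.map { |r| r[:deadline_at] }.min }
                  end

# Equivalent of order("first_todo_deadline"): sort the collapsed rows by the aggregate.
ordered = first_deadlines.sort_by { |row| row[:first_todo_deadline] }

ordered.each { |row| puts "lead #{row[:lead_id]}: #{row[:first_todo_deadline]}" }
```

Each lead appears once because the individual todo rows have been folded into a single aggregate value, which is why only aggregates of the todos columns remain available afterwards.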

how to find me and my followers posts in neo4j

I have user, post and follows relations like below
user1-[:FOLLOWS]-user2-[:POSTED]-post
How can I get all posts which are made by my followers and myself in a cypher query?
Assuming you can uniquely identify yourself by an ID:
MATCH (me:User {Id: 1})<-[:FOLLOWS*0..1]-(follower)-[:POSTED]->(post)
RETURN post;
Rationale: in the case where the length of the :FOLLOWS relationship is 0, me == follower, so the query returns your posts as well.
You can find an example here: http://console.neo4j.org/?id=dexd4p
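The zero-or-one hop can be read as "me, plus everyone one FOLLOWS step away from me". A plain-Ruby sketch over invented in-memory data; the follows and posts lists are assumptions for illustration, not Neo4j driver calls.

```ruby
# Invented edges: [follower_id, followed_id] stands for (follower)-[:FOLLOWS]->(followed).
follows = [[2, 1], [3, 1], [1, 4]]

# Invented posts with their author ids.
posts = [
  { author: 1, text: "my own post" },
  { author: 2, text: "post by follower 2" },
  { author: 4, text: "post by someone I follow" }
]

me = 1

# Zero hops gives me; one hop gives every user with a FOLLOWS edge pointing at me.
authors = [me] + follows.select { |_follower, followed| followed == me }.map(&:first)

feed = posts.select { |post| authors.include?(post[:author]) }
feed.each { |post| puts post[:text] }
```

User 4 is excluded because the edge runs from me to them, mirroring the arrow direction in the Cypher pattern.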

Sequel -- How To Construct This Query?

I have a users table, which has a one-to-many relationship with a user_purchases table via the foreign key user_id. That is, each user can make many purchases (or may have none, in which case he will have no entries in the user_purchases table).
user_purchases has only one other field that is of interest here, which is purchase_date.
I am trying to write a Sequel ORM statement that will return a dataset with the following columns:
user_id
date of the users SECOND purchase, if it exists
So users who have not made at least 2 purchases will not appear in this dataset. What is the best way to write this Sequel statement?
Please note I am looking for a dataset with ALL users returned who have >= 2 purchases
Thanks!
EDIT FOR CLARITY
Here is a similar statement I wrote to get users and their first purchase date (as opposed to 2nd purchase date, which I am asking for help with in the current post):
DB[:users].join(:user_purchases, :user_id => :id)
.select{[:user_id, min(:purchase_date)]}
.group(:user_id)
You don't seem to be worried about the dates, just the counts so
DB[:user_purchases].group_and_count(:user_id).having(:count > 1).all
will return a list of user_ids and counts where the count (of purchases) is >= 2. Something like
[{:count=>2, :user_id=>1}, {:count=>7, :user_id=>2}, {:count=>2, :user_id=>3}, ...]
If you want to get the users with that, the easiest way with Sequel is probably to extract just the list of user_ids and feed that back into another query:
DB[:users].where(:id => DB[:user_purchases].group_and_count(:user_id).
having(:count > 1).all.map{|row| row[:user_id]}).all
Edit:
I felt like there should be a more succinct way and then I saw this answer (from Sequel author Jeremy Evans) to another question using select_group and select_more : https://stackoverflow.com/a/10886982/131226
This should do it without the subselect:
DB[:users].
left_join(:user_purchases, :user_id=>:id).
select_group(:id).
select_more{count(:purchase_date).as(:purchase_count)}.
having(:purchase_count > 1)
It generates this SQL
SELECT `id`, count(`purchase_date`) AS 'purchase_count'
FROM `users` LEFT JOIN `user_purchases`
ON (`user_purchases`.`user_id` = `users`.`id`)
GROUP BY `id` HAVING (`purchase_count` > 1)
Generally, this could be the SQL query that you need:
SELECT u.id, up1.purchase_date FROM users u
LEFT JOIN user_purchases up1 ON u.id = up1.user_id
LEFT JOIN user_purchases up2 ON u.id = up2.user_id AND up2.purchase_date < up1.purchase_date
GROUP BY u.id, up1.purchase_date
HAVING COUNT(up2.purchase_date) = 1;
Try converting that to Sequel if you don't get any better answers.
The date of the user's second purchase would be the second row retrieved if you do an order_by(:purchase_date) as part of your query.
To access that, do a limit(2) to constrain the query to two results then take the [-1] (or last) one. So, if you're not using models and are working with datasets only, and know the user_id you're interested in, your (untested) query would be:
DB[:user_purchases].where(:user_id => user_id).order_by(:user_purchases__purchase_date).limit(2).all[-1]
Here's some output from Sequel's console:
DB[:user_purchases].where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT * FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"
Add the appropriate select clause:
.select(:user_id, :purchase_date)
and you should be done:
DB[:user_purchases].select(:user_id, :purchase_date).where(:user_id => 1).order_by(:purchase_date).limit(2).sql
=> "SELECT user_id, purchase_date FROM user_purchases WHERE (user_id = 1) ORDER BY purchase_date LIMIT 2"
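The question asks for every user with at least two purchases, whereas the query above handles one user at a time. The per-user "order by date, take the second row" logic, applied across all users, looks like this in plain Ruby. It is a sketch over invented rows rather than a Sequel query, and second_purchase_dates is a hypothetical helper name.

```ruby
require "date"

# Invented user_purchases rows.
purchases = [
  { user_id: 1, purchase_date: Date.new(2012, 3, 1) },
  { user_id: 1, purchase_date: Date.new(2012, 1, 15) },
  { user_id: 2, purchase_date: Date.new(2012, 2, 2) },
  { user_id: 3, purchase_date: Date.new(2012, 5, 5) },
  { user_id: 3, purchase_date: Date.new(2012, 4, 4) },
  { user_id: 3, purchase_date: Date.new(2012, 6, 6) }
]

# Hypothetical helper: returns user_id => date of that user's second purchase.
# Users with fewer than two purchases are left out of the result.
def second_purchase_dates(rows)
  rows.group_by { |row| row[:user_id] }
      .transform_values { |user_rows| user_rows.map { |r| r[:purchase_date] }.sort[1] }
      .compact
end

second_purchase_dates(purchases).each do |user_id, date|
  puts "user #{user_id}: #{date}"
end
```

This pulls all purchase rows into memory, so it suits small tables or a quick check of what a database-side query should return.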
