Erlang scalability questions related to gen_server:call() - erlang

in erlang otp when making a gen_server:call(), you have to send in the name of the node, to which you are making the call.
Lets say I have this usecase:
I have two nodes: 'node1' and 'node2' running. I can use those nodes to make gen_server:call() to each other.
Now lets say I added 2 more nodes: 'node3' and 'node4' and pinged each other so that all nodes can see and make gen_server:calls to each other.
How do the erlang pros handle the dynamic adding of new nodes like that so that they know the new node names to enter into the gen_server calls, or is it a requirement to know beforehand the names of all the nodes so that they are hardcoded in somewhere like the sys.config?

you can use:
erlang:nodes()
to get a "now" view of the node list
Also, you can use:
net_kernel:monitor_nodes(true) to be notified as nodes come and go (via ping / crash / etc)
To see if your module is running on that node you can either call the gen_server with some kind of ping callback
That, or you can use the rpc module to call erlang:whereis(name) on the foreign node.

Related

Creating Stateful nodes in Neo4j

When I say Stateful Node, I mean a node that carries ‘state info,’ such as the path that leads to this node. E.g. R1 is a node, and
state1: link coming from path 1
state2: link coming from path 2
Is there any way I could create such a node in Neo4j? While traversing such a node, I expect it to behave like this:
if state 1, and input is x, then [:has] node1
if state one and input is y, then stop
if state two and input is z, then [: has] node 2.
I want to convert node R1 to a stateful node so that it keeps the information mentioned above. Does Neo4J support such nodes? If so, could you guide me to a resource? Also, does the cipher query support the ‘stateful’ approach so I can set the state according to the path from which R1 is produced?
In the Neo4j architecture, a relationship is a doubly linked-list that stores pointers to the start and end nodes.
It sounds like what you're looking to do is create nodes that store that same information for all relationships that touch it, and then have behavior based on how the graph reaches them.
This is more akin to logic control, and Cypher handles that through filters on relationship type, node labels, and properties.
However, you can always set properties of nodes based on queries. For example:
MATCH (:AUTH_T)-[:HAS]->(n:R1)
SET R1.reached_by = "HAS"
Then you could do something with that in the future, like if you want to know if node n was reached by another method.

How does the Minimum spanning tree in neo4j work

I am playing around with some graph theory algorithms in neo4j. I am trying to find the minimum spanning tree (mst) within my network. I synthetically created a network of 10 000 people. Each person has 12 relationship types each one linking him back to the other 9999 and each relationship with its own weight assigned.
The problem I have however is the fact that according to the definition the results must be a tree spanning over the ENTIRE network. The neo4j function however only returns a very small sub-graph (only about 12 nodes) of the entire network.
The code I am using looks like this:
MATCH (a:Name {Name:"Dillon Snow"})
CALL algo.mst(a,"Weight",{stats:true})
YIELD loadMillis, computeMillis, writeMillis, weightSum, weightMin, weightMax, relationshipCount
RETURN loadMillis, computeMillis, writeMillis, weightSum, weightMin, weightMax, relationshipCount
What can I change to get the function to return the mst spreading through the entire network
algo.mst.* has not been adapted to the matured Neo4j-Graph-Algorithms-CoreAPI in its current release (3.2.5.2/3.3.0.0 # Dec 2017) which might lead to unexpected results. But there is a pull request in the pipe, you can expect some changes in the next release.
Anyway.. The procedure should add a new relationship-type (default mst) to your nodes. In a connected graph each node should be connected as well while a disconnected graph leads to connections only between the nodes of this particular connected component (from your startNode).
If i understand you right you have multiple relationship types and more then one of them between a pair of nodes? E.g. Node A is connected to Node B with several relations, each of them with a different type and property value. This is a problem. In general the Graph-Algorithms-API does not support multible releationships. Each pair of nodes can only have one connection per direction. Although you can import multible types the core-api itself has no idea of the underlying type. If multible relationships between a pair of nodes get imported usualy the last one wins. This has been mentioned in the documentation ;)
To overcome this limitation you could replace your relationship types with some kind of artificial nodes. When traversing over the result tree the occurence of one of those nodes would indicate the original relationship.

Chord Join DHT - join protocol for second node

I have a distributed hash table (DHT) which is running on multiple instances of the same program, either on multiple machines or for testing on different ports on the same machine. These instances are started after each other. First, the base node is started, then the other nodes join it.
I am a little bit troubled how I should implement the join of the second node, in a way that it works with all the other nodes as well (all have of course the same program) without defining all border cases.
For a node to join, it sends a join message first, which gets passed to the correct node (here it's just the base node) and then answered with a notify message.
With these two messages the predecessor of the base node and the successor of the existing nodes get set. But how does the other property get set? I know, that occasionally the nodes send a stabilise message to their successor, which compares it to its predecessor and returns it with a notify message and the predecessor in case it varies from the sender of the message.
Now, the base node can't send a message, because it doesn't know its successor, the new node can send one, but the predecessor is already valid.
I am guessing, both properties should point to the other node in the end, to be fully joined.
Here another diagram what I think should be the sequence i the third node joins. But again, when do I update the properties based on a stabilise message, when do I send a notify message back? In the diagram it is easy to see, but in code it is hard to decide.
Th trick is here to set the successor to the same value as the predecessor if it is NULL after the join-message has been received. Everything else gets handled nicely by the rest of the protocol.

data model for notification in social network?

I build a social network with Neo4j, it includes:
Node labels: User, Post, Comment, Page, Group
Relationships: LIKE, WRITE, HAS, JOIN, FOLLOW,...
It is like Facebook.
example: A user follow B user: when B have a action such as like post, comment, follow another user, follow page, join group, etc. so that action will be sent to A. Similar, C, D, E users that follow B will receive the same notification.
I don't know how to design the data model for this problem and I have some solutions:
create Notification nodes for every user. If a action is executed, create n notification for n follower. Benefit: we can check that this user have seen notification, right? But, number of nodes quickly increase, power of n.
create a query for every call API notification (for client application), this query only get a action list of users are followed in special time (24 hours or a 2, 3 days). But Followers don't check this notification seen or yet, and this query may make server slowly.
create node with limited quantity such as 20, 30 nodes per user.
Create unlimited nodes (include time of action) on 24 hours and those nodes has time of action property > 24 hours will be deleted (expire time maybe is 2, 3 days).
Who can help me solve this problem? I should chose which solution or a new way?
I believe that the best approach is the option 1. As you said, you will be able to know if the follower has read or not the notification. About the number of notification nodes by follower: this problem is called "supernodes" or "dense nodes" - nodes that have too many connections.
The book Learning Neo4j (by Rik Van Bruggen, available for download in the Neo4j's web site) talk about "Dense node" or "Supernode" and says:
"[supernodes] becomes a real problem for graph traversals because the graph
database management system will have to evaluate all of the connected
relationships to that node in order to determine what the next step
will be in the graph traversal."
The book proposes a solution that consists in add meta nodes between the follower and the notification (in your case). This meta node should got at most a hundred of connections. If the current meta node reaches 100 connections a new meta node must be created and added to the hierarchy, according to the example of figure, showing a example with popular artists and your fans:
I think you do not worry about it right now. If in the future your followers node becomes a problem then you will be able to refactor your database schema. But at now keep things simple!
In the series of posts called "Building a Twitter clone with Neo4j" Max de Marzi describes the process of building the model. Maybe it can help you to make best decisions about your model!

Mnesia: reading remote node data in {local_content, true} mode

Is there a way to do local writes and and global reads ( without replication ) using mnesia. Eg: node A writes to its local DB and node B reads from node A's DB.
Node B does not have any data of its own, apart from the schema information stored locally.
According to the documentation, {local_content, true} seems like what I need to use, but I have been unsuccessful trying to get node B to read node A's data.
My schema and table configuration look like this:
On nodeA#ip1:
net_adm:ping('nodeB#ip2').
rd(user, {name, nick}).
mnesia:create_schema([node()|nodes()]).
mnesia:start().
mnesia:create_table(user, [ {local_content, true},
{disc_copies, [node()]},
{attributes,record_info(fields, user) }]).
%% insert data and list rows on nodeA
%% WORKS
On nodeB#ip2:
mnesia:start().
%% code to list rows from user table on nodeA
%% throws an ERROR saying table does not exist.
Is the configuration wrong or can this be done in any other way?
I don't think you can do it the way you mention. Another way of doing it would probably be to make an rpc call to node A and get the data that way. There is no point in using mnesia to do the read from node B because it will essentially just do an RPC anyway.
So node B should be:
rpc:call(nodeA#ip1, mnesia, read, ....).
Hope this is what you [somewhat] needs.
EDIT:
Oh and I forgot to mention that you don't need the schema on both nodes for this to work. This is assuming that Node B doesn't really care about sharing any other data with Node A it just reads it; In other words just keep all the mnesia stuff on Node A and just do RPC calls from Node B.
You may need to add node B to the schema's extra_db_nodes config thingy. You shouldn't have to do that if it's a disc based db, but in RAM it's mandatory to make it do what you want.
Not sure about the specifics, i may be confusing where to put stuff. I'm not too experienced with mnesia, but the docs indicate that you should do this.

Resources