Sizing nodes according to input weighting not connectivity - gephi

I am trying to use Gephi to help graph interview analysis results. The relationship map is only used to describe conventional connections and life cycles. What I would like to do is to size the nodes based on the number of interview responses that talk about the node, not the number of connections it has or the weighting of those connections. Can Gephi do this and if so, how do I do it please?
I have loaded node weightings and can see them in the node labels, but I haven't found a way to make them affect node size.
Many thanks

Data input field - change input format to integer

You can load the graph in GEXF format, declaring a float attribute and setting it on ALL the nodes. It would look something like:
```
...
...
```
Once the graph is imported into Gephi, go to the Appearance tab; the attribute will appear as one more entry in the "Ranking" drop-down list, where it can be used to set node size.
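For anyone who prefers to generate the file, here is a minimal sketch in Python that writes a GEXF 1.2 document with one float attribute on every node. The attribute title `responses`, the node names, and the counts are assumptions made up for illustration; only the standard library is used.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"

def build_gexf(node_weights, edges, attr_title="responses"):
    """Return GEXF 1.2 text with one float attribute set on every node."""
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(root, "graph", {"defaultedgetype": "undirected"})

    # Declare the node attribute once; each node refers to it by id "0".
    attrs = ET.SubElement(graph, "attributes", {"class": "node"})
    ET.SubElement(attrs, "attribute",
                  {"id": "0", "title": attr_title, "type": "float"})

    nodes = ET.SubElement(graph, "nodes")
    for node_id, weight in node_weights.items():
        node = ET.SubElement(nodes, "node", {"id": node_id, "label": node_id})
        attvalues = ET.SubElement(node, "attvalues")
        ET.SubElement(attvalues, "attvalue",
                      {"for": "0", "value": str(float(weight))})

    edges_el = ET.SubElement(graph, "edges")
    for i, (source, target) in enumerate(edges):
        ET.SubElement(edges_el, "edge",
                      {"id": str(i), "source": source, "target": target})

    return ET.tostring(root, encoding="unicode")

# Hypothetical interview counts per node (not from the original question).
node_weights = {"Design": 12, "Build": 7, "Operate": 3}
edges = [("Design", "Build"), ("Build", "Operate")]
gexf_text = build_gexf(node_weights, edges)
```

Saving `gexf_text` to a file with a `.gexf` extension should give something Gephi can open, with `responses` available as a numeric node attribute.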
If you have any problem with the GEXF format, let me know and I'll share a whole example (just trying to remain short :-)
Regards

Related

pre-filter large network in cytoscape

I have a large network of ~300K nodes that my machine has a hard time plotting with Cytoscape (Desktop version under Windows).
I know that the network has discrete groups that are not interconnected - I also have the id of those groups as a node attribute.
I want to be able to graph each group based on what id I select.
I tried achieving this with the filter (Cytoscape gave me the option to not plot the graph when opening it the first time - "Do you want to create a view for your large network now?") but it still seems to try to plot the entire graph when setting the filter and then clicking on "Create View".
So in short: is there any way to "pre-filter" the graph, or to otherwise cut it up, so that Cytoscape will plot only the group I want?
Any thoughts would be appreciated.
You are almost there. Once you have the filter set, then you need to create a subnetwork (see File > New Network), then you can create a view of that subset (or it will automatically be created if the node count is below the threshold).

NEO4J How to make graph with relationships

I am completely new to Neo4j and am using it for the first time for my master's program. I've read the documentation and watched tutorials online but can't seem to figure out how to represent my nodes in the way I want.
I have a dataframe with 3 columns, the first represents a page name, the second also represents a page name, and the third represents a similarity score between those two pages. How can I create a graph in NEO4J where the nodes are my unique page names and the relationships between nodes are drawn if there is a similarity score between them (so if the sim-score is 0 they don’t draw a relationship)? I want to show the similarity score as the text of the relationship.
Furthermore, I want to know if there is an easy way to figure out which node had the most relationships to other nodes?
I’ve added a screenshot of the header of my DF for clarity https://imgur.com/a/pg0knh6. I hope anyone can help me, thanks in advance!
Edit: What I have tried
LOAD CSV WITH HEADERS FROM 'file:///wiki-small.csv' AS line
MERGE (p:Page {name: line.First})
MERGE (p2:Page {name: line.Second})
MERGE (p)-[r:SIMILAR]->(p2)
ON CREATE SET r.similarity = toFloat(line.Sim)
Next block, to remove the SIMILAR relationships whose similarity is 0:
MATCH ()-[r:SIMILAR]->() WHERE r.similarity = 0
DELETE r
This works partially: it gives me the correct structure of the nodes but doesn't show the similarity scores as relationship labels. I also still need to figure out how to find the node with the most connections.
For the first question:
How can I create a graph in NEO4J where the nodes are my unique page names and the relationships between nodes are drawn if there is a similarity score between them (so if the sim-score is 0 they don’t draw a relationship)?
I think a better approach is to remove the rows with similarity = 0.0 before ingesting them into Neo4j. Would that be feasible? If your dataset is not too big, this is very fast to do in Python. Otherwise, the solution you describe, deleting the relationships after inserting the data, is also an option.
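A minimal sketch of that pre-filtering step, assuming the CSV has the `First`, `Second` and `Sim` columns used in the `LOAD CSV` statement from the question. The sample rows are made up, and in-memory streams stand in for real files:

```python
import csv
import io

def drop_zero_similarity(src, dst, score_column="Sim"):
    """Copy CSV rows from src to dst, skipping rows whose score is 0."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        # Scores arrive as text, so convert before comparing with zero.
        if float(row[score_column]) != 0.0:
            writer.writerow(row)
            kept += 1
    return kept

# Hypothetical sample data; real use would pass open file objects instead.
sample = io.StringIO("First,Second,Sim\nA,B,0.8\nA,C,0\nB,C,0.25\n")
filtered = io.StringIO()
kept_rows = drop_zero_similarity(sample, filtered)
```

The filtered file can then be loaded with the same `LOAD CSV` statement, and the separate `DELETE` step is no longer needed.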
In case of a big dataset, maybe it's better if you load the data using apoc.periodic.iterate or USING PERIODIC COMMIT.
Second question
I want to know if there is an easy way to figure out which node had the most relationships to other nodes?
This is an easy query. Again, you can do it with plain Cypher or using the APOC library:
// Plain Cypher
MATCH (n:Page)-[r:SIMILAR]->()
RETURN n.name, count(*) AS cnt
ORDER BY cnt DESC
// APOC
MATCH (n:Page)
RETURN n.name, apoc.node.degree(n, "SIMILAR>") AS output
ORDER BY output DESC;
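As a cross-check outside Neo4j, the same outgoing count can be sketched in Python straight from the CSV. This assumes the `First`, `Second` and `Sim` columns from the question and that each pair of pages appears only once; the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter

def outgoing_counts(src, source_column="First", score_column="Sim"):
    """Count non-zero-similarity rows per source page."""
    counts = Counter()
    for row in csv.DictReader(src):
        if float(row[score_column]) != 0.0:
            counts[row[source_column]] += 1
    return counts

# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO(
    "First,Second,Sim\nA,B,0.8\nA,C,0.4\nB,C,0.25\nC,A,0\n"
)
counts = outgoing_counts(sample)
top_page, top_count = counts.most_common(1)[0]
```

Like the plain Cypher query, this counts outgoing relationships only; counting both directions would mean incrementing the `Second` column as well.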
EDIT
To display the similarity scores in Neo4j Desktop or in the other web interfaces, you can simply: click on a SIMILAR arrow --> at the top of the result frame the labels and relationship types are shown, click on the SIMILAR marker --> at the bottom of the result frame, to the right of Caption, select the property that you want to show (similarity in your case).
Then all the arrows are displayed with the similarity score
To the second question: I think you should keep a clear separation between the way you store data and the way you visualize it. Showing the similarity score (a property of the SIMILAR edge) as a "label" is best handled by an adequate visualization library or platform. Ours (Graphileon) could be such a platform, although there are others.
We offer the possibility to "style" the edges with so-called selectors like
"label":"(%).property.simScore" that would use the simScore as a label. On top of that you could do things like
"width":"evaluate((%).properties.simScore < 0.500 ? 3 : 10)"
or
"fillColor":"evaluate((%).properties.simScore < 0.500 ? grey : red)"
to distinguish visually high simScores.
Full disclosure: I work for Graphileon.

Cytoscape Passthrough mapping

I have a question regarding passthrough mapping on Cytoscape.
Let us say my nodes belong to discrete groups. Those groups appear in a separate column that I call Cat. How can I set the node fill colour according to Cat? I know I can do it with discrete mapping, choosing the Cat colours individually, but what if I have loads of Cats? When I choose Passthrough mapping, which I do not understand, nothing happens.
Thanks for your help.
Best,
David R.
Passthrough mapping does sort of what it sounds like: it maps the value in the column cell directly to the visual attribute. In this case, you are mapping whatever is in Cat to a color. Now, if you have a value of "Group 1" and Cytoscape tries to map it to a color, you wind up with ... nothing, since Cytoscape doesn't know how to turn that text into a color. My suggestion would be to use discrete mapping, but let Cytoscape choose the colors. If you right-click on "Discrete Mapping" and then go to "Mapping Value Generators", you'll see a number of options for automatically assigning colors.
-- scooter
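If you do want to use passthrough mapping, one possible workaround is to add a column that already contains color values. This rests on the assumption that passthrough mapping accepts hex color strings such as "#FF0000" for fill color, which is worth checking against the Cytoscape manual. The sketch below only shows how such a column could be generated in Python before import; the function name and sample categories are made up.

```python
import colorsys

def category_colors(categories):
    """Map each distinct category to a '#RRGGBB' string with evenly spaced hues."""
    unique = sorted(set(categories))
    colors = {}
    for i, cat in enumerate(unique):
        # Spread hues around the color wheel; saturation and value are fixed.
        r, g, b = colorsys.hsv_to_rgb(i / len(unique), 0.65, 0.9)
        colors[cat] = "#{:02X}{:02X}{:02X}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    return colors

# Hypothetical Cat column values, one per node.
cats = ["Group 1", "Group 2", "Group 1", "Group 3"]
palette = category_colors(cats)
node_colors = [palette[c] for c in cats]  # values for a new color column
```

With many categories the hues end up close together, so the built-in mapping value generators are probably still the simpler route.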

Integrate multiple same structure datasets in one database

I have 8 different datasets with the same structure. I am using Neo4j and need to query all of them at different points on the website I am developing. What would be the approaches at storing the datasets in one database?
One idea that comes to my mind is to supply for each node an additional property that would distinguish nodes of one dataset from nodes of the other ones. But that seems too repetitive and wrong for me. The other idea is just to create 8 databases and query them separately but how could I do that? Running each one in its own port seems crazy.
Any suggestions would be greatly appreciated.
If your datasets are in a tree structure, you could add a different root node to each of them that you could use for reference, similar to GraphAware TimeTree. Another option (better than a property, I think) would be to differentiate each dataset by adding a specific label to the nodes from that dataset (e.g. all nodes from "dataset A" get a :DataSetA label).
I imagine that the specific structure of your dataset may yield other options. For example, if you always begin traversals of the dataset from a few set locations, you only need to be able to determine which dataset the entry points are a part of, because once entered, all traversals would stay within the same dataset, if that makes sense.
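A small sketch of how the label-per-dataset idea could look from application code. It assumes labels have to be written into the query text rather than passed as ordinary query parameters (which, as far as I know, holds for most Cypher versions), so the label is validated first. The `Person` label and the helper name are hypothetical; `DataSetA` is the example label from above.

```python
import re

# Accept only simple identifiers so a label cannot inject extra Cypher.
_LABEL_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def match_in_dataset(base_label, dataset_label):
    """Build a Cypher MATCH restricted to one dataset via an extra label."""
    for label in (base_label, dataset_label):
        if not _LABEL_RE.match(label):
            raise ValueError(f"unsafe label: {label!r}")
    return f"MATCH (n:{base_label}:{dataset_label}) RETURN n"

query = match_in_dataset("Person", "DataSetA")
```

The resulting string can be passed to whichever Neo4j driver the website uses; property values should still go through query parameters as usual.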

Change node color based on properties - neo4j

I want to change the color of my nodes based on their properties:
Say I have many "Person" nodes, and I want those who live in New York to be red and those who live in Los Angeles to be blue. How would I write that in Cypher or in py2neo?
The styling of nodes and relationships in Neo4j Browser is controlled by a graph style sheet (GRASS), a cousin of CSS. You can view the current style by typing :style in the browser. To edit it, you can click on nodes and relationships and pick colors and sizes, or you can view the style sheet (:style), download it, make changes, and drag-n-drop it back into the view window.
Unfortunately for your case, color can only be controlled a) for all nodes and all relationships or b) for nodes by label and relationships by type. Properties can only be used for the text displayed on the node/rel.
It is not possible to interact with the Neo4j Browser programmatically, but the end goal can be achieved through a hack.
Even though I am a bit late here, I want to help others who might be looking for a way. It is not possible to change the color of nodes based on a property, but the same effect can be achieved by creating labels from the property. Keep in mind that after applying these queries your data won't be the same, so it is always a good idea to keep a backup of your data.
This is how labels are colored by default (Before):
Color based on the property
Suppose there is a label called Case with a property nationality, and you want to color the nodes based on nationality. The following query creates labels out of the nationality property. It requires the APOC library to be installed.
// BY NATIONALITY
MATCH (n:Case)
WITH DISTINCT n.nationality AS nationality, collect(DISTINCT n) AS persons
CALL apoc.create.addLabels(persons, [apoc.text.upperCamelCase(nationality)]) YIELD node
RETURN *
This will return all the people by nationality. Now you can color by country of nationality. Below shows an example.
Color based on the property and load with other labels
Let's say you also have a label called Cluster. The cases are attached to clusters via relationships. Just change the query to the following to get the clusters with their relationships to cases.
//BY NATIONALITY WITH CLUSTERS
MATCH (n:Case),(c:Cluster)
WITH DISTINCT n.nationality AS nationality,
collect(DISTINCT n) AS persons,
collect(DISTINCT c) AS clusters
CALL apoc.create.addLabels(persons, [apoc.text.upperCamelCase(nationality)]) YIELD node
RETURN *
It will return cases and clusters with all the relationships. An example is shown below.
You cannot include formatting of the output in Cypher queries in the neo4j browser. Currently, the only way is to change the graph view manually or load a graph style file.
See tutorial here: http://neo4j.com/developer/guide-neo4j-browser/
Also, you cannot interact with the neo4j browser from py2neo.
If you are happy setting the color through a graphical user interface rather than programmatically, Neo4j also supplies a data exploration add-on named Bloom. With this add-on (now installed automatically with Neo4j Desktop), it is possible to set node color based on node properties.
In the example below, movies released after 2002 are colored green.

Resources