Using Gephi 0.9.1, I want to colour my edges based on an attribute similar to how it is shown in the following image (note that this image relates to Gephi 0.8.2, but conceptually it is the same task).
The attribute that I would like to use has been imported in the spreadsheet e.g. t1 or t2 in the below image.
However, in the relevant field I only get the option of weight.
How can I use an attribute from a column in the data laboratory?
The attributes t1, t2, etc. are node attributes, not edge attributes.
Weight is an edge attribute. In the Data Laboratory table, switch to Edges to see the attributes that are available for edge colouring.
I am completely new to Neo4j and am using it for the first time for my master's program. I've read the documentation and watched tutorials online but can’t seem to figure out how to represent my nodes in the way I want.
I have a dataframe with 3 columns, the first represents a page name, the second also represents a page name, and the third represents a similarity score between those two pages. How can I create a graph in NEO4J where the nodes are my unique page names and the relationships between nodes are drawn if there is a similarity score between them (so if the sim-score is 0 they don’t draw a relationship)? I want to show the similarity score as the text of the relationship.
Furthermore, I want to know if there is an easy way to figure out which node had the most relationships to other nodes?
I’ve added a screenshot of the header of my DF for clarity https://imgur.com/a/pg0knh6. I hope anyone can help me, thanks in advance!
Edit: What I have tried
LOAD CSV WITH HEADERS FROM 'file:///wiki-small.csv' AS line
MERGE (p:Page {name: line.First})
MERGE (p2:Page {name: line.Second})
MERGE (p)-[r:SIMILAR]->(p2)
ON CREATE SET r.similarity = toFloat(line.Sim)
Next block, to remove the SIMILAR relationships whose similarity is 0:
MATCH ()-[r:SIMILAR]->() WHERE r.similarity = 0
DELETE r
This works partially: it gives me the correct structure of the nodes but doesn't show the similarity scores as relationship labels. I also still need to figure out how to find the node with the most connections.
For the first question:
How can I create a graph in NEO4J where the nodes are my unique page names and the relationships between nodes are drawn if there is a similarity score between them (so if the sim-score is 0 they don’t draw a relationship)?
I think a better approach is to remove the rows with similarity = 0.0 before ingesting them into Neo4j. Would that be feasible? If your dataset is not too big, this is very fast to do in Python. Otherwise, the solution you provide of deleting the relationships after inserting the data is an option.
In case of a big dataset, maybe it's better if you load the data using apoc.periodic.iterate or USING PERIODIC COMMIT.
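As a rough illustration of the pre-filtering idea, here is a minimal sketch using only the Python standard library. It assumes the CSV has the headers First, Second and Sim used in the LOAD CSV statement from the question; the function name drop_zero_similarity is made up for this example.

```python
import csv
import io


def drop_zero_similarity(src, dst, sim_column="Sim"):
    """Copy CSV rows from src to dst, skipping rows whose similarity is 0.

    src and dst are text file objects. The column name "Sim" is an
    assumption taken from the LOAD CSV statement in the question.
    Returns the number of rows that were kept.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if float(row[sim_column]) != 0.0:
            writer.writerow(row)
            kept += 1
    return kept


if __name__ == "__main__":
    # Tiny in-memory sample instead of the real wiki-small.csv file.
    sample = io.StringIO("First,Second,Sim\nA,B,0.8\nA,C,0\nB,C,0.25\n")
    cleaned = io.StringIO()
    drop_zero_similarity(sample, cleaned)
    print(cleaned.getvalue())
```

With real files you would pass two handles opened with open(..., newline="") and then point LOAD CSV at the cleaned file, so the DELETE step is no longer needed.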
Second question
I want to know if there is an easy way to figure out which node had the most relationships to other nodes?
This is an easy query. Again, you can do it with plain Cypher or using the APOC library:
// Plain Cypher
MATCH (n:Page)-[r:SIMILAR]->()
RETURN n.name, count(*) AS cnt
ORDER BY cnt DESC
// APOC
MATCH (n:Page)
RETURN n.name, apoc.node.degree(n, "SIMILAR>") AS output
ORDER BY output DESC;
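If you want to cross-check the result against the original dataframe, the same count can be sketched in plain Python. This is only a sanity check outside Neo4j, assuming the edges are available as (first, second, similarity) tuples; the function name out_degree_ranking is hypothetical.

```python
from collections import Counter


def out_degree_ranking(edges):
    """Rank pages by their number of outgoing non-zero similarity edges.

    edges: iterable of (first, second, similarity) tuples.
    Returns a list of (page, count) pairs, highest count first.
    """
    counts = Counter(first for first, _second, sim in edges if sim != 0)
    return counts.most_common()


if __name__ == "__main__":
    sample = [("A", "B", 0.8), ("A", "C", 0.3), ("B", "C", 0.0), ("C", "A", 0.5)]
    print(out_degree_ranking(sample))
```

Like the Cypher queries above, this counts only outgoing relationships; counting both directions would need each edge to be tallied for its second page as well.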
EDIT
To display the similarity scores in Neo4j Desktop or in the other web interfaces: click on a SIMILAR arrow, and the relationship types are shown at the top of the running cell. Click on the SIMILAR marker there. Then, at the bottom of the running cell, to the right of Caption, select the property that you want to show (similarity in your case).
All the arrows are then displayed with the similarity score.
To the second question: I think you should keep a clear separation between the way you store data and the way you visualize it. Showing the similarity score (a property of the SIMILAR edge) as a "label" is best dealt with by using a suitable viz library or platform. Ours (Graphileon) could be such a platform, although there are also others.
We offer the possibility to "style" the edges with so-called selectors like
"label":"(%).property.simScore" that would use the simScore as a label. On top of that you could do things like
"width":"evaluate((%).properties.simScore < 0.500 ? 3 : 10)"
or
"fillColor":"evaluate((%).properties.simScore < 0.500 ? grey : red)"
to distinguish visually high simScores.
Full disclosure : I work for Graphileon.
I'm trying to create a model which allows users to navigate between positions (nodes) using various techniques (edges). Basically to traverse the positions graph using their own specific edges, which are unique and available just for them.
I want every user to be able to create their own edges (techniques) between nodes (positions). I've considered giving all technique edges the same name/type, something like "LEADS_TO", while their properties differ (name, description and, most importantly, a reference to the user who is allowed to use the edge, basically the creator of that edge).
This means that during graph traversal, I'll have to keep only the edges whose createdBy property matches the userId.
Also, this model expects that if there will be 1000 users using the app, there will likely be 1000 unique edges (techniques) between 2 nodes (positions).
Would this be correct approach or is my graph thinking/understanding conceptually wrong? Thanks!
There are three ways to do what you want:
- an edge with a property user_id that is a string. As you said, you will have multiple edges between your nodes pos1 and pos2 (one for each user)
- an edge with a property user_id that is an array of strings. You will have one edge between your nodes pos1 and pos2, but the size of the array will match the number of users
- prefix each edge's type with the user_id: USER_2_LEADS_TO
The choice depends on the type of your queries and also on the data volume, i.e. the average number of relationships you will have between your nodes pos1 and pos2.
As a first approach, your choice is good.
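To make the first option concrete, here is a minimal in-memory sketch of a traversal that follows only the edges created by one user. It is not Neo4j code: the edge layout (source, target, properties) and the function name reachable_positions are assumptions for illustration, and createdBy is the property name from the question.

```python
from collections import deque


def reachable_positions(edges, start, user_id):
    """Breadth-first search that follows only edges created by user_id.

    edges: iterable of (source, target, properties) tuples, where
    properties is a dict holding a "createdBy" key (hypothetical layout).
    Returns the positions in the order they are first reached.
    """
    # Build an adjacency list restricted to this user's edges.
    adjacency = {}
    for source, target, props in edges:
        if props.get("createdBy") == user_id:
            adjacency.setdefault(source, []).append(target)

    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in adjacency.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    sample = [
        ("guard", "mount", {"createdBy": "u1", "name": "sweep"}),
        ("guard", "back", {"createdBy": "u2", "name": "arm drag"}),
        ("mount", "armbar", {"createdBy": "u1", "name": "spin"}),
    ]
    print(reachable_positions(sample, "guard", "u1"))
```

The filter runs once per edge here; in a database the same per-edge property check happens during traversal, which is why the number of parallel edges between two positions matters for the choice above.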
Cheers
I want to change the color of my nodes based on their properties:
Say I have many "Person" nodes, and I want those who live in New York to be red and those who live in Los Angeles to be blue. How would I write that, in Cypher or in py2neo?
The styling of nodes and relationships in Neo4j Browser is controlled by a graph style sheet (GRASS), a cousin of CSS. You can view the current style by typing :style in the browser. To edit it, you can click on nodes and relationships and pick colors and sizes, or you can view the style sheet (:style), download it, make changes, and drag-n-drop it back into the view window.
Unfortunately for your case, color can only be controlled a) for all nodes and all relationships or b) for nodes by label and relationships by type. Properties can only be used for the text displayed on the node/rel.
It is not possible to interact with Neo4j Browser programmatically, but the end goal can be achieved through a hack.
Even though I am a bit late here, I want to help others who might be looking for a way. It is not possible to change the color of nodes based on a property, but the same effect can be achieved by adding labels derived from the property. Keep in mind that after applying these queries your data won't be the same, so it is always a good idea to keep a backup of your data.
This is how labels are colored by default (Before):
Color based on the property
Suppose there is a label called Case with a property nationality, and you want to color the nodes based on nationality. The following query can be used to create labels out of the nationality property. For this you will need to install the APOC library.
// BY NATIONALITY
MATCH (n:Case)
WITH DISTINCT n.nationality AS nationality, collect(DISTINCT n) AS persons
CALL apoc.create.addLabels(persons, [apoc.text.upperCamelCase(nationality)]) YIELD node
RETURN *
This will return all the people by nationality. Now you can color by country of nationality. Below shows an example.
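If APOC is not available, a similar effect can be sketched by generating one plain Cypher statement per property value from Python, since a label cannot be passed as a query parameter. The sketch below uses the Person/city example from the question; the helper names to_label and label_statement are made up, and the base label and property name are assumed to be trusted identifiers because they are interpolated into the query text.

```python
import re


def to_label(value):
    """Turn a property value such as "New York" into a label-safe name such as "NewYork"."""
    words = re.findall(r"[A-Za-z0-9]+", value)
    label = "".join(word[:1].upper() + word[1:] for word in words)
    if not label or label[0].isdigit():
        raise ValueError(f"cannot derive a label from {value!r}")
    return label


def label_statement(base_label, prop, value):
    """Build a Cypher statement plus parameters that adds a label to matching nodes.

    base_label and prop are inserted into the query text as-is, so they
    must come from trusted code, never from user input.
    """
    query = f"MATCH (n:{base_label}) WHERE n.{prop} = $value SET n:{to_label(value)}"
    return query, {"value": value}


if __name__ == "__main__":
    for city in ["New York", "Los Angeles"]:
        print(label_statement("Person", "city", city))
```

Each (query, parameters) pair can then be run through whatever driver you use, after which the new labels can be given their own colors in the browser style. As with the APOC query, this changes the stored data.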
Color based on the property and load with other labels
Let's say you also have a label called Cluster. The cases are attached to clusters via relationships. Just change the query to the following to get the clusters with their relationships to cases.
//BY NATIONALITY WITH CLUSTERS
MATCH (n:Case),(c:Cluster)
WITH DISTINCT n.nationality AS nationality,
collect(DISTINCT n) AS persons,
collect(DISTINCT c) AS clusters
CALL apoc.create.addLabels(persons, [apoc.text.upperCamelCase(nationality)]) YIELD node
RETURN *
It will return cases and clusters with all the relationships. Below shows example.
Please leave an up vote if this was helpful and want to let others know that this is an acceptable answer. Thank you.
You cannot include formatting of the output in Cypher queries in the neo4j browser. Currently, the only way is to change the graph view manually or load a graph style file.
See tutorial here: http://neo4j.com/developer/guide-neo4j-browser/
Also, you cannot interact with the neo4j browser from py2neo.
If you are happy setting the color through a graphical user interface rather than programmatically, Neo4j also supplies a data exploration add-on named Bloom. When using this add-on (now automatically installed with Neo4j Desktop), it is possible to set a node's color based on its properties.
In the example below, movies released after 2002 are colored green.
Assuming we have 2 document types: TagGroup [DisplayName] and TagGroupItem [DisplayName] with TagGroupItems being children of TagGroup. That said, assume we have the following data:
Color
- Red
- Green
- Blue
Finish
- Aluminum
- Plastic
Color and Finish are both TagGroups. What kind of data type would allow another item to be associated with one or more tag group items? That is, an item could be Color-Red and Finish-Aluminum, or just Color-Red. Aside from manually creating a drop-down for each tag group and associating it with an item, how can this be streamlined?
You may try to do this with Multi-Node Tree Picker of the great uComponents package.
Create a datatype based on Multi-node tree picker, configure it to allow only TagGroupItems to be selected (using XPathFilter).
Every document type which needs to be associated with x TagGroupItems then simply needs one property using this datatype.
This of course would allow choosing more than one TagGroupItem from the same TagGroup (for example Red and Green). If you'd like to enforce having at most one TagGroupItem linked per TagGroup, you could define a datatype for each TagGroup, limited to its TagGroupItems and with Maximum node selection set to 1.
I need to use IBM Informix for my project, where I have point coordinates and need to find which points lie within a rectangular query region.
Informix has the Spatial DataBlade module with ST_POINT and ST_POLYGON data objects.
I know how to create tables with such objects, insert into them, and create an R-tree index on them.
The problem is how to write a SELECT statement that lists all the points in a particular rectangular region.
You've got the Spatial Datablade documentation at your fingertips? It is available in the IDS 11.50 Info Centre.
For example, the section in Chapter 1 discusses performing spatial queries:
Performing Spatial Queries
A common task in a GIS application is to retrieve the visible subset of spatial data for display in a window. The easiest way to do this is to define a polygon representing the boundary of the window and then use the SE_EnvelopesIntersect() function to find all spatial objects that overlap this window:
SELECT name, type, zone FROM sensitive_areas
WHERE SE_EnvelopesIntersect(zone,
ST_PolyFromText('polygon((20000 20000,60000 20000,60000 60000,20000 60000,20000 20000))', 5));
Queries can also use spatial columns in the SQL WHERE clause to qualify the result set; the spatial column need not be in the result set at all. For example, the following SQL statement retrieves each sensitive area with its nearby hazardous waste site if the sensitive area is within five miles of a hazardous site. The ST_Buffer() function generates a circular polygon representing the five-mile radius around each hazardous location. The ST_Polygon geometry returned by the ST_Buffer() function becomes the argument of the ST_Overlaps() function, which returns t (TRUE) if the zone ST_Polygon of the sensitive_areas table overlaps the ST_Polygon generated by the ST_Buffer() function:
SELECT sa.name sensitive_area, hs.name hazardous_site
FROM sensitive_areas sa, hazardous_sites hs
WHERE ST_Overlaps(sa.zone, ST_Buffer(hs.location, 26400));
sensitive_area Summerhill Elementary School
hazardous_site Landmark Industrial
sensitive_area Johnson County Hospital
hazardous_site Landmark Industrial
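For the rectangular region in the question, the polygon text passed to ST_PolyFromText in the window query quoted above can be built from the rectangle's corner coordinates. The sketch below is only a string-building helper in Python: the function names, the table name points and the column name location are hypothetical, and the SRID value must match whatever your own data uses.

```python
def rectangle_wkt(xmin, ymin, xmax, ymax):
    """Return the polygon text of the closed rectangle with the given corners."""
    if xmin > xmax or ymin > ymax:
        raise ValueError("min corner must not exceed max corner")
    # The ring is closed by repeating the first corner at the end.
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax), (xmin, ymin)]
    ring = ",".join(f"{x} {y}" for x, y in corners)
    return f"polygon(({ring}))"


def window_query(table, column, xmin, ymin, xmax, ymax, srid):
    """Build a SELECT that finds rows whose geometry envelope intersects the rectangle.

    table and column are interpolated as-is and must be trusted identifiers.
    """
    wkt = rectangle_wkt(xmin, ymin, xmax, ymax)
    return (
        f"SELECT * FROM {table} "
        f"WHERE SE_EnvelopesIntersect({column}, ST_PolyFromText('{wkt}', {srid}));"
    )


if __name__ == "__main__":
    # Hypothetical table "points" with an ST_Point column "location".
    print(window_query("points", "location", 20000, 20000, 60000, 60000, 5))
```

For point data the envelope of each geometry is the point itself, so an envelope intersection test against the rectangle should select the points that lie in it; check the DataBlade documentation for how boundary points are treated.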