Select all nodes that are 1 or 3 nodes away from a gene of interest - cytoscape

I am working with a network and as a filter I would like to keep only genes that are related to a specific gene.
How can I select only those genes that are at most 2 or 3 nodes away from a single node?
Thanks.

If you select the node in the Cytoscape network view, then hit Control-6 (or Apple-6 on a Mac), you'll select all the first neighbors of that node. Do it again to select all of the first neighbors of those nodes and you'll have selected all of the first and second neighbors of the original node. The same thing can be done by using Cytoscape automation (commands, RCy3, or py2cytoscape), and by the menu using Select->Nodes->First neighbors of selected nodes->Undirected.
-- scooter
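Repeating "first neighbors of selected nodes" k times amounts to a breadth-first search that stops after k hops. A minimal Python sketch of that idea, outside Cytoscape, assuming the network is available as an adjacency dictionary (the function name `nodes_within` and the toy network are hypothetical, not part of any Cytoscape API):

```python
from collections import deque

def nodes_within(adjacency, start, max_hops):
    """Return {node: hop_count} for every node at most max_hops edges from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]  # report neighbors only, not the gene of interest itself
    return distance

# Hypothetical toy network; edges are undirected, so each is listed both ways.
toy = {
    "GENE": ["a", "b"],
    "a": ["GENE", "c"],
    "b": ["GENE"],
    "c": ["a", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
two_hops = nodes_within(toy, "GENE", 2)
```

Calling it with `max_hops=2` corresponds to pressing the first-neighbors shortcut twice; the returned hop counts also let you keep only the nodes at exactly a given distance.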

Related

NEO4J How to make graph with relationships

I am completely new to Neo4j and am using it for the first time for my master's program. I've read the documentation and watched tutorials online, but I can't figure out how to represent my nodes the way I want.
I have a dataframe with 3 columns, the first represents a page name, the second also represents a page name, and the third represents a similarity score between those two pages. How can I create a graph in NEO4J where the nodes are my unique page names and the relationships between nodes are drawn if there is a similarity score between them (so if the sim-score is 0 they don’t draw a relationship)? I want to show the similarity score as the text of the relationship.
Furthermore, I want to know if there is an easy way to figure out which node had the most relationships to other nodes?
I’ve added a screenshot of the header of my DF for clarity https://imgur.com/a/pg0knh6. I hope anyone can help me, thanks in advance!
Edit: What I have tried
LOAD CSV WITH HEADERS FROM 'file:///wiki-small.csv' AS line
MERGE (p:Page {name: line.First})
MERGE (p2:Page {name: line.Second})
MERGE (p)-[r:SIMILAR]->(p2)
ON CREATE SET r.similarity = toFloat(line.Sim)
The next block removes the SIMILAR relationships whose similarity is 0:
MATCH ()-[r:SIMILAR]->() WHERE r.similarity = 0
DELETE r
This works partially: it gives me the correct structure of the nodes, but it doesn't show the similarity scores as relationship labels. I also still need to figure out how to find the node with the most connections.
For the first question:
How can I create a graph in NEO4J where the nodes are my unique page names and the relationships between nodes are drawn if there is a similarity score between them (so if the sim-score is 0 they don’t draw a relationship)?
I think a better approach is to remove the rows with similarity = 0.0 before ingesting them into Neo4j. Would that be feasible? If your dataset is not very big, this is quick to do in Python. Otherwise, the solution you propose (deleting the relationships after inserting the data) is also an option.
In case of a big dataset, maybe it's better if you load the data using apoc.periodic.iterate or USING PERIODIC COMMIT.
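A minimal sketch of that pre-filtering step, using only the standard library. It assumes the column names First, Second, and Sim from the LOAD CSV statement in the question; the function name `drop_zero_similarity` and the sample rows are made up for illustration:

```python
import csv
import io

def drop_zero_similarity(source, target, column="Sim"):
    """Copy CSV rows from source to target, skipping rows whose score is 0."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        if float(row[column]) != 0.0:
            writer.writerow(row)
            kept += 1
    return kept

# Hypothetical sample data standing in for wiki-small.csv.
sample = io.StringIO("First,Second,Sim\nA,B,0.8\nA,C,0.0\nB,C,0.25\n")
filtered = io.StringIO()
kept_rows = drop_zero_similarity(sample, filtered)
```

With real files you would pass handles opened with `open(..., newline="")` instead of the `StringIO` objects, then point LOAD CSV at the filtered file.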
Second question
I want to know if there is an easy way to figure out which node had the most relationships to other nodes?
This is an easy query. Again, you can do it with plain Cypher or using the APOC library:
// Plain Cypher
MATCH (n:Page)-[r:SIMILAR]->()
RETURN n.name, count(*) AS cnt
ORDER BY cnt DESC
// APOC
MATCH (n:Page)
RETURN n.name, apoc.node.degree(n, "SIMILAR>") AS output
ORDER BY output DESC;
EDIT
To display the similarity scores in Neo4j Desktop or in the other web interfaces, click on a SIMILAR arrow; the labels are then shown at the top of the running cell. Click on the SIMILAR label marker and, at the bottom of the running cell, to the right of Caption, select the property that you want to show (similarity in your case).
All the arrows are then displayed with the similarity score.
To the second question: I think you should keep a clear separation between the way you store data and the way you visualize it. Showing the similarity score (a property of the SIMILAR edge) as a "label" is best handled by an adequate viz library or platform. Ours (Graphileon) could be such a platform, although there are others as well.
We offer the possibility to "style" the edges with so-called selectors like
"label":"(%).property.simScore" that would use the simScore as a label. On top of that you could do things like
"width":"evaluate((%).properties.simScore < 0.500 ? 3 : 10)"
or
"fillColor":"evaluate((%).properties.simScore < 0.500 ? grey : red)"
to visually distinguish high simScores.
Full disclosure: I work for Graphileon.

How to see all the reachable nodes from a selected node in a directed graph in Gephi

I have a small directed graph that is basically metadata. A single node is a table, and an edge in this context means table1 feeds table2 (the query of table2 refers to table1). When a certain table is impacted for some reason, I want to understand which downstream tables are affected. So if a node A is affected, I want all the nodes reachable from that node. Say the graph is:
Nodes = {A,B,C,D}
Edges = { (A,B) , (B,C) , (C,D) }
Then if A is impacted, I want to see the entire impact list, i.e. {B,C,D}.
I should be able to click on a given node in Gephi and have it highlight the entire subgraph of nodes reachable (since it's a directed graph) from the node I clicked on.
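For reference, the impact list described here is the set of nodes reachable by following edges forward from the chosen node. A minimal Python sketch of that computation, independent of Gephi (the function name `reachable_from` is hypothetical; the edge list is the one from the example):

```python
def reachable_from(edges, start):
    """Return the set of nodes reachable from start along directed edges."""
    successors = {}
    for source, target in edges:
        successors.setdefault(source, []).append(target)
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    seen.discard(start)  # the impacted table itself is not part of the list
    return seen

edges = [("A", "B"), ("B", "C"), ("C", "D")]
impact = reachable_from(edges, "A")
```

Starting from C instead of A would return only {D}, since edges are followed in their direction only.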

Gephi: filter nodes and its neighbors

I'm using Gephi to filter down to a node using its ID. But I need to change this so that all neighbors (connected nodes) are included in the filter too. I tried using the "Neighbor Network" filter in conjunction with the Label filter, but that doesn't seem to work. I still see just one node selected.

Graph: Creating shortest marked path to connect two nodes

Given an undirected, unweighted graph in which some nodes are marked, is there an efficient way to find the unmarked nodes between node A and B which would create a "marked" path from A to B when they are marked? The number of those "bridge" nodes should be minimal.
For example, in the graph below there would be two minimal ways to connect node A to B. One possibility would be to mark the node labelled 1, the other possibility would be to mark node 2.
Convert your graph into a directed, weighted graph such that:
the weight of each edge going into a marked node is set to 0
the weight of each edge going into an unmarked node is 1
Find all lowest-cost paths from A to B.
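Because the weights are only 0 and 1, a lowest-cost path can be found with a 0-1 breadth-first search on a double-ended queue. A minimal sketch under a few assumptions: it returns one lowest-cost path rather than all of them, the start node is never counted, and the goal is counted if it is unmarked (following the weighting rule literally). The function name `fewest_nodes_to_mark` and the toy graph are made up for illustration:

```python
from collections import deque

def fewest_nodes_to_mark(adjacency, marked, start, goal):
    """Return the unmarked nodes on one lowest-cost path, or None if unreachable."""
    cost = {start: 0}
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            step = 0 if neighbor in marked else 1  # entering an unmarked node costs 1
            new_cost = cost[node] + step
            if neighbor not in cost or new_cost < cost[neighbor]:
                cost[neighbor] = new_cost
                parent[neighbor] = node
                if step == 0:
                    queue.appendleft(neighbor)  # free moves are explored first
                else:
                    queue.append(neighbor)
    if goal not in cost:
        return None
    to_mark = []
    node = goal
    while node is not None:
        if node not in marked and node != start:
            to_mark.append(node)
        node = parent[node]
    return to_mark[::-1]

# Hypothetical toy graph: A and B are marked, "1" and "2" are unmarked bridges.
toy = {"A": ["1", "2"], "1": ["A", "B"], "2": ["A", "B"], "B": ["1", "2"]}
bridge = fewest_nodes_to_mark(toy, {"A", "B"}, "A", "B")
```

Enumerating every minimal solution, as the answer suggests, would require keeping all equal-cost parents instead of a single one.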

Singly connected Graph?

A singly connected graph is a directed graph which has at most 1 path from u to v ∀ u,v.
I have thought of the following solution:
Run DFS from any vertex.
Now run DFS again, but this time starting from the vertices in order of decreasing finish time. Run this DFS only for vertices which were not visited in some previous DFS. If we find a cross edge in the same component or a forward edge, then the graph is not singly connected.
If all vertices are finished and no such cross or forward edges were found, then the graph is singly connected.
O(V+E)
Is this right? Or is there a better solution.
Update: at most 1 simple path.
A graph is not singly connected if either of the following two conditions holds:
In the same component, when you do the DFS, you find an edge from a vertex to another vertex that has already finished its search (it is marked BLACK).
When a node points to >=2 vertices from another component, and those 2 vertices are connected, then the graph is not singly connected. But this would require you to keep a depth-first forest.
A singly connected component is any directed graph belonging to the same entity.
It may not necessarily be a DAG and can contain a mixture of cycles.
Every node has at least one link (incoming or outgoing) with at least one other node in the same component.
All we need to do is to check whether such a link exists for the same component.
Singly Connected Component could be computed as follows:
Convert the graph into its undirected equivalent
Run DFS and set the common leader of each node
Run an iteration over all nodes.
If all the nodes have the same common leader, the undirected version of the graph is singly connected.
Else, it consists of multiple singly connected subgraphs represented by their corresponding leaders.
Is this right?
No, it's not right. Consider the following graph, which is not singly connected. The first component comes from a DFS beginning with vertex b and the second component comes from a DFS beginning with vertex a.
The right one:
Do the DFS; the graph is singly connected if all three of the following conditions are satisfied:
no forward edges
no cross edges in the same component
there is no more than 1 cross edge between any two components
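As a cross-check for the single-pass proposals in this thread, here is a simpler but slower sketch that runs a separate DFS from every vertex, O(V·(V+E)) in total, and rejects the graph as soon as one of those searches meets a forward or cross edge. This is a different technique from the ones described above, not an implementation of them. It assumes a graph without parallel edges, given as a dictionary of successor lists; the function name `is_singly_connected` is made up for illustration:

```python
def is_singly_connected(graph):
    """True if no vertex has two distinct simple paths to any other vertex."""
    WHITE, GRAY, BLACK = 0, 1, 2

    def dfs(node, color):
        color[node] = GRAY
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == BLACK:
                # nxt was already finished in this same search:
                # forward or cross edge, hence a second path to nxt
                return False
            if state == WHITE and not dfs(nxt, color):
                return False
            # state == GRAY is a back edge, which is allowed
        color[node] = BLACK
        return True

    vertices = set(graph)
    for targets in graph.values():
        vertices.update(targets)
    # a fresh color map per root, so BLACK always means "same DFS tree"
    return all(dfs(root, {}) for root in vertices)

chain = {"a": ["b"], "b": ["c"]}
diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
chain_ok = is_singly_connected(chain)
diamond_ok = is_singly_connected(diamond)
```

The recursion is fine for small graphs; for large ones an explicit stack would avoid Python's recursion limit.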
