Finding common paths in two graphs using python-networkx - path

I have two DiGraphs, say G and H, and I would like to count how many paths of G are part of H.
For any node pairs (src, dst) I can generate the paths between them using the 'all_simple_paths' function to get the generators:
G_gen = nx.all_simple_paths(G, src, dst)
H_gen = nx.all_simple_paths(H, src, dst)
Since the number of paths is considerably high (the graphs typically have 100 nodes), I cannot resort to building lists (e.g. list(G_gen)), so I am wondering whether there are smarter ways to deal with it. In addition, I would also like to distinguish the paths by length.
Or maybe a better solution can be found with a different module?
Thanks in advance for any help on this.
Thierry

I wonder if there is some reason why nx.intersection wouldn't work here? I'm not sure whether it checks edge direction under the hood, but it doesn't seem to force its result into a standard (undirected) Graph either. Something like the below might work:
import networkx as nx

# Create a couple of random preferential attachment graphs
G = nx.barabasi_albert_graph(100, 5)
H = nx.barabasi_albert_graph(100, 5)
# Convert to directed
G = G.to_directed()
H = H.to_directed()
# Get intersection
intersection = nx.intersection(G, H)
# Print a summary of each graph
# (nx.info was removed in NetworkX 3.0; print(G) gives the same one-line summary in recent versions)
print(G)
print(H)
print(intersection)
which outputs:
>>> DiGraph with 100 nodes and 950 edges
>>> DiGraph with 100 nodes and 950 edges
>>> DiGraph with 100 nodes and 176 edges
The nodes are all shared in the example since the node ids are just simple integers and so they follow the same generation index. With real data I suppose your node sets might not be equivalent like here and you probably will see differences there too.
On the path lengths, I'm not quite sure how you would go about that. The intersection just checks which nodes and edges are shared between the two graphs and returns those that are in both, unaware of any other conditions, I suspect. There might be a way to impose additional constraints by adapting the source code of the intersection function with some conditional checks.
I guess this doesn't check the number of paths but rather the number of edges, so I suppose you're looking for something more specific than this. But at the very least no path can exist outside of the intersection, since all shared paths must contain the same edges in both (since if an edge is missing from a path in either, it cannot exist as a path in the shared solution).
Hope this helps in some way shape or form, though I feel I've oversimplified your question quite a bit.
EDIT: Intuitively, the full solution to your question might be to simply enumerate all possible paths in the intersection.
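To make that idea concrete, here is a rough sketch rather than a definitive implementation. It walks only the edges that exist in both graphs and tallies the simple paths by length, without ever building a list of paths. The function name count_common_paths and the cutoff parameter are my own inventions; the graphs only need to support `node in G`, `G[node]` and `len(G)`, so an nx.DiGraph or a plain dict of successor sets should both work.

```python
from collections import Counter


def count_common_paths(G, H, src, dst, cutoff=None):
    """Count simple src -> dst paths whose every edge is in both G and H.

    Returns a Counter mapping path length (number of edges) to path count.
    If src == dst, the trivial path of length 0 is counted once.
    """
    counts = Counter()
    if cutoff is None:
        cutoff = len(G) - 1  # a simple path cannot use more edges than this

    def shared_successors(u):
        # Successors of u reached through an edge present in both graphs.
        if u not in G or u not in H:
            return []
        return [v for v in G[u] if v in H[u]]

    def dfs(u, length, visited):
        if u == dst:
            counts[length] += 1
            return
        if length >= cutoff:
            return
        for v in shared_successors(u):
            if v not in visited:
                visited.add(v)
                dfs(v, length + 1, visited)
                visited.discard(v)

    dfs(src, 0, {src})
    return counts


# Toy graphs as dicts of successor sets (made-up data for illustration).
G = {1: {2, 3}, 2: {3}, 3: {4}, 4: set()}
H = {1: {2, 4}, 2: {3}, 3: {4}, 4: set()}
print(count_common_paths(G, H, 1, 4))  # Counter({3: 1})
```

Note that the number of paths can still blow up on dense graphs, so passing a cutoff is probably a good idea for graphs with 100 nodes.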

Related

Return path - Bug Algorithms

I am trying to develop a simple algorithm in a MBSE tool to obtain a path between a start and an end points.
Here a summary:
Example
I have all the information about the start and end points, and I know everything about each polygon's points (x and y positions, distance to the next point, indexes). What I do at the moment is check the intersection between the m-line (the Start-End line) and each separate polygon edge, one at a time, and return the intersection points sorted by distance to the start point, together with the index where each intersection is found. (see Picture)
I think this is similar to the Bug2 algorithm, however I know in advance all the coordinates.
I would like to obtain the green path shown in the picture: [End, E, A, 29, B, C, 53, 52, 51, D, Start]. Ideally this path is the shortest one that reaches the end point. The problem I am having at the moment is that I cannot extract the path and check that it is the shortest.
Any good ideas on how I could do that? I am missing a good logic to do that. Any suggestion/articles would be really appreciated. Thanks in advance.
(I am assuming that all points are sorted in counterclockwise order, and that I have to follow the polygon edges and cannot cross the polygons.)
I am able to find the intersection points and the indexes where the intersections occur. This information is local to each polygon, and the polygons are not ordered. I have tried slicing the arrays and merging the coordinates together, but I don't like this approach and was looking for a cleaner way to merge the local information available for each polygon into a single path. I have explored Dijkstra's algorithm to solve the issue, but maybe there is something else.

Cypher (Neo4j) Match all paths with specific length and value

I'm new to Cypher and Neo4j, but I find it really interesting and am trying to use it to solve a math problem that I have. To make the problem easy to illustrate, I've scaled it down, and I'm hoping you can help me find the right logic.
The Math problem: Given a set of tiles, how many ways can you select 3 tiles, with the sum less than x?
In my example, let's just say that I have 5 tiles (100, 100, 80, 80, 50), and that I have to include at least one 100-tile, and that x is 270.
Since the order doesn't matter, the way I think about the problem is that I start at the highest nr, and then from there can choose to go to either the same nr again, or the next lower number, or the second lower number. This would mean, that starting at 100, I could choose to select either another 100, or 80 (the next lower one), or 50 (the second lower one).
So far, I'm able to define a path starting at 100 and going 2 steps further to m:
MATCH path = (n:Node {value:100})-[:CONNECTED*2]-(m)
QUESTION:
How do I find all paths with a specific sum of the nodes.value?
Since the order doesn't matter, I'm only interested in the unique one-way paths. (Meaning, for example, that if I get one path as 100-80-50, then I'm not interested in the path 50-80-100, since that contains the exact same tiles, just in a different order.)
Thanks!
Do you mean this?
MATCH path = (n:Node {value:100})-[:CONNECTED*2]-(m)
WITH REDUCE(x=0,n in nodes(path)|x+n.value) as expected, [n in nodes(path)|n.value] as listNode
WHERE expected >100
RETURN listNode
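As a cross-check outside Neo4j, the scaled-down tile problem from the question is small enough to brute-force. The sketch below is only a sanity check of the expected answer, not a replacement for the Cypher query; the function name tile_selections and its parameters are made up for illustration.

```python
from itertools import combinations


def tile_selections(tiles, size, limit, required):
    """Return the distinct selections of `size` tiles that include at least
    one `required` tile and whose sum is strictly below `limit`.

    Order does not matter, so each selection is stored as a tuple sorted
    from highest to lowest value and duplicates are dropped via a set.
    """
    selections = set()
    for combo in combinations(tiles, size):
        if required in combo and sum(combo) < limit:
            selections.add(tuple(sorted(combo, reverse=True)))
    return sorted(selections, reverse=True)


tiles = [100, 100, 80, 80, 50]
selections = tile_selections(tiles, size=3, limit=270, required=100)
print(selections)       # [(100, 100, 50), (100, 80, 80), (100, 80, 50)]
print(len(selections))  # 3
```

So for the example in the question, a correct query should end up with three unique value combinations; 100-100-80 is excluded because it sums to 280.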

Efficiently Finding all paths between 2 nodes in a directed graph - RGL Gem

I am struggling to find an efficient algorithm that will give me all possible paths between 2 nodes in a directed graph.
I found the RGL gem, which is the fastest so far in terms of calculations. I am able to find the shortest path using Dijkstra's shortest path algorithm from the gem.
I googled, and in spite of finding many solutions (Ruby and non-Ruby), either I couldn't convert the code or the code took forever to run (inefficient).
I am here primarily to ask whether someone can suggest a way to find all paths by using or tweaking algorithms from the RGL gem itself (if possible), or some other efficient way.
The input for the directed graph can be an array of arrays:
[[1,2], [2,3], ..]
P.S. : Just to avoid negative votes/comments, unfortunately I don't have inefficient code snippet to show as I discarded it days ago and didn't save it anywhere for the record or reproduce here.
The main problem is that the number of paths between two nodes grows exponentially in the number of overall nodes. Thus any algorithm finding all paths between two nodes, will be very slow on larger graphs.
Example:
As an example, imagine a grid of n x n nodes, each connected to its 4 neighbors. Now you want to find all paths from the bottom-left node to the top-right node. Even when you only allow moves to the right (r) and moves up (u), every path needs n-1 moves of each kind, so the paths correspond to the strings of length 2(n-1) with equal numbers of r's and u's. This gives "2(n-1) choose n-1" possible paths (ignoring other moves and cycles), which grows exponentially in n.
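To get a feel for how fast this grows, here is a small sketch that counts the right/up-only paths when k moves right and k moves up are needed to cross the grid. The function names are made up for illustration; the brute-force version is only there to confirm the closed form on small inputs.

```python
from math import comb


def count_monotone_paths(k):
    """Closed form: choose which k of the 2k moves go right."""
    return comb(2 * k, k)


def enumerate_monotone_paths(rights, ups):
    """Count the same paths by brute force, one move at a time."""
    if rights == 0 or ups == 0:
        return 1  # only one way left: use up the remaining moves in a row
    return (enumerate_monotone_paths(rights - 1, ups)
            + enumerate_monotone_paths(rights, ups - 1))


# The two approaches agree on a small case.
assert count_monotone_paths(3) == enumerate_monotone_paths(3, 3) == 20

for k in (1, 2, 5, 10, 20):
    print(k, count_monotone_paths(k))
```

Already at k = 20 the count is in the hundreds of billions, and that is before allowing left and down moves, so any algorithm that lists every path will be slow regardless of the library used.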

Mapping points from Euclidean 2-space onto a Poincaré disc

For some reason it seems that everyone writing webpages about Poincaré discs is only concerned with how to represent lines and measure distances.
I'd like to morph a collection of 2D points (as defined by x,y coordinates in the Euclidean plane) onto a Poincaré disc, but I have no idea what the algorithm is supposed to be like. At this point I don't even know if it's possible to create a mapping between Euclidean 2-space and a Poincaré disc...
Any pointers?
Goodwill,
David
You describe your data as a collection of points. But from your comments, you want to make lines in the plane still map to lines in the disk. You seem to want to preserve the "structure" of the space somehow, which is probably why you use the term "morph". I think that you want a conformal map.
There is no conformal bijection between the disk and the plane. There is such a mapping between the half-plane and the disk, and it preserves "lines", but not the kind that you want, unfortunately.
You said "I don't even know if it's possible to create a mapping" ... there are a number of mappings for you to choose from (see the Unit Disk page for an example) but there are none with all the features you seem to want.
If I understand everything correctly, the answer you got on the other forum is for the Beltrami–Klein model. Once you have that, you can get to the coordinates in the Poincaré disk with
p = b / (1 + sqrt(1 - b * b))
where p is the vector of coordinates in the Poincaré disk (i.e. what you need), b is the vector in the Beltrami–Klein model (i.e. what you get from the other answer), and b * b is the dot product of b with itself, i.e. its squared length.
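A minimal sketch of that conversion for 2D points, assuming the Beltrami–Klein coordinates are given as an (x, y) pair inside the unit disk. The function name klein_to_poincare is made up for illustration.

```python
import math


def klein_to_poincare(b):
    """Map a point from the Beltrami-Klein disk to the Poincare disk.

    Implements p = b / (1 + sqrt(1 - b.b)), where b.b is the squared
    Euclidean length of b. Points outside the unit disk are rejected.
    """
    bx, by = b
    norm_sq = bx * bx + by * by
    if norm_sq > 1.0:
        raise ValueError("point must lie inside the unit disk")
    scale = 1.0 / (1.0 + math.sqrt(1.0 - norm_sq))
    return (bx * scale, by * scale)


print(klein_to_poincare((0.0, 0.0)))  # the centre stays at the centre
print(klein_to_poincare((0.6, 0.0)))  # roughly (0.333, 0.0)
```

The map keeps the direction of each point and only pulls it towards the centre, so points near the rim of the Klein disk stay near the rim of the Poincaré disk.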

How to display the results of multiple comparisons

If you compare two sets of data (such as two files), the differences between these sets can be displayed in two columns, or two panes, such as WinMerge does.
But are there any visual paradigms to display the differences between multiple data sets?
Update
The starting point of my question was the assumption that displaying differences between 2 files is relatively easy, as I mentioned WinMerge, whereas comparing 3 or more text files turns out to be more complicated, as there will be more and more differences between, say, different versions of a document that have been created over time.
How would you highlight parts of the file that are the same in 2 versions, but different from other versions?
The data sets I have in mind are objects (A, B, C, ...) which may or may not exist and have properties (a, b, c, ...) which may be set or not set.
Example:
Set 1: A(a, b, c), B(b, c), C(c)
Set 2: A(a, b, c), B(b), C(c)
Set 3: A(a, b), B(b)
If you compare 2 sets, e.g. 1 and 2, the difference would be in B(c). Comparing sets 2 and 3 results in the difference A(c) and C().
If you compare all 3 sets, you end up with 3 comparisons (n * (n-1) / 2)
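To make the example concrete, here is a small sketch that computes all pairwise differences for the three sets above. The dictionary layout and the function name diff are assumptions made for illustration; an object that is missing from one side is reported together with all of the properties it has on the other side.

```python
from itertools import combinations

# The three example sets: object name -> set of properties that are set.
data_sets = {
    "Set 1": {"A": {"a", "b", "c"}, "B": {"b", "c"}, "C": {"c"}},
    "Set 2": {"A": {"a", "b", "c"}, "B": {"b"}, "C": {"c"}},
    "Set 3": {"A": {"a", "b"}, "B": {"b"}},
}


def diff(left, right):
    """Return {object: properties found on exactly one side}.

    An object that exists on only one side is always reported, even if it
    has no properties at all.
    """
    result = {}
    for obj in sorted(left.keys() | right.keys()):
        changed = left.get(obj, set()) ^ right.get(obj, set())
        if changed or (obj in left) != (obj in right):
            result[obj] = changed
    return result


# n * (n - 1) / 2 comparisons: one per unordered pair of sets.
pairwise = {
    (a, b): diff(data_sets[a], data_sets[b])
    for a, b in combinations(data_sets, 2)
}
for pair, differences in pairwise.items():
    print(pair, differences)
```

The output of each comparison is itself a small data set, which is what makes the display problem hard: with n sets there are n * (n - 1) / 2 of these to show at once.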
I have a different view than some of those who provided Answers--i.e., that you need to further specify the problem. The abstraction level is about right. Further specification would make the problem easier, but the solution less useful.
A couple of years ago, I saw a graphic on ProgrammableWeb--it compared the results from a search on Yahoo with the results from the same search on Google. There's a lot of information to convey: some results are in both sets, some in just one, and the common results will have different positions in the respective engine's results, which somehow has to be shown.
I liked the graphic and reimplemented it in Matplotlib (a Python scientific plotting library). Below is an example using some random points, as well as the Python code I used to generate it:
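import numpy as NP
from matplotlib import pyplot as PLT

# Each row: x-position of one item on the top line and on the bottom line
xvals = NP.array([(2,3), (5,7), (8,6), (1.5,1.8), (3.0,3.8), (5.3,5.2),
                  (3.7,4.1), (2.9, 3.7), (8.4, 6.1), (7.1, 6.4)])
# Matching y-positions: top line at y=5, bottom line at y=3
yvals = NP.tile( NP.array([5,3]), [10,1] )
# x, y and y2 were not defined in the original snippet; these two horizontal
# baselines are an assumption based on the y-values used above
x = NP.array([xvals.min(), xvals.max()])
y = NP.full(2, 5.0)
y2 = NP.full(2, 3.0)
fig = PLT.figure()
ax1 = fig.add_subplot(111)
ax1.plot(x, y, "-", lw=3, color='b')
ax1.plot(x, y2, "-", lw=3, color='b')
# One connector per item between its positions on the two lines
for a, b in zip(xvals, yvals):
    ax1.plot(a, b, '-o', ms=8, mfc='orange', color='g')
PLT.axis("off")
PLT.show()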
This model has some interesting features: (i) it actually deals with 'similarity' on a per-item basis (the vertically-oriented line connecting the dots) rather than aggregate similarity; (ii) the degree of similarity between two data points is proportional to the angle of the line connecting them--90 degrees if they are equal, with a decreasing angle as the difference increases; this is very intuitive; (iii) cases in which a point in one data set is not present in the second data set are easy to show--a point will appear on one of the two lines but without a line connecting it to a point on the other line.
This model works well for comparing search results because each search result has a 'score' (its index, or order in the results list). For other types of data, you might have to assign a score to each data point; a similarity metric might do, I suppose (in a sense, that's actually what the search result order is: a distance from the top of the list).
Since there has been so much work into displaying a diff of two files, you might start by expressing your 'multiple data sets' in an appropriate text format, then using whatever you want to show a diff between those text formats.
But you should tell us more about your data sets!
I experimented a bit, and implemented two displays:
Matrix
Timeline
I agree with Peter, you should specify what type your data is and what you wish to bring out in the comparison.
Depending on the nature of the data/comparison you can consider different visualisations. Is your data ordered or unordered? How many things are you comparing, i.e. fine grain or gross comparison?
Examples:
Visualizing a comparison of unordered data could just be plotting the two histograms of your sets (i.e. distributions):
On the other hand, comparing a huge ordered dataset like DNA can be done innovatively.
Also, check out visual complexity, it's a great resource for interesting visualization.

Resources