I need help with Cytoscape.js. I am just getting started with this library and can't find an answer in the Cytoscape guide.
I created nodes and edges with specific weights.
For a given source node and target node, I'd like to highlight the two best paths: the first in green and the second in red.
I don't care whether it uses Dijkstra, A*, etc., as long as it does the job. Both paths may share edges if there is no other choice (for example, if the source node has only one connection to its neighbor). In order to filter out edges already used by the first path, I added specific data fields (isPrimary and isDiverse) to each edge, such as:
data: { id: '1', source: 'node1', target: 'node2', weight: 0, isPrimary: 0, isDiverse: 0 }
If the primary path uses this edge, it will flag isPrimary to true. But I cannot make it work.
Do you have a concrete example that would do the job? I have been stuck for weeks now.
Many thanks for your support.
A.
I am assuming that by best path you mean the shortest path.
The problem you describe is known as k-shortest path. Many algorithms have been developed to solve this problem. But unfortunately, as far as I know, none of them is implemented in Cytoscape.js.
Your best bet is to implement Yen's algorithm on top of the Dijkstra implementation in Cytoscape.js. Here is pseudocode (page 13).
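For reference, a rough JavaScript sketch of Yen's algorithm built on Cytoscape.js's eles.dijkstra() might look like this. kShortestPaths is an illustrative name, not a Cytoscape.js API, and edges are assumed to carry a numeric weight in their data:

// A rough sketch of Yen's k-shortest-paths on top of Cytoscape.js's
// built-in eles.dijkstra(). `kShortestPaths` is an illustrative name,
// not a Cytoscape.js API; edges are assumed to carry a numeric `weight`.
function kShortestPaths(cy, sourceId, targetId, k) {
  const weight = e => e.data('weight');
  const target = cy.getElementById(targetId);
  const cost = path => path.edges().reduce((s, e) => s + weight(e), 0);
  // Run Dijkstra on a sub-collection; return null if target is unreachable.
  const dijkstraPath = (eles, root) => {
    const d = eles.dijkstra({ root, weight });
    return d.distanceTo(target) === Infinity ? null : d.pathTo(target);
  };

  const first = dijkstraPath(cy.elements(), cy.getElementById(sourceId));
  if (!first) return [];
  const paths = [first];
  const candidates = [];

  for (let i = 1; i < k; i++) {
    const prev = paths[i - 1];
    const prevNodes = prev.nodes();
    for (let j = 0; j < prevNodes.length - 1; j++) {
      const spur = prevNodes[j];
      // Prefix of the previous path up to the spur node (nodes and edges
      // alternate in the ordered path collection).
      const rootPath = prev.slice(0, 2 * j + 1);

      // Hide the edge that each already-found path with the same prefix
      // takes out of the spur node, plus the root-path nodes themselves,
      // so the spur path cannot revisit them.
      let hidden = cy.collection();
      for (const p of paths) {
        if (p.slice(0, 2 * j + 1).same(rootPath)) {
          hidden = hidden.union(p[2 * j + 1]);
        }
      }
      hidden = hidden.union(rootPath.nodes().not(spur));
      const sub = cy.elements().not(hidden.union(hidden.connectedEdges()));

      const spurPath = dijkstraPath(sub, spur);
      if (spurPath) {
        const total = rootPath.union(spurPath);
        if (!candidates.some(c => c.same(total))) candidates.push(total);
      }
    }
    if (candidates.length === 0) break;
    candidates.sort((a, b) => cost(a) - cost(b));
    paths.push(candidates.shift());
  }
  return paths; // up to k path collections, best first
}

Each returned element is an ordered Cytoscape collection of nodes and edges, so you can style the first two results green and red directly.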
Thanks for your answer. I managed to add a parameter (data) to each edge and flag it as true when it is used by the best path. Then, to find the second-best path, which should not reuse the first one, I added a condition that avoids edges whose flag is set to true.
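For anyone else who lands here, a minimal sketch of that flagging approach could look like the following, assuming a cytoscape instance cy with numeric weight data on every edge; highlightTwoBestPaths is an illustrative name:

// Minimal sketch of the edge-flagging approach described above. Assumes a
// cytoscape instance `cy` whose edges carry a numeric `weight` in data;
// `highlightTwoBestPaths` is an illustrative name, not a Cytoscape.js API.
function highlightTwoBestPaths(cy, sourceId, targetId) {
  const weight = e => e.data('weight');
  const source = cy.getElementById(sourceId);
  const target = cy.getElementById(targetId);

  // 1) Best path on the full graph, flagged and drawn in green.
  const d1 = cy.elements().dijkstra({ root: source, weight });
  const primary = d1.pathTo(target);
  primary.edges().forEach(e => e.data('isPrimary', 1));
  primary.edges().style('line-color', 'green');

  // 2) Second-best path on the graph minus the flagged edges.
  const remaining = cy.elements().filter(
    ele => ele.isNode() || ele.data('isPrimary') !== 1
  );
  const d2 = remaining.dijkstra({ root: source, weight });
  // If dropping the primary edges disconnects source from target, fall
  // back to the full graph so both paths may share edges, as in the
  // question's single-connection example.
  const secondary = d2.distanceTo(target) === Infinity
    ? d1.pathTo(target)
    : d2.pathTo(target);
  secondary.edges().forEach(e => e.data('isDiverse', 1));
  secondary.edges().style('line-color', 'red');
}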
I'm trying to figure out the best way to analyse a Grasshopper/Rhino floor plan. I am trying to create a room map to determine how many doors it takes to reach an exit in a residential building. The inputs are the room curves, names, and doors.
I have tried to use space syntax or SYNTACTIC, but some of the components are missing. A lot of the plugins I have been looking at are good at creating floor plans but not at analysing them.
Your help would be greatly appreciated :)
You could create some sort of spine that goes through the rooms, passing only through doors, and then do some path-finding across that topology, counting how many "hops" you need to reach the exit.
One way to get the topology is to create a data structure (a tuple, a key-value pair) that holds the curve (room) and a point (the door). Then loop over each pair of rooms and check whether a door point of one room is closer than some threshold to a door point of the other; if it is, store the relationship as a graph. (In the abstract sense you don't really need to make lines out of it, but if you plan to use other plugins for path-finding, lines can be useful.) Finally, run some path-finding (Dijkstra's, A*, etc.) to find the shortest distance, as sketched below.
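If it helps, here is a small JavaScript sketch of that idea; the room/door representation and the tolerance are illustrative assumptions, and in Grasshopper you would port this to a scripting component:

// Build a room adjacency graph from shared door points, then BFS to count
// door "hops" to the exit. Each room is assumed to be { name, doors: [{x, y}] }.
function buildRoomGraph(rooms, tol = 0.01) {
  const adj = new Map(rooms.map(r => [r.name, []]));
  const close = (a, b) => Math.hypot(a.x - b.x, a.y - b.y) <= tol;
  for (const a of rooms) {
    for (const b of rooms) {
      if (a === b) continue;
      // Rooms sharing (approximately) the same door point are neighbors.
      if (a.doors.some(da => b.doors.some(db => close(da, db)))) {
        adj.get(a.name).push(b.name);
      }
    }
  }
  return adj;
}

function hopsToExit(adj, start, exit) {
  const queue = [[start, 0]];
  const seen = new Set([start]);
  while (queue.length) {
    const [room, hops] = queue.shift();
    if (room === exit) return hops; // number of doors passed through
    for (const next of adj.get(room) ?? []) {
      if (!seen.has(next)) { seen.add(next); queue.push([next, hops + 1]); }
    }
  }
  return -1; // exit unreachable
}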
As for SYNTACTIC: if copying the GHA (after unblocking it) from the installation path to the special components folder (or pointing to that folder from _GrasshopperDeveloperSettings) doesn't work, tick the "Memory load *.GHA assemblies using COFF byte arrays" option in _GrasshopperDeveloperSettings.
*Note that SYNTACTIC won't give you any automatic topology.
If you need some pseudo-code just write a comment and I'd be happy to help.
I've been looking into Google Dataprep as an ETL solution to perform some basic data transformation before feeding it to a machine learning platform. I'm wondering if it's possible to use the Dataprep/Dataflow tools to split a dataset into train, test, and validation sets. Ideally I'm looking to do a stratified split on a target column, but for starters I'd settle for a simple uniform random split by percent of whole (e.g. 50% train, 30% validation, 20% test).
So far I haven't been able to find anything about whether this is even possible with Dataprep, so I'm wondering if anyone knows definitively if this is possible and, if so, how to accomplish it.
EDIT 1
Thanks #jakub-janoštík for getting me going in the right direction! I modified your answer slightly and came up with the following (in wrangle form):
case condition: customConditions cases: [false,0] default: rand() as: 'split_condition'
case condition: customConditions cases: [split_condition < 0.6,'train'],[split_condition >= 0.8,'test'] default: 'validation' as: 'dataset_type'
drop col: split_condition action: Drop
By assigning random values in a separate step, I got the guaranteed percentage split I was looking for. The flow ended up looking like this:
Image: final flow diagram with dataset splitting
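For readers outside Dataprep, a plain-JavaScript sketch of the same two-step logic shows why drawing rand() once into its own column gives a consistent split: each row's single random draw is reused for both threshold comparisons (thresholds mirror the wrangle steps above):

// Plain-JS illustration of the two wrangle steps: draw rand() once per row
// ('split_condition'), then label each row by threshold ('dataset_type').
function assignDatasetType(rows) {
  return rows.map(row => {
    const split = Math.random();            // step 1: 'split_condition'
    const type = split < 0.6 ? 'train'      // step 2: 'dataset_type'
               : split >= 0.8 ? 'test'
               : 'validation';
    return { ...row, dataset_type: type };
  });
}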
EDIT 2
I just figured out how to do the stratified split too, so I thought I'd add it in case anyone else is trying to do this. Here are the rough steps:
Split your dataset based on whatever subpopulations you're targeting (e.g. target0, target1)
For each subpopulation, do the uniform random split described above (e.g. now you have target0-train, target0-test, target0-validation, target1-train, etc.)
For each set type (i.e. train, test, validation):
Create a new recipe from one of the sets
Edit the recipe, and use the Union transform to merge it with other datasets of the same type (e.g. target0-train union with target1-train). The union button is in the middle of the toolbar on the Edit Recipe page.
I hope that's helpful to someone!
I'm looking at the same problem, and I was able to partially solve it using the "case on custom condition" and "Random" functions. What I do is create a new column named target and apply a case on a custom condition that compares RAND() against the split thresholds to assign each row one of the three labels (EDIT 1 above shows a variant of this logic).
After applying this, you'll have a new column with these three labels, and you can generate three new datasets by applying row-filtering rules based on those values. One thing to keep in mind is that each time you run the job you'll get a different validation set, so if you want to keep it fixed you need to use the dataset created in the first run as input for future runs (and randomise only the train and test sets).
If you need more control over the distribution of labels in your datasets, there is the ROWNUMBER window function that could potentially be used, but I haven't been able to make it work yet.
We are trying to find a way to create a full distance matrix in a Neo4j database, where distance is defined as the length of the shortest path between any two nodes. Of course, there is the shortestPath method, but looping over all pairs of nodes and calculating their shortestPath gets very slow. We are explicitly not talking about allShortestPaths, because that returns all shortest paths between two specific nodes.
Is there a specific method or approach that is fast for a large number of nodes (>30k)?
Thank you!
j.
There is no easier method; the full distance matrix will take a long time to build.
As you've described it, the full distance matrix must contain the shortest path between every pair of nodes, which means you will have to compute that information at some point. Iterating over each pair of nodes and running a shortest-path algorithm is the only way to do this, and the complexity will be O(n²) multiplied by the complexity of the algorithm.
But you can cut down on the runtime with a dynamic programming solution.
You could certainly leverage some dynamic programming methods to cut down on the calculation time. For instance, if you are trying to find the shortest path between (A) and (C), and have already calculated the shortest path from (B) to (C), then if you happen to encounter (B) while pathfinding from (A), you do not need to recalculate the rest of the cost of that path; it is already known.
However, a dynamic programming solution of any reasonable complexity will almost certainly be best done in a separate module for Neo4j, packaged as a plugin. If this is a one-time operation, or one that won't run frequently, it might be easier to just do the naive solution of calling shortestPath between each pair; but if you plan to run it fairly frequently on dynamic data, it might be worth authoring a custom plugin. It totally depends on your needs.
No matter what, though, it will take some time to calculate. The dynamic programming solution will cut down on the time greatly (especially in a densely-connected graph), but it will still not be very fast.
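For reference, a minimal sketch of the naive all-pairs computation using the official neo4j-driver for JavaScript could look like this; the Node label, connection URI, and credentials are illustrative assumptions:

// Compute the length of the shortest path between every pair of nodes in
// one Cypher query, rather than looping per pair from the client.
const neo4j = require('neo4j-driver');

async function distanceMatrix() {
  const driver = neo4j.driver('bolt://localhost:7687',
                              neo4j.auth.basic('neo4j', 'password'));
  const session = driver.session();
  try {
    const result = await session.run(`
      MATCH (a:Node), (b:Node)
      WHERE id(a) < id(b)                      // each unordered pair once
      MATCH p = shortestPath((a)-[*]-(b))
      RETURN id(a) AS from, id(b) AS to, length(p) AS distance
    `);
    return result.records.map(r => ({
      from: r.get('from').toNumber(),
      to: r.get('to').toNumber(),
      distance: r.get('distance').toNumber(),
    }));
  } finally {
    await session.close();
    await driver.close();
  }
}

For more than 30k nodes this is roughly 450 million pairs, so as noted above, expect it to take a long time regardless.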
What is the end game? Is this a one-time query that resets some property or creates new edges, or a recurring, frequent effort? If it's one-time, you might create edges between the two nodes at each step, producing a transitive-closure environment. Each edge would point between the two nodes and have the distance as a property.
Thus, if the path is a>b>c>d, you would create the edges
a>b 1
a>c 2
a>d 3
b>c 1
b>d 2
c>d 1
The edges could be named distinctively to distinguish them from the original path edges. This could create circular paths, which may either negate this strategy or require a constraint; if you are dealing with directed acyclic graphs, it would work well.
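A hedged sketch of that one-time edge-creation step as a single Cypher statement, run via the neo4j-driver session from the earlier sketch; the Node label and the STEP and DISTANCE relationship types are illustrative:

// Illustrative: materialize the transitive closure as DISTANCE edges,
// each storing the hop count of the shortest path between its endpoints.
await session.run(`
  MATCH (a:Node), (b:Node)
  WHERE id(a) < id(b)                      // each unordered pair once
  MATCH p = shortestPath((a)-[:STEP*]-(b))
  MERGE (a)-[d:DISTANCE]->(b)
  SET d.distance = length(p)
`);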
I am trying to determine whether food packaging has an error or not. For example: does the "McDonald's" logo have misprints, such as a wrong label or wrong color? (I cannot post a picture.)
What should I do? Please help me!
It's not a trivial task by any stretch of the imagination. Two images of the same identical object will always be different according to lighting conditions, perspective, shooting angle, etc.
Basically you need to:
1. Process the two images into "digested" data: dominant color, shapes, etc.
2. Design and run your own similarity algorithm between the two objects
You may want to look at feature detectors in OpenCV: SURF, SIFT, etc.
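Before reaching for feature detectors, even a coarse color-histogram comparison can catch wrong-color prints. A self-contained JavaScript sketch, assuming you already have raw RGBA pixel buffers for both images (the bin count and threshold are illustrative values):

// Digest each image into a coarse RGB histogram, then compare the two
// digests with histogram intersection (1.0 = identical distributions).
function colorHistogram(rgba, bins = 4) {
  const hist = new Array(bins * bins * bins).fill(0);
  const step = 256 / bins;
  for (let i = 0; i < rgba.length; i += 4) {
    const r = Math.floor(rgba[i] / step);
    const g = Math.floor(rgba[i + 1] / step);
    const b = Math.floor(rgba[i + 2] / step);
    hist[(r * bins + g) * bins + b]++;
  }
  const total = rgba.length / 4;
  return hist.map(count => count / total); // normalize to proportions
}

function histogramSimilarity(histA, histB) {
  let sim = 0;
  for (let i = 0; i < histA.length; i++) sim += Math.min(histA[i], histB[i]);
  return sim; // in [0, 1]
}

// Usage: flag a likely misprint when similarity drops below a threshold
// tuned on known-good samples (0.9 here is an illustrative value).
// const suspect = histogramSimilarity(colorHistogram(refPixels),
//                                     colorHistogram(prodPixels)) < 0.9;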
I just found your question while searching for something else, so I think I may be too late.
If not, I think your problem can easily be solved: a tool for this has existed for years and is called Sikuli.
While it's intended for testing purposes, I have been using it in the same way you need: comparing a reference image and a production image. Based on OpenCV, it does this very well.
Is there a way to search for polygons that are inside another polygon with Elasticsearch?
If not, is it possible with Solr or another system?
Totally possible on Elasticsearch:
http://elasticsearch-users.115913.n3.nabble.com/Can-I-use-geo-polygon-filter-to-retrieve-hits-based-on-polygon-fields-td4044079.html
I'm just looking into implementing it myself; my only worry is performance on a high-traffic site, so we'll see what happens.
As this post was a while ago it would be interesting to know what you ended up doing...
With Solr 4.3 it just became possible; I finished working on it a couple of weeks ago and I'm pretty excited about it. To learn how to use the new Solr 4 spatial field, see: http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
What's new is that you can now use the "IsWithin" and "Contains" predicates; there's "IsDisjointTo" too. Based on your question, it's not clear to me which of those you want. Imagine a 3-part sentence in which the first/left part is your index data, the middle is the spatial predicate, and the last part is your query shape. So if you want to search for indexed shapes that are WITHIN your query shape, use "IsWithin". I was just about to update the wiki to show these predicates.
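To make that concrete, a hedged JavaScript sketch of such a query against a Solr core could look like this; the geo field name, core name, and polygon are illustrative, and the IsWithin syntax follows the wiki page above:

// Find indexed shapes lying within the query polygon (WKT, closed ring).
const params = new URLSearchParams({
  q: '*:*',
  fq: 'geo:"IsWithin(POLYGON((-10 -10, 10 -10, 10 10, -10 10, -10 -10)))"',
  wt: 'json',
});
fetch(`http://localhost:8983/solr/collection1/select?${params}`)
  .then(res => res.json())
  .then(data => console.log(data.response.docs));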
Pretty sure it's not possible with ES.
With Solr + some plugins I think it's possible, but I haven't tried it myself.
Have a look at https://github.com/spatial4j/spatial4j
Shape classes that are geospatially aware
Shapes: Point, Rectangle, Circle, Polygon (via JTS)
Shape intersection logic, yielding: disjoint, contains, within, intersects
Bounding box and area calculation
It seems Spatial4j is already included in Solr. See the response from David Smiley (author of Spatial4j and a committer to Solr) at the link below:
How to install spatial4j into solr4