Is there any difference between critical path and longest path in DAG? - graph-algorithm

In every book I looked at, they say that the critical path and the longest path are the same. The problem is that on a critical path all activities have to be critical ones. If I were searching for the longest path, I wouldn't pay any attention to whether activities are critical or not. Or am I missing something?

Consider a graph modeling a project composed of a set of serializable, partially interdependent activities, where the activities are represented by edges and the interdependence by nodes, such that two edges e1, e2 are incident iff activity e1 must be completed before activity e2 can start. Assume two special vertices s, t representing the start and the end of the project, respectively.
In such a model, the critical path describes a maximal sequence of activities that cannot be parallelized with respect to each other.
Its name stems from the fact that a delay in any single activity on the critical path necessarily delays the complete project, while every activity that lies on no critical path has some buffer time available.
In particular, the critical path does not necessarily match those activities that are essential for the overall success of the project.
The critical path corresponds to the longest path between s and t in the graph. Because the longest path determines the project duration, every activity on it has zero buffer time and is therefore critical automatically; you do not need to check criticality while searching for it.
The critical path need not be unique, of course.

From http://en.wikipedia.org/wiki/Longest_path_problem
The critical path method for scheduling a set of activities involves
the construction of a directed acyclic graph in which the vertices
represent project milestones and the edges represent activities that
must be performed after one milestone and before another; each edge is
weighted by an estimate of the amount of time the corresponding
activity will take to complete. In such a graph, the longest path from
the first milestone to the last one is the critical path, which
describes the total time for completing the project.
They cite Sedgewick, Robert; Wayne, Kevin Daniel (2011), Algorithms (4th ed.), Addison-Wesley Professional, pp. 661–666.
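To make the equivalence concrete, here is a minimal sketch of the forward and backward pass over a milestone graph. The function name critical_path, the (from, to, duration) edge format, and the sample project are assumptions made for illustration, and the sketch assumes a single start milestone with no incoming edges. It computes the longest-path length to the final milestone and the slack of every activity, so you can see that the zero-slack activities are exactly the ones on the longest path.

```python
from collections import defaultdict, deque

def critical_path(edges, sink):
    """edges: list of (u, v, duration) activities between milestones u and v (hypothetical format)."""
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v, d in edges:
        succ[u].append((v, d))
        indeg[v] += 1
        nodes.update((u, v))

    # Topological order via Kahn's algorithm.
    order = []
    queue = deque(n for n in nodes if indeg[n] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    # Forward pass: earliest time of each milestone = longest path from the start.
    earliest = {n: 0 for n in nodes}
    for u in order:
        for v, d in succ[u]:
            earliest[v] = max(earliest[v], earliest[u] + d)

    # Backward pass: latest time of each milestone that does not delay the sink.
    latest = {n: earliest[sink] for n in nodes}
    for u in reversed(order):
        for v, d in succ[u]:
            latest[u] = min(latest[u], latest[v] - d)

    # Slack of an activity: how long it can slip without delaying the project.
    slack = {(u, v): latest[v] - earliest[u] - d for u, v, d in edges}
    critical = [edge for edge, s in slack.items() if s == 0]
    return earliest[sink], slack, critical

# Made-up project: two parallel branches from s to t.
edges = [("s", "a", 3), ("s", "b", 2), ("a", "t", 4), ("b", "t", 1)]
length, slack, critical = critical_path(edges, "t")
print(length, critical, slack)
```

In this sample the longest path s > a > t has length 7 and both of its activities have slack 0, while the activities on the shorter branch each have slack 4.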

Related

How to specify maximum cost when running BFS Neo4j?

The docs of Neo4j data science library state:
There are multiple termination conditions supported for the traversal,
based on either reaching one of several target nodes, reaching a
maximum depth, exhausting a given budget of traversed relationship
cost, or just traversing the whole graph.
But in the algorithm-specific parameters I could not find any parameter for constraining the maximum cost of the traversal (or simply the number of relationships if the cost is 1). The only parameters listed are startNodeId, targetNodes and maxDepth.
Any idea whether this can actually be done, or are the docs incorrect?
Here is the list of procedures and functions for your reference. As you can see, Breadth First Search is still in the Alpha stage, and no estimate function is available for it yet. You can also see that procedures in the Beta and Production stages do have a *.estimate variant, which gives you an idea of how much memory will be used when you run them. An example of gds.nodeSimilarity.write.estimate can be found below:
CALL gds.nodeSimilarity.write.estimate('myGraph', {
  writeRelationshipType: 'SIMILAR',
  writeProperty: 'score'
})
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory

nodeCount  relationshipCount  bytesMin  bytesMax  requiredMemory
9          9                  2592      2808      "[2592 Bytes ... 2808 Bytes]"
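For reference, the termination conditions quoted from the docs can be illustrated outside Neo4j. The following is a plain-Python sketch, not the GDS API: the function bounded_bfs and its parameters targets, max_depth and max_relationships are hypothetical names, and a relationship cost of 1 is assumed so the budget simply counts traversed relationships.

```python
from collections import deque

def bounded_bfs(adj, start, targets=frozenset(), max_depth=None, max_relationships=None):
    """Breadth-first traversal that stops at a target, a depth limit, or a relationship budget."""
    visited = {start}
    order = [start]                # nodes in the order they were discovered
    queue = deque([(start, 0)])
    traversed = 0                  # relationships followed so far (cost 1 each)
    while queue:
        node, depth = queue.popleft()
        if node in targets:
            break                  # reached one of the target nodes
        if max_depth is not None and depth >= max_depth:
            continue               # do not expand beyond the depth limit
        for nxt in adj.get(node, ()):
            if max_relationships is not None and traversed >= max_relationships:
                return order       # budget exhausted
            traversed += 1
            if nxt not in visited:
                visited.add(nxt)
                order.append(nxt)
                queue.append((nxt, depth + 1))
    return order

# Made-up directed graph as an adjacency dict.
adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
print(bounded_bfs(adj, "a"))
print(bounded_bfs(adj, "a", max_depth=1))
print(bounded_bfs(adj, "a", max_relationships=3))
```

Whether the library exposes such a budget as a parameter is exactly the open question above; the sketch only shows what the documented behaviour would mean.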

Calculating a full matrix of shortest path-lengths between all nodes

We are trying to find a way to create a full distance matrix in a Neo4j database, where the distance is defined as the length of the shortest path between any two nodes. Of course, there is the shortestPath method, but looping through all pairs of nodes and calculating their shortest paths gets very slow. We are explicitly not talking about allShortestPaths, because that returns all shortest paths between two specific nodes.
Is there a specific method or approach that is fast for a large number of nodes (>30k)?
Thank you!
j.
There is no easier method; the full distance matrix will take a long time to build.
As you've described it, the full distance matrix must contain the shortest-path length between any two nodes, which means you will have to compute that information at some point. Iterating over each pair of nodes and running a shortest-path algorithm is the direct way to do this; since there are O(n^2) pairs, the complexity will be O(n^2) multiplied by the complexity of a single shortest-path query.
But you can cut down on the runtime with a dynamic programming solution.
You could certainly leverage some dynamic programming methods to cut down on the calculation time. For instance, if you are trying to find the shortest path between (A) and (C), and have already calculated the shortest from (B) to (C), then if you happen to encounter (B) while pathfinding from (A), you do not need to recalculate the rest of the cost of that path; it is known.
However, a dynamic programming solution of any reasonable complexity will almost certainly be best written as a separate module and packaged as a Neo4j plugin. If what you are doing is a one-time operation or an operation that won't be run frequently, it might be easier to just use the naive solution of calling shortestPath for each pair, but if you plan to run it fairly frequently on dynamic data, it might be worth authoring a custom plugin. It depends entirely on your needs.
No matter what, though, it will take some time to calculate. The dynamic programming solution will cut down on the time greatly (especially in a densely-connected graph), but it will still not be very fast.
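As a point of comparison outside Neo4j, here is a minimal sketch of one common way to reuse work: run a single breadth-first search per source node, which yields a whole row of the distance matrix at once instead of one pair at a time. This is a swapped-in technique, not the dynamic programming plugin described above; it assumes an unweighted graph held in memory as an adjacency dict, and the function names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop counts from source to every reachable node in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:          # first visit is the shortest in an unweighted graph
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def all_pairs_distances(adj):
    """One BFS per node: n traversals instead of n^2 pairwise shortest-path queries."""
    nodes = set(adj) | {v for nbrs in adj.values() for v in nbrs}
    return {s: bfs_distances(adj, s) for s in nodes}

# Made-up undirected graph: a-b, b-c, c-d, a-e.
adj = {
    "a": ["b", "e"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": ["a"],
}
matrix = all_pairs_distances(adj)
print(matrix["a"]["d"], matrix["e"]["d"])
```

Unreachable pairs are simply absent from the inner dict. For 30k nodes the result still has up to 900 million entries, so storage is likely to matter as much as running time.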
What is the end game? Is this a one-time query that resets some property or creates new edges, or a frequently recurring effort? If it's one-time, you might create an edge between each pair of nodes along a path, building a transitive closure. The edge would connect the two nodes and have the distance as a property.
Thus, if the path is a>b>c>d, you would create the edges
a>b 1
a>c 2
a>d 3
b>c 1
b>d 2
c>d 1
The edges could be given a distinct name to distinguish them from the original path edges. This could create circular paths, which may either negate this strategy or require a constraint. If you are dealing with directed acyclic graphs, it would work well.
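The edge list above can be generated mechanically. A minimal sketch, assuming the path is given as an ordered list of node names and every hop has length 1 (the function name closure_edges is hypothetical):

```python
def closure_edges(path):
    """All (from, to, distance) closure edges along one path, each hop counting as 1."""
    return [
        (path[i], path[j], j - i)
        for i in range(len(path))
        for j in range(i + 1, len(path))
    ]

for src, dst, distance in closure_edges(["a", "b", "c", "d"]):
    print(f"{src}>{dst} {distance}")
```

In a graph with several paths between the same two nodes, you would keep only the smallest distance per pair.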

How to find a function that fits a given data set?

The search algorithm is a breadth-first search. I'm not sure how to store terms from an equation in an open list. The function f(x) has the form ax^e1 + bx^e2 + cx^e3 + k, where a, b, c are coefficients and k is a constant. All exponents, coefficients, and constants are integers between 0 and 5.
Initial state: the problem-solving process should start from any single term among ax^e1, bx^e2, cx^e3, k.
The algorithm gradually expands the number of terms at each level of the list.
I'm not sure how to add the terms to an equation from an open queue. That is the question.
The general problem that you are dealing with belongs to the area of regression analysis, and several techniques are available for finding a function that fits a given data set, including the popular least-squares methods for finding the line of best fit (a brief starting point is the related page on Wikipedia; if you want to go deeper, you should look at the research literature).
If you want to stick with the breadth-first search algorithm, although this kind of approach is not common for such a problem, you first need to define all the elements of a search problem, namely (for more information, see Chapter 3 of Russell and Norvig, Artificial Intelligence: A Modern Approach):
Initial state: Some initial values for the different terms.
Actions: in your case it should be a change in the different terms. Note that you should discretize the changes in the values.
Transition function: function that determines the new states given a state and an action.
Goal test: a check that recognizes whether a state is a goal state, so that the search can terminate. There are different ways to define this test in a regression problem. One way is to set a threshold on the sum of squared errors.
Step cost: The cost for an action. In such an abstract problem, probably you can consider the unweighted distance from the initial state on the search graph.
Note that you should carefully think about these elements, as, for example, they determine how efficient your search would be or whether you will have cycles in the search graph.
After you have defined all of the elements of the search problem, you basically have to implement:
Node, that contains information about the parent, the state, and the current cost;
Function to expand a given node that returns the successor nodes (according to the transition function, the actions, and the step cost);
Goal test;
The actual search algorithm. At the beginning, the queue holds the node containing the initial state. After that, it is updated with the successor nodes.
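The pieces listed above could be put together roughly as follows. This is a minimal sketch under several assumptions: a state is a tuple of (coefficient, exponent) terms, a term with exponent 0 plays the role of the constant k, coefficient 0 is skipped because it contributes nothing, the search starts from an empty equation so that the first level holds all single terms, and the goal test is a threshold on the sum of squared errors. Because each state tuple already records which terms were added, no separate parent pointer is kept. All names are hypothetical.

```python
from collections import deque
from itertools import product

# Candidate terms (coefficient, exponent); exponent 0 acts as the constant k.
TERMS = [(c, e) for c, e in product(range(1, 6), range(0, 6))]

def evaluate(state, x):
    """Value at x of the function described by a tuple of (coefficient, exponent) terms."""
    return sum(c * x ** e for c, e in state)

def sse(state, data):
    """Sum of squared errors of the candidate function over the (x, y) samples."""
    return sum((evaluate(state, x) - y) ** 2 for x, y in data)

def expand(state):
    """Successors: append one term that sorts after the last one,
    so every set of terms is generated exactly once."""
    last = state[-1] if state else None
    return [state + (t,) for t in TERMS if last is None or t > last]

def bfs_fit(data, max_terms=4, threshold=0):
    queue = deque([()])                  # open list, seeded with the empty equation
    while queue:
        state = queue.popleft()
        if state and sse(state, data) <= threshold:   # goal test
            return state
        if len(state) < max_terms:       # up to three x-terms plus the constant
            queue.extend(expand(state))
    return None

# Made-up data sampled from f(x) = 2x^3 + x + 4.
data = [(x, 2 * x ** 3 + x + 4) for x in range(-3, 4)]
fit = bfs_fit(data)
print(fit)
```

The queue is the open list: dequeuing a state and enqueuing its successors is what "adds a term to the equation" one level at a time. With noisy data you would raise threshold instead of demanding an exact fit.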

A* circular path finding algorithm with restrictions

I have a road map represented as a directed graph of junctions and links leading from one junction to another. Each link is weighted with its own traversal time (the time it takes to cross the link). I'm asked to find an algorithm to get from junction A to junction B and back from junction B to junction A so that the total path cost (in time) is no more than 10% above the optimal path cost (that is, the path cost returned by the A* algorithm), while keeping the time overlap between the path to B and the path from B to a minimum. That is, if t(x,y) represents the time to cross link (x,y), I need to minimize the sum of t(x,y) + t(y,x) over the links that overlap.
The algorithm should be optimal for the problem at hand and complete (it should also be efficient), and should probably use some variant of A* such as A*epsilon or the like.
Does anyone have a clue how to go about this problem?
I was thinking of representing the states of this problem as (junction, flag), where flag indicates whether the current node is part of a path that has already passed junction B, with (A, True) as the goal state, and then using A*epsilon on this. But I don't know how to take the time overlap issue into account. I guess what I'm suggesting is not the way I'm intended to solve this.
Any help would be greatly appreciated :)

How does retiming work in systolic arrays?

How does retiming work in systolic arrays (used in signal processors)? I read that there is some notion of negative delay which is used, but how can a delay be negative and if that is just an abstraction then how does it help?
The basic model of retiming is that you have wavefronts of registers interconnected by a bunch of combinational logic, and you improve the timing or area of the resulting circuit by repositioning the registers at different points in the circuit such that every path through the logic still goes through the same number of registers. For a simple example, let's say that you have an AND gate feeding a register, the longest path to the input of the register is 12 ns, the longest path from the output of the register is 6 ns, the delay of the AND gate is 3 ns, and you need to get the clock cycle time down to 10 ns. You could achieve this by deleting the register and replacing it with two registers, one at each input of the AND gate, clocked by the same clock as the original register. Now you have reduced the longest input path to 9 ns, expanded the output path to 9 ns, and met your clock cycle objective. In effect, you have added -3 ns to the effective arrival time at the register (and added +3 ns to the effective output time); this is the sense in which a delay can be negative.
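The arithmetic of that example can be written out as a tiny sketch. The function name and the single-gate model are assumptions for illustration only; real retiming tools solve this over the whole circuit graph.

```python
def move_register_before_gate(input_path_ns, output_path_ns, gate_delay_ns):
    """Move a register from a gate's output to its inputs.

    The gate delay leaves the path ending at the register (the "negative delay")
    and joins the path starting at the register.
    """
    return input_path_ns - gate_delay_ns, output_path_ns + gate_delay_ns

before = (12, 6)                                   # longest input and output paths in ns
after = move_register_before_gate(12, 6, 3)        # AND gate delay of 3 ns
print(max(before), max(after))                     # minimum clock period before and after
```

The clock period is limited by the longer of the two paths, so it drops from 12 ns to 9 ns, which meets the 10 ns target.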
A modified version of Leiserson and Saxe's original paper on retiming is available here. Wikipedia has a decent, though short, article on the subject with a few links. If you have access to IEEE Xplore or the ACM Digital Library, a search through the proceedings of the Design Automation Conference or the International Conference on Computer-Aided Design looking for retiming should yield lots of articles - this has been an active research area for years.

Resources