Alphabetical ordering in BFS - breadth-first-search

I have trouble differentiating between BFS with alphabetical ordering and BFS without it.
For example, I want to find a spanning tree of this graph (starting from E).
[Figure: starting graph G]
After adding {E,B} and {E,C}:
[Figure: tree T after adding EB and EC]
I'm not sure whether to continue adding {B,F} or {C,F}.
Thank you very much.

Well, the answer depends on the order in which you add the vertices B and C to the queue of the BFS algorithm. Look at the algorithm:
BFS (G, s) // G is the graph and s is the source node
    let Q be a queue
    Q.enqueue( s ) // s waits in the queue until it is dequeued and its neighbours are visited
    mark s as visited
    while ( Q is not empty )
        // remove the vertex whose neighbours will be visited now
        v = Q.dequeue( )
        // process all the neighbours of v
        for all neighbours w of v in graph G
            if w is not visited
                Q.enqueue( w ) // store w in Q so that its neighbours are visited later
                mark w as visited
It's clear that the algorithm does not specify the order in which you enqueue the neighbours of a vertex.
So if you visit the neighbours of E in the order B, C, then due to the FIFO property of the queue, B is dequeued (taken out of the queue) before C and you get the edge B--F. If the order is C, B, you get the edge C--F for the same reason. With alphabetical ordering, the neighbours are enqueued in alphabetical order, so B comes before C and the edge added is {B,F}.
Once you understand the pseudocode, this becomes clear.
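To make the effect of the neighbour order concrete, here is a minimal Python sketch. The graph from the question's figure is not reproduced here, so the adjacency list below is a hypothetical fragment in which E is adjacent to B and C, and both B and C are adjacent to F. The name `bfs_tree` and the `alphabetical` flag are made up for this illustration.

```python
from collections import deque

def bfs_tree(adj, source, alphabetical=True):
    """Return the BFS spanning-tree edges as (parent, child) pairs.

    Neighbours are enqueued in sorted order when `alphabetical` is True,
    and in reverse sorted order otherwise.
    """
    visited = {source}
    queue = deque([source])
    tree_edges = []
    while queue:
        v = queue.popleft()  # FIFO: the oldest enqueued vertex comes out first
        for w in sorted(adj[v], reverse=not alphabetical):
            if w not in visited:
                visited.add(w)
                tree_edges.append((v, w))
                queue.append(w)
    return tree_edges

# Hypothetical fragment (an assumption, not the graph from the question).
adj = {
    "E": ["C", "B"],
    "B": ["E", "F"],
    "C": ["E", "F"],
    "F": ["B", "C"],
}

print(bfs_tree(adj, "E", alphabetical=True))
print(bfs_tree(adj, "E", alphabetical=False))
```

On this fragment the first call reaches F through B and the second reaches F through C, which is exactly the difference described above.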


Cypher query becomes very slow on a medium-size dataset (with loops)

This question extends the idea in the question:
Cypher: how to find all the chains of single nodes not repeated?
For example, in a graph like this:
(a1:TestNode)-[:REL]->(r1:Route)-[:REL]->(a2:TestNode)-[:REL]->(s1:Route)-[:REL]->(a1:TestNode)
(a2:TestNode)-[:REL]->(r2:Route)-[:REL]->(a3:TestNode)-[:REL]->(s2:Route)-[:REL]->(a2:TestNode)
(a3:TestNode)-[:REL]->(r3:Route)-[:REL]->(a4:TestNode)-[:REL]->(s3:Route)-[:REL]->(a3:TestNode)
Graphically:
                   s3 ← a4
                  ↙    ↗
           s2 ← a3 → r3
          ↙    ↗
   s1 ← a2 → r2
  ↙    ↗
a1 → r1
Cypher code:
CREATE (a1:TestNode {name:'a1'})-[:REL]->(r1:Route {name:'r1'})-[:REL]->(a2:TestNode {name:'a2'})-[:REL]->(s1:Route {name:'s1'})-[:REL]->(a1),
(a2)-[:REL]->(r2:Route {name:'r2'})-[:REL]->(a3:TestNode {name:'a3'})-[:REL]->(s2:Route {name:'s2'})-[:REL]->(a2),
(a3)-[:REL]->(r3:Route {name:'r3'})-[:REL]->(a4:TestNode {name:'a4'})-[:REL]->(s3:Route {name:'s3'})-[:REL]->(a3)
Afterwards, we can find a route from a4 to a1 by this command:
MATCH p = (a4:TestNode {name: 'a4'})-[r:REL*]->(a1:TestNode {name: 'a1'})
WITH [a4] + nodes(p) AS ns, p
WHERE ALL (n IN ns
           WHERE 1 = SIZE(FILTER(m IN TAIL(ns)
                                 WHERE m = n)))
RETURN p
Question:
1. If I extend the above create query to have 2,000 'a' nodes, i.e. up to
(a2000)-[:REL]->(r2000:Route {name:'r2000'})-[:REL]->(a2001:TestNode {name:'a2001'})-[:REL]->(s2000:Route {name:'s2000'})-[:REL]->(a2000),
I found that my computer becomes very slow, and 2 GB of memory is occupied by Neo4j. Is this normal?
Then I want to find a route from a2001 to a1. The system cannot find the solution (which is obviously a2001->a2000->a1999->...->a1). I guess this is because of the loops in between. In the previous question mentioned above, the query should have avoided loops, because duplicates are not allowed.
My goal is to extend this idea so that possible routes between two locations can be identified on a connected graph. Many thanks.
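For comparison, here is a minimal sketch in plain Python (not Cypher, and not a statement about how Neo4j evaluates the query) of finding one route on the same chain-shaped graph with a breadth-first search. Because every node is marked the first time it is reached, the loops cannot cause repeated work. The helper names `build_chain` and `find_route` are hypothetical.

```python
from collections import deque

def build_chain(n):
    """Build the a/r/s chain from the question with nodes a1..a(n+1)."""
    adj = {}
    for i in range(1, n + 1):
        a, r, s, nxt = f"a{i}", f"r{i}", f"s{i}", f"a{i + 1}"
        adj.setdefault(a, []).append(r)    # a_i -> r_i
        adj.setdefault(r, []).append(nxt)  # r_i -> a_(i+1)
        adj.setdefault(nxt, []).append(s)  # a_(i+1) -> s_i
        adj.setdefault(s, []).append(a)    # s_i -> a_i
    return adj

def find_route(adj, start, goal):
    """Return one shortest route from start to goal, or None if there is none."""
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            path = []
            while v is not None:  # walk the parent links back to the start
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in adj.get(v, []):
            if w not in parent:
                parent[w] = v
                queue.append(w)
    return None

print(find_route(build_chain(3), "a4", "a1"))
```

The same call with `build_chain(2000)` and the endpoints `"a2001"` and `"a1"` touches each node at most once, so the work grows linearly with the size of the graph.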

Weighing Samples in a Decision Tree

I've constructed a decision tree that weights every sample equally. Now I want to construct a decision tree that gives different weights to different samples. Is the only change I need to make in computing the expected entropy before calculating the information gain? I'm a little confused about how to proceed, please explain.
For example: consider a node containing p positive samples and n negative samples. The node's entropy is -p/(p+n) log(p/(p+n)) - n/(p+n) log(n/(p+n)). Now suppose a split divides the parent node into two child nodes, where child 1 contains p' positives and n' negatives (so child 2 contains p-p' positives and n-n' negatives). For child 1 we calculate the entropy in the same way as for the parent, and the probability of reaching it is (p'+n')/(p+n). The expected reduction in entropy is then entropy(parent) - (prob of reaching child1*entropy(child1) + prob of reaching child2*entropy(child2)), and the split with the maximum information gain is chosen.
How do I carry out the same procedure when weights are available for each sample? What changes need to be made? And what changes are needed specifically for AdaBoost (using stumps only)?
(I guess this is the same idea as in some comments, e.g., #Alleo)
Suppose you have p positive examples and n negative examples. Let's denote the weights of examples to be:
a1, a2, ..., ap ---------- weights of the p positive examples
b1, b2, ..., bn ---------- weights of the n negative examples
Suppose
a1 + a2 + ... + ap = A
b1 + b2 + ... + bn = B
As you pointed out, if the examples have unit weights, the entropy would be:
$$-\frac{p}{p+n}\log\frac{p}{p+n} - \frac{n}{p+n}\log\frac{n}{p+n}$$
Now you only need to replace p with A and replace n with B and then you can obtain the new instance-weighted entropy.
$$-\frac{A}{A+B}\log\frac{A}{A+B} - \frac{B}{A+B}\log\frac{B}{A+B}$$
Note: there is nothing fancy here. We just work out the weighted importance of the groups of positive and negative examples. When examples are equally weighted, the importance of the positive examples is proportional to the ratio of the number of positive examples to the number of all examples. When examples are not equally weighted, we take a weighted average to get the importance of the positive examples.
Then you follow the same logic to choose the attribute with the largest information gain, by comparing the entropy before and after splitting on an attribute.
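The replacement of p and n by A and B can be sketched in Python as follows. Two things here are my assumptions and not part of the answer above: the logarithm is taken to base 2, and the probability of reaching a child is taken to be the child's share of the total weight. The function names are made up for illustration.

```python
import math

def weighted_entropy(pos_weights, neg_weights):
    """Entropy of a node where A and B are the summed example weights."""
    A = sum(pos_weights)
    B = sum(neg_weights)
    total = A + B
    entropy = 0.0
    for mass in (A, B):
        if mass > 0:  # a class with zero weight contributes nothing
            frac = mass / total
            entropy -= frac * math.log2(frac)  # base 2 is an assumption
    return entropy

def weighted_info_gain(parent, children):
    """parent and each child are (pos_weights, neg_weights) pairs."""
    parent_total = sum(parent[0]) + sum(parent[1])
    remainder = 0.0
    for pos, neg in children:
        child_total = sum(pos) + sum(neg)
        if child_total > 0:
            # Assumption: a child is reached with probability equal to its weight share.
            remainder += (child_total / parent_total) * weighted_entropy(pos, neg)
    return weighted_entropy(*parent) - remainder

# With unit weights this reduces to the usual unweighted entropy.
print(weighted_entropy([1, 1], [1, 1]))
```

With unit weights, A equals p and B equals n, so the result matches the unweighted formula.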

Ocaml: Longest Path

I have to write an algorithm for the longest path problem.
I have a directed weighted graph, a start node, a stop node and a number k.
The algorithm has to say whether the graph contains a path from the start node to the stop node with length at least k.
The real problem is that I have to use the BFS visit algorithm and not DFS. In OCaml, BFS uses a queue, and new nodes are inserted at the end of the structure:
let breadth_first_collect graph start =
  let rec search visited = function
    | [] -> visited
    | n :: rest ->
      if List.mem n visited
      then search visited rest
      else search (n :: visited) (rest @ (succ graph n))
      (* new nodes are put into queue *)
  in search [] [start];;
Can someone give me some advice, even theoretical, on how to do this?
In a BFS you shouldn't recurse deeper before you have finished the current layer. That means that on each step you take the set of successors, process them, and only afterwards recurse into each one in turn. Here is a first approximation (untested) of the algorithm:
let breadth_first_collect succ graph start =
  let rec search visited v =
    (* keep only the successors that have not been visited yet *)
    let succs = succ graph v |>
                List.filter (fun s -> not (List.mem s visited)) in
    List.map (search (succs @ visited)) succs |> List.concat in
  search [] start
So, we first visit all children (aka succs), prepend them to the queue, and then recursively descend into each child in turn.
Again, this is a first approximation. Since you need to know the path length, you need to store each path in your queue separately and can't just keep a set of all visited vertices. That means your queue must be a vertex list list. In that case you can collect all possible paths and check whether one of them has length at least k.
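The idea of keeping whole paths in the queue can be sketched as follows. The sketch is in Python, not OCaml, and it assumes that only simple paths count (no vertex is repeated), since otherwise a cycle would make the length unbounded. The graph representation and the name `has_path_at_least` are hypothetical. Enumerating all simple paths can take exponential time in the worst case.

```python
from collections import deque

def has_path_at_least(graph, start, stop, k):
    """Check for a simple path from start to stop with total weight >= k.

    graph maps each node to a list of (successor, weight) pairs.
    """
    queue = deque([([start], 0)])  # each entry is (path so far, its length)
    while queue:
        path, length = queue.popleft()
        node = path[-1]
        if node == stop:
            if length >= k:
                return True
            continue  # a simple path cannot pass through stop and return to it
        for succ, weight in graph.get(node, []):
            if succ not in path:  # keep the path simple
                queue.append((path + [succ], length + weight))
    return False

# Small example graph, made up for illustration.
graph = {
    "a": [("b", 1), ("c", 5)],
    "b": [("d", 1)],
    "c": [("d", 5)],
    "d": [],
}
print(has_path_at_least(graph, "a", "d", 10))
```

Here the queue plays the role of the vertex list list mentioned above, with the accumulated length stored next to each path.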

Can a deterministic finite acceptor begin at the end of string and move toward the start?

If so, how is this drawn as a graph? What would you label your start state? And would you draw the graph as moving from right to left as well?
Since you are dealing with deterministic finite automata, the answer is no.
The main problem is that you may have two transitions (p, a, r) and (q, a, r) leading to the same state r, but with p different from q. Then if you start in r and try to read the letter a backwards, should you end up in p or in q?
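The ambiguity can be seen by reversing a transition table. In this small Python sketch the transition function holds just the two transitions named above, and `reverse_transitions` is a hypothetical helper name.

```python
from collections import defaultdict

def reverse_transitions(delta):
    """Map (state, symbol) -> set of predecessor states.

    delta maps (state, symbol) -> next state, as in a DFA.
    """
    reversed_delta = defaultdict(set)
    for (src, symbol), dst in delta.items():
        reversed_delta[(dst, symbol)].add(src)
    return dict(reversed_delta)

# The two transitions (p, a, r) and (q, a, r) from the text.
delta = {("p", "a"): "r", ("q", "a"): "r"}
rev = reverse_transitions(delta)
print(rev[("r", "a")])
```

Reading the letter a backwards from r gives a set containing both p and q, so the reversed machine is no longer deterministic.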

automata: using only Equivalence class to prove regularity

I have tried to approach this problem in several ways and looked in several places without finding an answer. The question is as follows:
[Question]
Given two regular languages (they may also be referred to as finitely described languages, I don't know) L1 and L2, we define a new language as follows:
L = {w1w2 | there are two words x, y such that xw1 is in L1 and w2y is in L2}
I am supposed to show that L is regular, but I have the following restrictions:
I must use Equivalence class, and no other way
I cannot use Rank(L), as in showing a bound on the number of equivalence classes; instead I must show them
I may use the closure properties that all regular languages have
I am not expecting a full proof (though that would be appreciated), but an explanation of how to go about such a thing.
Thanks in advance.
L = {w1w2 | there are two words x, y such that xw1 is in L1 and w2y is in L2} is regular if L1 and L2 are regular languages.
Lsuff = { w1 | xw1 ∈ L1 for some x }
Lpref = { w2 | w2y ∈ L2 for some y }
And,
L = LsuffLpref
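For finite languages, the decomposition L = LsuffLpref can be checked by brute force. The Python sketch below follows the two definitions directly. The function names and the small sample languages are chosen only for illustration.

```python
def suffixes(language):
    """All w1 such that x + w1 is in the language for some x."""
    return {word[i:] for word in language for i in range(len(word) + 1)}

def prefixes(language):
    """All w2 such that w2 + y is in the language for some y."""
    return {word[:i] for word in language for i in range(len(word) + 1)}

def suffix_prefix_language(l1, l2):
    """Concatenate every suffix of a word in l1 with every prefix of a word in l2."""
    return {w1 + w2 for w1 in suffixes(l1) for w2 in prefixes(l2)}

# Sample finite languages (an assumption for illustration).
L1 = {"ab", "ac"}
L2 = {"12", "13"}
L = suffix_prefix_language(L1, L2)
print(sorted(L, key=lambda w: (len(w), w)))
```

Both w1 and w2 may be empty, so the empty word and every member of Lsuff and Lpref on its own also belong to L.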
We can easily prove this by constructing a finite automaton for L.
Suppose the finite automaton (FA) for L1 is M1 and the FA for L2 is M2.
[SOLUTION]
A non-deterministic finite automaton (NFA) for L can be built from M1 and M2 as follows (assuming every state of M1 is reachable from its start state and every state of M2 can reach a final state): add a new start state s with a NULL-transition (^-edge) to every state of M1, so that reading can begin in the middle of a word of L1; add a ^-edge from every final state of M1 to the start state of M2; and make every state of M2 final, so that reading can stop in the middle of a word of L2. Then the NFA can be converted into a DFA.
e.g.
L1 = {ab, ac} and L2 = {12, 13}
L = {b, c, ab, ac, 1, 12, 13, b1, c1, b12, b13, ab1, ab12, ac13, ............}
Note: w1 and w2 can be NULL
M1 consists of Q = {q0,q1,qf} with edges:
q0 ---a----->q1,
q1 ---b/c--->qf
Similarly:
M2 consists of Q = {p0,p1,pf} with edges:
p0 ---1----->p1,
p1 ---2/3--->pf
Now the NFA for L, called M, consists of Q = {s, q0,q1,qf, p0,p1,pf}, where s is the start state, the final states of M are p0, p1 and pf, and the edges are:
q0 ---a----->q1,
q1 ---b/c--->qf,
p0 ---1----->p1,
p1 ---2/3--->pf,
s ----^----> q0,
s ----^----> q1,
s ----^----> qf,
qf ----^----> p0
^ means NULL-transition.
Now the NFA can easily be converted into a DFA. (I leave that to you.)
[ANSWER]
A DFA for L exists, hence L is a regular language.
I highly encourage you to draw the DFA/NFA figures; then the concept will be clear.
Note
I am writing this answer because I believe that the currently available answer doesn't really satisfy the requirements of the post, i.e.
I must use Equivalence class, and no other way
Answer
A more direct and simple approach is not to construct a DFA/NFA, which takes time, but just to check whether #EquivalenceClasses < ∞ holds. Specifically, you would have the following ones here:
[w1] = {all w1 in L1}
[e]
[w1w2] = L
So ind(R), the index of the equivalence relation, is 3 and therefore finite. Hence, L is regular. Q.E.D.
To make it clearer, have a look at the definition of the equivalence relation for languages, i.e. R_L.
Moreover, regular languages are closed under concatenation. De facto, you just need to concatenate the two DFAs/NFAs into one.
