How to use segment tree and scanline - segment-tree

Given 300000 segments.
Consider any pair of segments: a = [l1,r1] and b = [l2,r2].
If l2 >= l1 and r2 <= r1 , it is "good" pair.
If a == b, it is "bad" pair.
Overwise, it is "bad" pair.
How to find number of all "good" pairs among given segments using segment tree and scanline?

Sort the segments in increasing order with respect to their l-values and for pairs with same l-values sort them in decreasing order with respect to their r-value.
Suppose for a particular , you want to count the number of good pairs (ai,aj) such that j < i. Let ai=[l1,r1] and aj = [l2,r2]. Then we have l2 <= l1. Now we need to count all the possible values of j such that r2 <= r1. This can be done by maintaining a segment tree for the values of r for all j such that 0 < j < i. After querying for the i-th pair, update the segment tree with the r-value of the i-th segment.
Coming to segment tree part, build a segment tree on the values of r. On updating a value of r in segment tree, add 1 to the value of r in the segment tree and for querying for a particular value of r, query for sum in the range [0,r-1]. This will give total number of segments that fit good with the given segment.
If the values of r are big that would not fit into segment tree, then apply coordinate compression to values first and then use segment tree for the compressed values.

Related

How can I modify Dijkstra's algorithm to get the longest path most of the time?

I know that finding the longest path is an NP Hard problem.
What was asked from us was to change Dijkstra's algorithm to find the longest path by adding another parameter to the algorithm. A parameter like the distance from the source to the given vertex, the vertex predecessors, successors, number of edges...For example, we could extract the vertex from the queue depending on a parameter other than the max distance or we could another queue...
What we did first was to change the initialization so that all vertices distances = 0 except the source node, that = infinity. Then we extracted from the queue of vertices the one with the biggest distance. Then, we inverted the relaxation sign, so that the vertex saves the distance if it is bigger that its current distance.
What parameter could I add that would improve Dijkstra's performance in finding a longest path? It doesn't have to work 100% of the time.
This is Dijkstra's algorithm:
ShortestPath(G, v)
init D array entries to infinity
D[v]=0
add all vertices to priority queue Q
while Q not empty do
u = Q.removeMin()
for each neighbor, z, of u in Q do
if D[u] + w(u,z) < D[z] then
D[z] = D[u] + w(u,z)
Change key of z in Q to D[z]
return D as shortest path lengths
This is our modified version:
ShortestPath(G, v)
init D array entries to 0
D[v]=infinity //source vertex
add all vertices to priority queue Q
while Q not empty do
u = Q.removeMax()
for each neighbor, z, of u in Q do
if D[u] + w(u,z) > D[z] then
D[z] = D[u] + w(u,z)
Change key of z in Q to D[z]
return D as shortest path lengths

minimize the maximum continious subarray in array of 0/1

Algo question
Binary array of 0/1 given
In one operation i can flip any array[index] of array i.e. 0->1 or 1->0
so aim is to minimize the maximum lenth of continious 1's or 0's by using atmost k flips
eg if 11111 if array and k=1 ,best is to make array as 11011
And minimized value of maximum continous 1's or 0's is 2
for 111110111111 and k=3 ans is 2
I tried Brute Force (by trying various position flips) but its not efficient
I think Greedy ,but can not figure out exactly
can you please help me for algo,O(n) or similar
A solution could be devised by reading each bit in order and recording the size of each continuous group of 1 into a list A.
Once you are done filling A, you can follow the algorithm narrated by the pseudocode below:
result = N
for i = 1 to N
flips_needed = 0
for a in A:
flips_needed += <number of flips needed to make sure largest group remaining in a is of size i>
if k >= flips_needed:
result = flips_needed
break
return result
N is the number of bits in the entire initial sequence.
The algorithm above works by dividing the groups of 1 into sizes of at most i. Whenever doing that requires <= k, we have the result we are looking for, as i starts from 1 and goes up. (i.e. when we found flips_needed <= k, we know the groups of 1 are as minimal as they can get)

minimum weight shortest path for all pairs of vertices with exactly k blue edges

Basically a subset of the graph has edges which are blue
So I know how to find all pairs shortest paths with DP in O(n^3), but how do I account for color and the exactness of the number of edges required for the problem?
This can be done in O(k^2 . n^3) using a variant of Floydd-Warshall.
Instead of keeping track of the minimum path weight d(i,j) between two nodes i and j, you keep track of the minimum path weight d(i,j,r) for paths between i and j with exactly r blue edges, for 0 ≤ r ≤ k.
The update step when examining paths through a node m, where normally d(i,j) would be updated with the sum of d(i,m) and d(m,j) if it is smaller, becomes:
for u: 0 .. k
for v: 0 .. (k-u)
s = d(i,m,u) + d(m,j,v)
d(i,j,u+v) = min( s, d(i,j,u+v) )
At the end, you then read off d(i,j,k).

constraint programming mesh network

I have a mesh network as shown in figure.
Now, I am allocating values to all edges in this sat network. I want to propose in my program that, there are no closed loops in my allocation. For example the constraint for top-left most square can be written as -
E0 = 0 or E3 = 0 or E4 = 0 or E7 = 0, so either of the link has to be inactive in order not to form a loop. However, in this kind of network, there are many possible loops.
For example loop formed by edges - E0, E3, E7, E11, E15, E12, E5, E1.
Now my problem is that I have to describe each possible combination of loop which can occur in this network. I tried to write constraints in one possible formula, however I was not able to succeed.
Can anyone throw any pointers if there is a possible way to encode this situation?
Just for information, I am using Z3 Sat Solver.
The following encoding can be used with any graph with N nodes and M edges. It is using (N+1)*M variables and 2*M*M 3-SAT clauses. This ipython notebook demonstrates the encoding by comparing the SAT solver results (UNSAT when there is a loop, SAT otherwise) with the results of a straight-forward loop finding algorithm.
Disclaimer: This encoding is my ad-hoc solution to the problem. I'm pretty sure that it is correct but I don't know how it compares performance-wise to other encodings for this problem. As my solution works with any graph it is to be expected that a better solution exists that uses some of the properties of the class of graphs the OP is interested in.
Variables:
I have one variable for each edge. The edge is "active" or "used" if its corresponding variable is set. In my reference implementation the edges have indices 0..(M-1) and this variables have indices 1..M:
def edge_state_var(edge_idx):
assert 0 <= edge_idx < M
return 1 + edge_idx
Then I have an M bits wide state variable for each edge, or a total of N*M state bits (nodes and bits are also using zero-based indexing):
def node_state_var(node_idx, bit_idx):
assert 0 <= node_idx < N
assert 0 <= bit_idx < M
return 1 + M + node_idx*M + bit_idx
Clauses:
When an edge is active, it links the state variables of the two nodes it connects together. The state bits with the same index as the node must be different on both sides and the other state bits must be equal to their corresponding partner on the other node. In python code:
# which edge connects which nodes
connectivity = [
( 0, 1), # edge E0
( 1, 2), # edge E1
( 2, 3), # edge E2
( 0, 4), # edge E3
...
]
cnf = list()
for i in range(M):
eb = edge_state_var(i)
p, q = connectivity[i]
for k in range(M):
pb = node_state_var(p, k)
qb = node_state_var(q, k)
if k == i:
# eb -> (pb != qb)
cnf.append([-eb, -pb, -qb])
cnf.append([-eb, +pb, +qb])
else:
# eb -> (pb == qb)
cnf.append([-eb, -pb, +qb])
cnf.append([-eb, +pb, -qb])
So basically each edge tries to segment the graph it is part of into a half that is on one side of the edge and has all the state bits corresponding to the edge set to 1 and a half that is on the other side of the edge and has the state bits corresponding to the edge set to 0. This is not possible for a loop where all nodes in the loop can be reached from both sides of each edge in the loop.

Need a specific example of U-Matrix in Self Organizing Map

I'm trying to develop an application using SOM in analyzing data. However, after finishing training, I cannot find a way to visualize the result. I know that U-Matrix is one of the method but I cannot understand it properly. Hence, I'm asking for a specific and detail example how to construct U-Matrix.
I also read an answer at U-matrix and self organizing maps but it only refers to 1 row map, how about 3x3 map? I know that for 3x3 map:
m(1) m(2) m(3)
m(4) m(5) m(6)
m(7) m(8) m(9)
a 5x5 matrix must me created:
u(1) u(1,2) u(2) u(2,3) u(3)
u(1,4) u(1,2,4,5) u(2,5) u(2,3,5,6) u(3,6)
u(4) u(4,5) u(5) u(5,6) u(6)
u(4,7) u(4,5,7,8) u(5,8) u(5,6,8,9) u(6,9)
u(7) u(7,8) u(8) u(8,9) u(9)
but I don't know how to calculate u-weight u(1,2,4,5), u(2,3,5,6), u(4,5,7,8) and u(5,6,8,9).
Finally, after constructing U-Matrix, is there any way to visualize it using color, e.g. heat map?
Thank you very much for your time.
Cheers
I don't know if you are still interested in this but I found this link
http://www.uni-marburg.de/fb12/datenbionik/pdf/pubs/1990/UltschSiemon90
which explains very speciffically how to calculate the U-matrix.
Hope it helps.
By the way, the site were I found the link has several resources referring to SOMs I leave it here in case anyone is interested:
http://www.ifs.tuwien.ac.at/dm/somtoolbox/visualisations.html
The essential idea of a Kohonen map is that the data points are mapped to a
lattice, which is often a 2D rectangular grid.
In the simplest implementations, the lattice is initialized by creating a 3D
array with these dimensions:
width * height * number_features
This is the U-matrix.
Width and height are chosen by the user; number_features is just the number
of features (columns or fields) in your data.
Intuitively this is just creating a 2D grid of dimensions w * h
(e.g., if w = 10 and h = 10 then your lattice has 100 cells), then
into each cell, placing a random 1D array (sometimes called "reference tuples")
whose size and values are constrained by your data.
The reference tuples are also referred to as weights.
How is the U-matrix rendered?
In my example below, the data is comprised of rgb tuples, so the reference tuples
have length of three and each of the three values must lie between 0 and 255).
It's with this 3D array ("lattice") that you begin the main iterative loop
The algorithm iteratively positions each data point so that it is closest to others similar to it.
If you plot it over time (iteration number) then you can visualize cluster
formation.
The plotting tool i use for this is the brilliant Python library, Matplotlib,
which plots the lattice directly, just by passing it into the imshow function.
Below are eight snapshots of the progress of a SOM algorithm, from initialization to 700 iterations. The newly initialized (iteration_count = 0) lattice is rendered in the top left panel; the result from the final iteration, in the bottom right panel.
Alternatively, you can use a lower-level imaging library (in Python, e.g., PIL) and transfer the reference tuples onto the 2D grid, one at a time:
for y in range(h):
for x in range(w):
img.putpixel( (x, y), (
SOM.Umatrix[y, x, 0],
SOM.Umatrix[y, x, 1],
SOM.Umatrix[y, x, 2])
)
Here img is an instance of PIL's Image class. Here the image is created by iterating over the grid one pixel at a time; for each pixel, putpixel is called on img three times, the three calls of course corresponding to the three values in an rgb tuple.
From the matrix that you create:
u(1) u(1,2) u(2) u(2,3) u(3)
u(1,4) u(1,2,4,5) u(2,5) u(2,3,5,6) u(3,6)
u(4) u(4,5) u(5) u(5,6) u(6)
u(4,7) u(4,5,7,8) u(5,8) u(5,6,8,9) u(6,9)
u(7) u(7,8) u(8) u(8,9) u(9)
The elements with single numbers like u(1), u(2), ..., u(9) as just the elements with more than two numbers like u(1,2,4,5), u(2,3,5,6), ... , u(5,6,8,9) are calculated using something like the mean, median, min or max of the values in the neighborhood.
It's a nice idea calculate the elements with two numbers first, one possible code for that is:
for i in range(self.h_u_matrix):
for j in range(self.w_u_matrix):
nb = (0,0)
if not (i % 2) and (j % 2):
nb = (0,1)
elif (i % 2) and not (j % 2):
nb = (1,0)
self.u_matrix[(i,j)] = np.linalg.norm(
self.weights[i //2, j //2] - self.weights[i //2 +nb[0], j // 2 + nb[1]],
axis = 0
)
In the code above the self.h_u_matrix = self.weights.shape[0]*2 - 1 and self.w_u_matrix = self.weights.shape[1]*2 - 1 are the dimensions of the U-Matrix. With that said, for calculate the others elements it's necessary obtain a list with they neighboors and apply a mean for example. The following code implements that's idea:
for i in range(self.h_u_matrix):
for j in range(self.w_u_matrix):
if not (i % 2) and not (j % 2):
nodelist = []
if i > 0:
nodelist.append((i-1,j))
if i < 4:
nodelist.append((i+1, j))
if j > 0:
nodelist.append((i,j -1))
if j < 4:
nodelist.append((i,j+1))
meanlist = [self.u_matrix[u_node] for u_node in nodelist]
self.u_matrix[(i,j)] = np.mean(meanlist)
elif (i % 2) and (j % 2):
meanlist = [
(i - 1, j),
(i + 1, j),
(i, j - 1),
(i, j + 1)]
self.u_matrix[(i,j)] = np.mean(meanlist)

Resources